Date: 2022-06-27

It is amazing what capabilities the human eye possesses, especially when it comes to perceiving depth in 2D images. However, making computers capable of replicating this feature has been a real challenge.

The reason is that much of the information present in a 3D scene is lost when it is projected into a 2D image, making it difficult for a computer to recognize and process each element of the scene.

In fact, although there are methods that process a series of 2D images to generate 3D information, they have limitations that make this process difficult.

Recently, however, a group of researchers from MIT created a technique called virtual correspondence, with which they hope to correct the deficiencies of this process and achieve a higher success rate than the traditional methodology.

To do this, they build on the principles of a technique called “structure from motion”.

To understand this concept, suppose we take two images of an object: one from the left side and the other from the right. Both images are then compared in order to find points or pixels in common. Based on this information, a researcher could establish the position from which each camera took the picture, as well as the direction in which it was pointed.

It would then be possible to carry out a triangulation, in which the distance to a specific point of the object captured in the images is calculated.
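The triangulation step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the method used by the MIT team: each camera is reduced to a position and a viewing ray toward the matched point, and the function name `triangulate_midpoint` is made up for this example. Because two measured rays rarely cross exactly, the sketch returns the midpoint of the shortest segment between them.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def _scale(a, s):
    return tuple(x * s for x in a)

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(c1, d1, c2, d2):
    """Estimate the 3D point seen by two cameras (hypothetical helper).

    c1, c2: camera positions; d1, d2: ray directions toward the matched point.
    Returns the midpoint of the shortest segment joining the two rays.
    """
    w = _sub(c1, c2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth cannot be recovered")
    s = (b * e - c * d) / denom  # distance parameter along the first ray
    t = (a * e - b * d) / denom  # distance parameter along the second ray
    p1 = _add(c1, _scale(d1, s))
    p2 = _add(c2, _scale(d2, t))
    return _scale(_add(p1, p2), 0.5)

# Two cameras one unit to the left and right of the origin,
# both looking at the same point five units in front of them.
point = triangulate_midpoint((-1.0, 0.0, 0.0), (1.0, 0.0, 5.0),
                             (1.0, 0.0, 0.0), (-1.0, 0.0, 5.0))
```

In this toy setup the recovered point lies five units in front of the cameras, which is the depth the triangle formed by the two cameras and the shared point encodes.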

However, for the approach to be effective, the overlap between the images must be large: if there are few common points between them, the system fails.

In other words, structure from motion requires that the two images have points in common, so that a triangle can be formed connecting the cameras to the common point and the depth can be determined from it.

How the new system works

Virtual correspondence goes one step further. If, for example, one photo of a cat is taken from the left side and another from the right side, a spot on the front left leg may be visible in the first photo but not in the second.

Since light travels in a straight line, you could use your general knowledge of the cat’s anatomy to determine the point where the ray from the camera through that spot would exit the cat's body, at the paw on the other side.

If that exit point is visible in the photo taken from the right side, triangulation can then be used to calculate distances in the third dimension.
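The idea of following a camera ray through the animal to its far side can be illustrated with a deliberately crude stand-in for "knowledge of the cat's anatomy". In the sketch below the body is assumed to be a sphere, which is purely an assumption for illustration; the article does not say how the MIT system represents shape, and the function name `ray_through_body` is hypothetical. The function returns where the ray enters the shape (the spot seen from the left) and where it exits (the candidate point for the right-hand photo).

```python
import math

def ray_through_body(origin, direction, center, radius):
    """Return (entry, exit) points of a camera ray through a spherical body.

    The sphere is a placeholder for a shape prior, not the researchers' model.
    Returns None if the ray misses the sphere or the camera is not outside it.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    d = tuple(x / norm for x in direction)          # unit ray direction
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(x * y for x, y in zip(oc, d))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c                                # discriminant of the quadratic
    if disc < 0:
        return None                                 # ray misses the body
    root = math.sqrt(disc)
    t_in, t_out = -b - root, -b + root              # distances along the ray
    if t_in < 0:
        return None                                 # body behind or around the camera
    entry = tuple(o + t_in * x for o, x in zip(origin, d))
    exit_ = tuple(o + t_out * x for o, x in zip(origin, d))
    return entry, exit_

# Left camera five units away, looking straight at a body of radius one.
hits = ray_through_body((-5.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0)
```

Here `hits[0]` is the visible spot on the near side and `hits[1]` is the point on the far side. If the second camera can see that far-side point, the two pixels can be paired and triangulated even though the photos share no directly matching surface point.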

Ultimately, the team involved in this research hopes to get computers to interpret the three-dimensional world the same way humans do.

For this, it will be necessary to build systems that are capable not only of interpreting still images, but also of understanding video clips and entire films.