Scientists from the Massachusetts Institute of Technology (MIT) and Adobe Research have developed a new technique that can identify all pixels in an image representing a given material, starting from a single pixel selected by the user. The machine-learning method could aid robotic scene understanding, image editing, and online recommendation systems by identifying similar materials in images.
Prafull Sharma, an electrical engineering and computer science graduate student at MIT, is the lead author of the paper describing this technique. Other authors include Julien Philip and Michael Gharbi, research scientists at Adobe Research; William T. Freeman, the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science; and Frédo Durand, a professor of electrical engineering and computer science and a research scientist at Adobe Research.
Using AI to Identify Similar Materials in Images
Existing methods for material selection struggle to accurately identify all pixels belonging to the same material. Some approaches focus on whole objects, but a single object can be made of several materials, such as a chair with wooden arms and a leather seat. Other methods rely on a preset list of materials, but these often use broad labels such as "wood," even though there are thousands of varieties of wood.
Sharma and his colleagues instead created a machine-learning method that dynamically evaluates every pixel in an image to estimate how similar it is to a pixel selected by the user. Given an image of a table and two chairs, for example, their model can correctly identify regions made of the same material, such as the tabletop and the chair legs.
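The idea of comparing a selected pixel against every other pixel can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature map, function names, and similarity threshold are all assumptions, standing in for whatever learned per-pixel features the actual model produces.

```python
import numpy as np

def material_similarity_map(features, query_row, query_col):
    """Cosine similarity between the user-selected pixel's feature
    vector and every other pixel's feature vector.

    features: (H, W, D) array of per-pixel feature vectors, assumed
    to map similar materials to nearby vectors (hypothetical stand-in
    for the model's learned features).
    """
    query = features[query_row, query_col]           # (D,) selected pixel
    dots = features @ query                          # (H, W) dot products
    norms = np.linalg.norm(features, axis=-1)        # (H, W) feature norms
    return dots / (norms * np.linalg.norm(query) + 1e-8)

# Toy example: a 2x2 "image" containing two distinct materials.
feats = np.array([[[1.0, 0.0], [1.0, 0.1]],
                  [[0.0, 1.0], [0.1, 1.0]]])
sim = material_similarity_map(feats, 0, 0)
mask = sim > 0.9   # threshold the similarity map into a selection mask
```

Thresholding the similarity map yields a soft-to-hard selection: in the toy example, the top row (similar vectors) is selected and the bottom row (dissimilar vectors) is not.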
The method remains accurate even when objects vary in shape and size, and the machine-learning model is unaffected by shadows or poor lighting, which can make the same material appear different.
Even though the researchers trained the model using only synthetic data, which is generated by a computer that alters 3D scenes to produce many varied images, the system works well on real indoor and outdoor scenes it has never seen before. The approach also extends to video: once a user selects a pixel in the first frame, the model can identify objects made of the same material throughout the rest of the video.
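The video extension described above can be sketched by reusing the feature vector of the pixel picked in the first frame as the query for every later frame. Again, this is an assumed workflow for illustration, not the authors' code; the per-frame feature maps and threshold are hypothetical.

```python
import numpy as np

def select_material_in_video(frame_features, query_row, query_col, thresh=0.9):
    """Propagate a first-frame pixel selection across a video.

    frame_features: list of (H, W, D) per-frame feature maps
    (hypothetical stand-in for the model's learned features).
    Returns one boolean selection mask per frame.
    """
    # Query feature comes from the pixel selected in the FIRST frame only.
    query = frame_features[0][query_row, query_col]
    masks = []
    for feats in frame_features:
        dots = feats @ query
        norms = np.linalg.norm(feats, axis=-1) * np.linalg.norm(query)
        masks.append(dots / (norms + 1e-8) > thresh)
    return masks

# Toy example: the "material" at (0, 0) moves between two frames.
frames = [np.array([[[1.0, 0.0], [0.0, 1.0]]]),
          np.array([[[0.0, 1.0], [1.0, 0.0]]])]
masks = select_material_in_video(frames, 0, 0)
```

Because the query is fixed from the first frame, the selection tracks the material even as it moves: in the toy example the selected region shifts from the left pixel in frame 0 to the right pixel in frame 1.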
Beyond scene understanding for robots, this technique could be used for image editing, incorporated into systems that infer the properties of materials in photographs, or applied to content-based web recommendation systems. The research will be presented at the SIGGRAPH 2023 conference.