Advances in deep learning have driven considerable progress in computer vision, and many companies now build applications that let customers visualise their products through a smartphone. With the aim of creating such an application for interior doors, this thesis evaluates deep-learning models that first detect and segment the doors in an image and then replace those doors with the actual background that lies behind them. Models based on convolutional neural networks and on vision transformers are compared against several criteria and metrics. The results indicate that the best way to segment a door is to combine an open-set object-detection model that requires no additional training, such as Grounding-DINO, with a state-of-the-art image segmenter such as Segment Anything. For door replacement, the suggested solution is to merge an image of the open door with an image of the same door closed, based on the segmentation results. The two images must first be aligned, which can be achieved by using a keypoint-detection model such as YOLOv8 to find corresponding points, computing the homography between the images from those points, and warping one image onto the other.
Detecting and Removing Objects from Pictures: the Case of Interior Doors
MIGLIONICO, M. (Author). 20 June 2024
Student thesis: Master types › Master in Computer Science, professional focus in Software Engineering
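
The alignment-and-merge step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes that four corresponding door-corner keypoints are already available for both images (in the thesis these would come from a keypoint detector such as YOLOv8), that the door mask is given as a grid of booleans, and that images are small nested lists of pixel values. The function names `solve_linear`, `homography_from_points`, `apply_homography` and `replace_door` are hypothetical, and the homography is obtained from a direct linear solve on exactly four point pairs.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """3x3 homography mapping four src (x, y) points onto four dst points.

    Fixes the bottom-right entry to 1, which leaves 8 unknowns and
    8 equations (two per point pair).
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, x, y):
    """Map the point (x, y) through h, dividing by the projective scale."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return u, v


def replace_door(closed, opened, mask, h):
    """Copy aligned pixels of `opened` into `closed` wherever mask is True.

    `h` maps coordinates of the closed-door image to the open-door image.
    Nearest-neighbour sampling; pixels mapped out of bounds are left as-is.
    """
    out = [row[:] for row in closed]
    for y, mask_row in enumerate(mask):
        for x, is_door in enumerate(mask_row):
            if not is_door:
                continue
            u, v = apply_homography(h, x, y)
            ui, vi = round(u), round(v)
            if 0 <= vi < len(opened) and 0 <= ui < len(opened[0]):
                out[y][x] = opened[vi][ui]
    return out
```

In practice, a library routine such as OpenCV's `cv2.findHomography` would typically be preferred, since it accepts more than four correspondences and can reject outlier keypoints.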