Week10: Progress and Plan


Object detection is a fundamental function in our project, so we worked on implementing it this week. To confirm that our model can detect objects successfully, we trained it to detect Japanese road signs; training took 8 hours without a GPU. We used SSD MobileNet and found that the model successfully finds the objects not only in still images but also in the streaming video that comes through the webcam. The image below shows that the model recognized the four different signs:
“stop”, “speed_10”, “speed_20”, “speed_30”.

[Image: detection results for the four road signs]



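For reference, the following is a minimal sketch of how a model like this can be run on webcam frames. It assumes a TensorFlow 1.x frozen graph exported by the Object Detection API; the model path is a placeholder for wherever the exported graph actually lives.

    import cv2
    import numpy as np
    import tensorflow as tf

    # Load the frozen inference graph exported by the Object Detection API.
    # The path is a placeholder; substitute the actual export location.
    PATH_TO_FROZEN_GRAPH = 'exported_model/frozen_inference_graph.pb'

    detection_graph = tf.Graph()
    with detection_graph.as_default():
        graph_def = tf.GraphDef()
        with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as f:
            graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')

    cap = cv2.VideoCapture(0)  # default webcam
    with detection_graph.as_default(), tf.Session(graph=detection_graph) as sess:
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        scores = detection_graph.get_tensor_by_name('detection_scores:0')
        classes = detection_graph.get_tensor_by_name('detection_classes:0')
        while True:
            ret, frame = cap.read()
            if not ret:
                break
            # The exported model expects a batch of RGB images.
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            b, s, c = sess.run([boxes, scores, classes],
                               feed_dict={image_tensor: np.expand_dims(rgb, 0)})
            # Report detections above a confidence threshold.
            for box, score, cls in zip(b[0], s[0], c[0]):
                if score > 0.5:
                    print('class %d detected with score %.2f' % (int(cls), score))
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
    cap.release()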
Having confirmed that the model detects objects successfully, we are now training it with images appropriate for our project, namely a desk and a cellphone. We put the desk in our detection categories because we want the application to return the location of the detected object relative to a familiar landmark. For example, the proposed application will return "the 'object' that you are looking for is near the 'desk'." A rough sketch of this idea follows.
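As a sketch only, here is one way the "near the desk" answer could be computed once detections are available. We assume detections come back as (label, box) pairs with boxes in the (ymin, xmin, ymax, xmax) form the Object Detection API uses; the function and landmark names are our own placeholders.

    # Report which landmark the queried object is closest to, by comparing
    # the centers of the detected bounding boxes.
    def box_center(box):
        ymin, xmin, ymax, xmax = box
        return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)

    def locate(query, detections, landmarks=('desk',)):
        targets = [d for d in detections if d[0] == query]
        anchors = [d for d in detections if d[0] in landmarks]
        if not targets or not anchors:
            return "the '%s' was not found" % query
        tx, ty = box_center(targets[0][1])
        # Pick the landmark whose center is nearest to the target's center.
        nearest = min(anchors,
                      key=lambda d: (box_center(d[1])[0] - tx) ** 2 +
                                    (box_center(d[1])[1] - ty) ** 2)
        return "the '%s' that you are looking for is near the '%s'" % (query, nearest[0])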

We collected the images with a Google open-source image downloading program. We use .jpg files instead of .jpeg or .png because of the difference in color channels: PNG images can carry a fourth alpha channel, while the training pipeline expects three RGB channels. A small conversion script like the one below handles this.
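This is a minimal conversion sketch using Pillow; the directory name is a placeholder for wherever the collected images live.

    import os
    from PIL import Image

    # Convert every .png/.jpeg image in IMAGE_DIR into a 3-channel .jpg.
    IMAGE_DIR = 'Downloads/desk1'  # placeholder for the collected-image folder

    for name in os.listdir(IMAGE_DIR):
        base, ext = os.path.splitext(name)
        if ext.lower() in ('.png', '.jpeg'):
            path = os.path.join(IMAGE_DIR, name)
            # convert('RGB') drops the alpha channel that PNGs may carry.
            Image.open(path).convert('RGB').save(
                os.path.join(IMAGE_DIR, base + '.jpg'), 'JPEG')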
Also, we generated the XML annotation files with labelImg. When we mark the area of the object we need to classify, an XML file is automatically generated containing the name of the image file, its folder, and its path.

We downloaded the images into the 'Downloads/desk1' folder, so when we move the images and XML files to the folder where we actually use them, the folder recorded in each XML file no longer matches. To solve this, we are rewriting the folder name in all of the XML files, as in the sketch below.
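A minimal sketch of that fix, assuming the labelImg annotations follow the usual Pascal VOC layout; the new directory name is a placeholder.

    import glob
    import os
    import xml.etree.ElementTree as ET

    NEW_DIR = 'images/train'  # placeholder for where the files now live

    # Point the <folder> (and <path>, if present) of every annotation at the
    # directory the images were moved to.
    for xml_path in glob.glob(os.path.join(NEW_DIR, '*.xml')):
        tree = ET.parse(xml_path)
        root = tree.getroot()
        root.find('folder').text = os.path.basename(NEW_DIR)
        path_node = root.find('path')
        if path_node is not None:
            filename = root.find('filename').text
            path_node.text = os.path.abspath(os.path.join(NEW_DIR, filename))
        tree.write(xml_path)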



We are also working on the web page for the user interface. Since our target users have physical disabilities, the page lets the user give the search term as speech input. For this function we used the Google speech-to-text (STT) service, but it is only available in the Chrome browser.
We are designing the web page to be convenient for people with physical disabilities.
For example, the button that signals the user will give speech input will be sized so that it is easy for the user to click.
 

For next week, we are going to work on the following steps.
First, we will enable the system to send the input word to the object detection program (a possible sketch is shown after this list).
Second, we will train the model with more of our customized data and test whether it detects the objects successfully.
Third, we will decide on a camera position that can detect the objects well in the room.
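For the first step, nothing is implemented yet; the following is only a hypothetical sketch of how the web page could hand the spoken word to the detection side. Flask, the /search route, and the JSON shape are all our assumptions.

    from flask import Flask, request, jsonify

    app = Flask(__name__)

    @app.route('/search', methods=['POST'])
    def search():
        # The web page would POST the transcribed word, e.g. {"query": "cellphone"}.
        query = request.get_json().get('query', '')
        # The query would then be forwarded to the object detection program,
        # e.g. the locate() helper sketched earlier.
        return jsonify({'message': "searching for '%s'" % query})

    if __name__ == '__main__':
        app.run(port=5000)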

We hope to complete the above so we can push our project further.
