TABLE OF CONTENTS
ABSTRACT v
CHAPTER 2: LITERATURE REVIEW 7
CHAPTER 3: REQUIREMENTS, ANALYSIS, AND DESIGN 18
- Overview 18
3.8.6 User Interface Design 31
CHAPTER 4: IMPLEMENTATION AND TESTING 34
Test case TC-001 (User Login) 40
Test case TC-002 (User Registration) 41
Test case TC-003 (Object Detection) 42
Test case TC-004 (Text-to-Speech) 43
- Test Traceability Matrix (for Unit Testing, Integration Testing, and System Testing) 44
- Test Report Summary (for Unit Testing, Integration Testing, and System Testing) 44
- Error Reports and Corrections 45
- Use Guide 45
- Summary 45
CHAPTER 5: DISCUSSION, CONCLUSION, AND RECOMMENDATIONS 46
Test case TC-001 (User Login) 57
Test case TC-002 (User Registration) 58
Test case TC-003 (Object Detection) 59
Test case TC-004 (Text-to-Speech) 60
LIST OF TABLES
TABLE 1.1 | RISK ASSESSMENT | 5 |
TABLE 3.1 | FUNCTIONAL REQUIREMENT SPECIFICATIONS | 24 |
TABLE 3.2 | NON-FUNCTIONAL REQUIREMENT SPECIFICATIONS | 25 |
TABLE 4.1 | TEST SUITE FOR LOGIN | 40 |
TABLE 4.2 | TEST SUITE FOR REGISTRATION | 41 |
TABLE 4.3 | TEST SUITE FOR OBJECT DETECTION | 42 |
TABLE 4.4 | TEST SUITE FOR TEXT-TO-SPEECH | 43 |
TABLE 4.5 | TEST TRACEABILITY MATRIX | 44 |
TABLE 4.6 | TEST REPORT SUMMARY | 44 |
LIST OF FIGURES
FIGURE 3.1 | AGILE METHODOLOGY VS WATERFALL METHODOLOGY | 19 |
FIGURE 3.2 | EXAMPLE OBJECT DETECTION TENSORFLOW LITE | 22 |
FIGURE 3.3 | ANDROID TEXT-TO-SPEECH WORKFLOW | 23 |
FIGURE 3.4 | APPLICATION ARCHITECTURE | 26 |
FIGURE 3.5 | USE CASE DIAGRAM | 27 |
FIGURE 3.6 | ACTIVITY DIAGRAM | 28 |
FIGURE 3.7 | DATA-FLOW DIAGRAM | 29 |
FIGURE 3.8 | ENTITY-RELATIONSHIP DIAGRAM | 30 |
FIGURE 3.9 | LOGIN PAGE | 31 |
FIGURE 3.10 | REGISTRATION PAGE | 32 |
FIGURE 3.11 | OBJECT RECOGNITION INTERFACE | 33 |
FIGURE 4.1 | LOGGED DETECTION RESULTS | 36 |
FIGURE 4.2 | LOGGED DETECTION RESULTS WITHOUT DELAY | 37 |
LIST OF ABBREVIATIONS
CPU | Central Processing Unit |
ERD | Entity Relationship Diagram |
IT | Information Technology |
ML | Machine Learning |
AI | Artificial Intelligence |
CV | Computer Vision |
CNN | Convolutional Neural Network |
RNN | Recurrent Neural Network |
RAM | Random Access Memory |
UML | Unified Modeling Language |
CHAPTER 1: INTRODUCTION
Overview
The aim of this project is to combine advances in smartphone technology, particularly in processing power and camera hardware, with advances in machine learning and computer vision to build a mobile application that helps the visually impaired carry out their day-to-day activities, and to contextualize this application for a Nigerian user base.
As smartphones have become increasingly ubiquitous, various day-to-day problems have found unique solutions. Multiple smartphone applications for the visually impaired have taken varying approaches: some are emergency-service based, while others recognize specific items, such as currencies, that are essential to daily life. For example, 'eyeNote' and 'LookTel' are applications that recognize currencies and audibly communicate the denomination to the user, whereas a project like 'BlindSighted' notifies the user with a vibration whenever they are within close range of an object (Ghantous, Nahas, Ghamloush and Rida, 2014). This project takes the approach of audibly communicating the names of objects recognized by the mobile application.
The following chapters of this thesis present the analysis, design, and implementation of this object detection system for the visually impaired.
Background and Motivation
Individuals with visual impairments are defined by the World Health Organization (WHO) as those who suffer from low vision or blindness (World Health Organization, 1992). Due to this impairment, these individuals face many challenges in their day-to-day activities.
Throughout human history, a myriad of devices and methods have been devised to overcome these difficulties: traditional aids such as walking sticks and reading glasses; Braille, a system of touch-based reading and writing; and, more recently, assistive devices. Assistive devices (specialized high- and low-technology tools designed for individuals with disabilities) increase the ability of visually impaired individuals to understand their environment. They range from specialized screen-reading software and magnification programs to DAISY book readers (Martiniello et al., 2019). Despite their established utility, widespread adoption of these devices has been hindered by factors such as cost and the negative perceptions associated with vision loss (Mulloy et al., 2014).
According to the World Health Organization, at least 2.2 billion people globally suffer from a visual impairment or blindness. Of these, at least 1 billion have a visual impairment that could have been prevented or has yet to be addressed (World Health Organization, 2020).
In the past few decades, smartphones and tablets have become increasingly popular and are now a staple of mainstream society. Over time, as a result of technological advancements, a large number of built-in accessibility tools have been incorporated into these devices, creating and maximizing accessibility for users with a diverse set of needs (Martiniello et al., 2019). Unlike traditional assistive devices, these devices have already achieved widespread adoption; furthermore, they are more affordable and less likely to draw attention to the user, avoiding negative perceptions. Alongside the built-in accessibility tools, smartphone operating systems provide developer platforms that allow third parties to leverage the device's capabilities to build applications for users, among them assistive and accessibility applications.
Given the ubiquity of smartphones – there are currently about 3.5 billion smartphones worldwide, the majority of which run either iOS or Android – it makes sense to build applications, especially those geared towards accessibility, on these platforms. Leveraging this ubiquity allows us to deliver accessibility faster to those who need it most.
Furthermore, smartphone camera technology and computer vision algorithms have both been improving at a rapid rate. With object recognition, one could simulate seeing for the visually impaired better than the traditional methods currently available, without compromising on cost or societal perception. All of this can be achieved with currently available technology: camera hardware, computer vision algorithms, and a text-to-speech engine, combined into an object recognition application that labels surrounding objects and audibly communicates each recognized label to the user.
Adaptability and continued support are facets of smartphones and smartphone applications that traditional assistive devices lack; alongside cost and stigma, their absence is another factor driving the abandonment of traditional assistive devices (Phillips and Proulx, 2018). This is a further motivation for this project, as smartphone applications benefit from 'over the air' updates that improve the user experience over time. Moreover, applications can be contextualized for different demographics along multiple criteria, for example age and geographic location. This is paramount in the case of object recognition applications built with artificial intelligence: object recognition models should be adaptable to various languages and audiences to ensure that all users can adequately benefit from them.
Statement of the Problem
The ability to recognize objects and one's surroundings is a quintessential aspect of operating self-sufficiently; even the most menial, routine tasks rely on this capability. Hence, operating independently can become extremely difficult for those who suffer from visual impairments.
Due to their impairment, visually impaired individuals can face many hurdles in tasks that others might consider simple. Attempts to address this range from reading glasses and walking sticks to surgery, yet these may be financially infeasible for some or amount to mere workarounds. Assistive technologies have also been shown to increase users' access to their environment and to information; however, they have failed to achieve widespread adoption for a plethora of reasons, namely cost, lack of technical support, and the stigma attached to using these devices in public.
By leveraging the widespread adoption of smartphone technology, these issues can be curbed without compromising on cost, support, or social acceptance, and a solution can be made instantly and widely available: an application that uses computer vision and the smartphone camera to recognize objects in a user's surroundings and audibly communicate them.
Aim and Objectives
This project proposes an object recognition system for the visually impaired, built as a mobile application running on Android. Android Studio and Java will be used to design the graphical user interface as well as the functionality. The application will provide an intuitive user interface that opens directly to the camera, labels nearby objects using object recognition models, and audibly communicates those labels to the user. Over time, the project will also evolve to include further contextualized models for various demographics.
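The core loop described above (capture a camera frame, run object recognition, speak the resulting label) can be sketched in Java as follows. This is a minimal, framework-agnostic sketch, not the project's actual implementation: the `FrameSource`, `ObjectDetector`, and `SpeechEngine` interfaces are hypothetical stand-ins for the Android camera API, a recognition model such as a TensorFlow Lite detector, and the platform text-to-speech engine, respectively.

```java
import java.util.List;

// Hypothetical abstractions over the camera, the recognition model,
// and the text-to-speech engine.
interface FrameSource { byte[] nextFrame(); }
interface ObjectDetector { List<String> detect(byte[] frame); }
interface SpeechEngine { void speak(String label); }

public class DetectAndSpeak {
    private final FrameSource camera;
    private final ObjectDetector detector;
    private final SpeechEngine tts;
    private String lastSpoken = "";

    public DetectAndSpeak(FrameSource camera, ObjectDetector detector, SpeechEngine tts) {
        this.camera = camera;
        this.detector = detector;
        this.tts = tts;
    }

    /** Process one camera frame: detect objects and announce new labels. */
    public void processFrame() {
        byte[] frame = camera.nextFrame();
        for (String label : detector.detect(frame)) {
            // Avoid repeating the same label on consecutive frames,
            // so the user is not overwhelmed by continuous speech.
            if (!label.equals(lastSpoken)) {
                tts.speak(label);
                lastSpoken = label;
            }
        }
    }
}
```

In the real application, the per-frame debouncing shown here would be one of several measures (alongside confidence thresholds and detection delays, discussed in later chapters) to keep the audio feedback useful rather than overwhelming.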
Significance of the Project
The implementation of this project has the potential to benefit both the visually impaired and Nigerian society. It would be immediately helpful to the visually impaired, aiding in the execution of daily activities and thereby having an overall positive impact on their lives.
This project could also shed some light on how artificial intelligence systems need to be contextualized for the societies in which they are deployed, hopefully encouraging Nigerian developers to participate in building models that capture the nuances and idiosyncrasies of Nigerian society better than models built by outside developers could.
By helping the visually impaired community and by encouraging Nigerian developers to build tools for Nigerians, this project will have a lasting positive impact on Nigerian society as a whole. Furthermore, it highlights the importance of tailoring artificial intelligence and machine learning to each target demographic and audience.