Artificial Intelligence (AI) has been put forward as the technology that will change the world in the coming decades. Many applications have already seen the light of day, including content recommendation, spam filtering, search engines, voice recognition, chatbots, computer vision, handwriting recognition, machine translation, financial fraud detection, medical diagnosis, education, transport and logistics, autonomous vehicles, and optimization of storage facilities. However, creating AI applications also introduces challenges, some of which come from related technologies and areas, while others are specific to AI. In order to create sustainable AI systems, several key aspects have to be considered from the beginning of the development process (“by design”) rather than being applied as an afterthought. These aspects include data, security, privacy, and fairness.
Data by Design
Data by Design means that an organization treats data (collection, storage, analysis, and usage) as an integral part of doing business. Many non-digital organizations that do not apply this principle suffer from typical problems such as:
- Data accessibility. Too often, data is hidden in complex IT systems and/or sits with a vendor. Getting access to the data is often costly and time-consuming.
- Data ownership. Organizations work with many service providers to deliver their end-to-end value proposition. Oftentimes, the contracts with those providers do not clearly state who owns the data, leading to confusion and complex conversations when the data is needed for a new value proposition.
- Data quality. When data is not managed as an asset, there are no quality procedures in place. Checking data quality as late as the analytics phase is complex and expensive; quality checks should instead be automated as close as possible to the source of the data.
Organizations that fulfill the Data by Design principle have instant access to all relevant data with sufficient quality and are clear on the ownership of the data for the foreseen uses.
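To make the point about automating quality checks close to the source more concrete, here is a minimal sketch in Python. It assumes records arrive as dictionaries and that invalid ones are set aside at ingestion time rather than discovered later during analytics. The field names (customer_id, age, email), the rules, and the function names are hypothetical and only serve as illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


@dataclass
class Rule:
    """A named quality rule; `check` returns True when the record passes."""
    name: str
    check: Callable[[Record], bool]


# Hypothetical rules for a hypothetical customer record.
RULES: List[Rule] = [
    Rule("customer_id present", lambda r: bool(r.get("customer_id"))),
    Rule(
        "age in plausible range",
        lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
    ),
    Rule("email contains @", lambda r: "@" in str(r.get("email", ""))),
]


def validate(record: Record, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules the record fails (empty list = clean)."""
    return [rule.name for rule in rules if not rule.check(record)]


def ingest(records: List[Record]) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
    """Split incoming records into accepted ones and rejected ones with reasons."""
    accepted: List[Record] = []
    rejected: List[Tuple[Record, List[str]]] = []
    for record in records:
        failures = validate(record)
        if failures:
            rejected.append((record, failures))
        else:
            accepted.append(record)
    return accepted, rejected
```

Running such checks at the point of ingestion means that a rejected record is reported together with the reason while the source system is still in a position to correct it.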
Security by Design
AI systems are powerful systems that can do much good in the hands of good people but, by the same token, can do much harm in the hands of bad people. Therefore, one of the key aspects of AI development is “Security by Design”. Security by Design is “an approach to software and hardware development that seeks to make systems as free of vulnerabilities and impervious to attack as possible through such measures as continuous testing, authentication safeguards and adherence to best programming practices.” It puts emphasis on security risks at all phases of product development, including the development methodology itself: requirements, design, development, testing, deployment, operation, and maintenance. It also extends to third parties involved in the creation process.
Privacy by Design
AI systems are fuelled by data, and therefore another important principle is “privacy by design”. Privacy by design calls for privacy and data protection to be considered throughout the whole engineering process. It was originally developed by Dr. Ann Cavoukian, Information and Privacy Commissioner of Ontario, and is based on seven principles:
- Proactive not reactive; preventative, not remedial
- Privacy as the default setting
- Privacy embedded into the design
- Full functionality – positive-sum, not zero-sum
- End-to-end security – full lifecycle protection
- Visibility and transparency – keep it open
- Respect for user privacy – keep it user-centric
Figure 1: The seven principles of Privacy by Design
Fairness by Design
AI systems support us in making decisions or make decisions on our behalf. AI and Machine Learning (a subfield of AI) have proven to be very effective in analyzing huge amounts of data to come up with “objective” insights. It is those insights that help us make more objective, data-driven decisions. However, when we let Machine Learning techniques come up with those insights, we need to make sure that the results are fair and explainable, especially when decisions have an impact on people’s lives, such as medical diagnosis or loan granting. In particular, we need to make sure that:
- The results do not discriminate between different groups of people on the basis of race, nationality, ethnic origin, religion, gender, sexual orientation, marital status, age, disability, or family responsibility. We therefore need to minimize the likelihood that the training data sets we use create or reinforce unfair bias or discrimination.
- When optimizing a machine learning algorithm for accuracy in terms of false positives and false negatives, the impact on the specific domain is considered. A false positive is when the system “thinks” someone has, for example, a disease, whereas the person is healthy. A false negative is when a person who has the disease is incorrectly classified as healthy. With fewer false positives and false negatives, an algorithm is more accurate; however, minimizing one usually increases the other. Depending on the domain, false positives and false negatives may have different impacts, and those impacts need to be taken into account when optimizing algorithms.
- The AI systems are able to explain the “logic” of why they have come to a certain decision, especially for life-impacting decisions. AI systems should not be black boxes.
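The first two checks in the list above can be illustrated with a small Python sketch. It assumes a classifier that outputs a score between 0 and 1 and a decision threshold that turns scores into positive or negative predictions. The data, the group labels “a” and “b”, and the function names are all made up for illustration. Comparing positive-prediction rates across groups is only one of several possible fairness indicators, not a complete fairness audit.

```python
from typing import Dict, List


def confusion_counts(y_true: List[int], scores: List[float], threshold: float) -> Dict[str, int]:
    """Count true/false positives and negatives at a given decision threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and truth:
            counts["tp"] += 1
        elif predicted_positive and not truth:
            counts["fp"] += 1  # healthy person flagged as sick
        elif truth:
            counts["fn"] += 1  # sick person classified as healthy
        else:
            counts["tn"] += 1
    return counts


def positive_rate_by_group(groups: List[str], scores: List[float], threshold: float) -> Dict[str, float]:
    """Share of positive predictions per group (a simple disparity indicator)."""
    totals: Dict[str, int] = {}
    positives: Dict[str, int] = {}
    for group, score in zip(groups, scores):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if score >= threshold else 0)
    return {group: positives[group] / totals[group] for group in totals}


# Made-up example data: 1 = has the disease, 0 = healthy.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.8, 0.5]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

# A low threshold removes false negatives but admits false positives;
# a high threshold does the opposite.
low = confusion_counts(y_true, scores, threshold=0.35)
high = confusion_counts(y_true, scores, threshold=0.75)
rates = positive_rate_by_group(groups, scores, threshold=0.5)

print("threshold 0.35:", low)
print("threshold 0.75:", high)
print("positive rate per group at 0.5:", rates)
```

On this toy data, the low threshold yields no false negatives but two false positives, while the high threshold yields no false positives but two false negatives. Which of the two is preferable depends on the domain: in medical screening a missed disease may be far more costly than a follow-up test for a healthy person.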
When we build AI systems using these four principles (Data by Design, Security by Design, Privacy by Design, and Fairness by Design), we can be more assured that we build systems that not only perform well but are also secure, privacy-respecting, and ethical. This, in turn, will lead to greater acceptance of AI systems by societies and governments in the long run.