Over the last few months, I’ve been interested in artificial intelligence and its relevance to eye care.
To dive deep into this topic, we will first need to understand the power of algorithms. After that, we will dive into deep learning, a form of artificial intelligence (AI). We will then focus on the applications of AI in eye care, and conclude with the ethics of AI.
Here is a video summary of the topic:
An algorithm is what a computer uses to solve a problem: it takes input and produces the desired output.
Here is an example. We have a list of names that we need to sort alphabetically. We could do this by hand, but let’s leverage the power of computing to solve this problem for us instead. After all, a computer can perform tasks tirelessly and without error, if it is programmed correctly of course.
Also, imagine if we had hundreds or thousands of patients; the task then becomes very long and tedious.
We can construct a simple algorithm that loops through the list of names, comparing each to the name currently at the top of the list. If a name falls earlier in the alphabet than the current top name, it takes the top spot. We then repeat this for every position in the list.
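Sketched in Python first (the patient names here are made up for illustration), the idea looks like this:

```python
def sort_names(names):
    # Selection-style sort: for each position, swap in the
    # alphabetically earliest remaining name.
    names = list(names)  # work on a copy
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if names[i] > names[j]:  # names[i] comes later in the alphabet
                names[i], names[j] = names[j], names[i]
    return names

print(sort_names(["Moana", "Aroha", "Wiremu", "Nikau"]))
# ['Aroha', 'Moana', 'Nikau', 'Wiremu']
```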
I’ve coded this example in C. Here it is below.
We have some interesting functions:
- strcmp(string1, string2) – this compares the two strings; if the return value is above 0, then string1 is ‘greater’ than string2. In other words, string1 comes after string2 in the alphabet.
- strcpy(string1, string2) – when the above condition is met, this copies the value of string2 into string1. In our case, the later-in-the-alphabet name is first copied to a temporary string s, then overwritten by the earlier-in-the-alphabet name; the name held in s then takes its place.
We can see that two nested for loops are implemented to make this happen. This means that if the list of names grows to length n, the time this algorithm takes to execute grows by a square factor, n^2. So this is not the most efficient algorithm, but it works and I’m happy with that.
We compile the code and then run it. We can see the output below.
The Limits of Algorithms — Why we need AI
Sorting names alphabetically is quite easy for computers. However, grading retinal images for diabetic retinopathy is a challenging problem for a computer to perform.
Grading retinal images for diabetic retinopathy (DR) is an image classification problem. We have images, and we need to grade them into these possible categories: healthy, background DR, or referable DR.
Why is this a noble pursuit?
In Aotearoa, 250,000 people have diabetes and a quarter have DR. Fortunately, individuals with diabetes are screened with routine retinal imaging taken at least biennially. But, unfortunately, that’s a lot of images to grade! On top of this, grading these images requires experts, of whom there are too few.
Can AI plug this gap?
What is Deep Learning?
But first, what is deep learning? Deep learning is a subset of machine learning. Machine learning is where computers are programmed to learn from data, using methods from statistics, like regression, and optimisation techniques from calculus.
Deep learning takes these mathematical principles to the next level. Taking inspiration from the human brain’s architecture, deep learning involves virtual neurons that interconnect and interact. An example of what deep learning can do is classify images.
Images are translated into a pixel-by-pixel array. This is then fed to the input layer. The input layer connects to intermediate or hidden layers, which respond to certain elements of the image. The earlier layers respond to edges and diagonals, while the later hidden layers respond to more complicated elements of the image. In the end, an output layer provides the result (e.g. cat or dog).
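As a rough illustration of this flow, here is a toy fully connected network with made-up layer sizes and untrained random weights; a real model learns its weights from data:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy network: 64 input pixels -> 16 hidden units -> 2 outputs.
pixels = rng.random(64)                       # a flattened 8x8 "image"
W_hidden = rng.standard_normal((16, 64))      # input -> hidden layer
W_out = rng.standard_normal((2, 16))          # hidden -> output layer

hidden = np.maximum(0.0, W_hidden @ pixels)   # hidden layer with ReLU activation
logits = W_out @ hidden                       # raw scores for each class
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()                   # softmax: e.g. P(cat), P(dog)

print(probs)                                  # two probabilities summing to 1
```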
The deep learning model is able to respond in this way because it has been trained on known images. This means well-curated data is needed to train the model, so it responds in the way we want.
Let’s create our own image classifier. The task is to classify metal frames and acetate frames. We collect this data using Bing Image Search.
However, we can see that this isn’t perfect. We get metal frames, but not quite the ones we were thinking of.
Back to the drawing board… Gathering data is difficult. In the next newsletter, we will dive deeper into deep learning. Let’s change the scope of our task to detecting blue eyes and brown eyes.
And here we go, running an image search from Bing:
We can already see some immediate problems with our data. There are brown eyes labelled as blue eyes, possibly due to makeup. There is even an image with one blue eye and one brown eye. But we will proceed anyway.
Next, we augment our images. This involves manipulating each image at random: changing the contrast and brightness, even skewing the image. This aims to increase the size of the data set, but it also creates random variation, increasing the chance that the model will generalise beyond its training data.
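A minimal sketch of this idea on a greyscale image stored as a 0–1 array (dedicated libraries offer far richer transforms, like rotation and warping):

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(image):
    """Return a randomly altered copy of a greyscale image (2-D, values 0-1)."""
    out = image.copy()
    if rng.random() < 0.5:                    # random horizontal flip
        out = out[:, ::-1]
    out = out * rng.uniform(0.8, 1.2)         # random brightness scaling
    out = out + rng.uniform(-0.1, 0.1)        # random overall shift
    return np.clip(out, 0.0, 1.0)             # keep values in the valid range

image = rng.random((8, 8))
batch = [augment(image) for _ in range(4)]    # four varied copies of one image
```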
Now, we train the model on the data we have collected and augmented. We will be using a convolutional neural network, a type of neural network that is well suited to image recognition and loosely modelled on the human visual system.
We give the model four passes through the dataset, or “epochs”, numbered zero to three. We can see the error_rate reduce with each epoch.
Under the hood, all 150 images are split into training and validation sets (usually 80% and 20%, respectively). The model trains on the training images and is then validated against the validation set.
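The split itself is straightforward; a minimal sketch with hypothetical filenames:

```python
import random

random.seed(0)                                    # reproducible shuffle
images = [f"eye_{i:03}.jpg" for i in range(150)]  # hypothetical filenames
random.shuffle(images)    # shuffle so both sets reflect the whole collection

split = int(len(images) * 0.8)                    # 80% train, 20% validation
train, valid = images[:split], images[split:]

print(len(train), len(valid))                     # 120 30
```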
Running over the validation examples, we can determine which images the model performs worst on.
Here, we have examples where the model misidentified the eye colour. Some examples have the colour blue elsewhere in the image; the model is likely picking up the colour of the background or any makeup present. Furthermore, we knew our training data wasn’t perfect to begin with.
Below we have a confusion matrix showing the actual eye colours against what the model predicted. We can see some incorrect predictions, but there is a clear trend of correct identification (better than flipping a coin!).
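To show how such a matrix is read, here is a small sketch with made-up counts for a 30-image validation set (the numbers are purely illustrative, not the model’s actual results):

```python
from collections import Counter

# Hypothetical (actual, predicted) pairs for a 30-image validation set.
results = ([("blue", "blue")] * 13 + [("blue", "brown")] * 4
           + [("brown", "brown")] * 10 + [("brown", "blue")] * 3)

counts = Counter(results)
for actual in ("blue", "brown"):
    # One row per actual colour, one column per predicted colour.
    print(actual, [counts[(actual, pred)] for pred in ("blue", "brown")])

# The diagonal (actual == predicted) holds the correct predictions.
accuracy = sum(n for (a, p), n in counts.items() if a == p) / len(results)
print(f"accuracy: {accuracy:.0%}")  # 77%, comfortably better than a coin flip
```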
Let’s test this model on some real-world examples. I’d love to thank my friends who gave me permission to use their handsome faces in the name of whatever I’m trying to accomplish.
We can definitely tell that this fine gentleman has brown eyes. The model, on the other hand, is pretty certain he has blue eyes, with a probability greater than 90%. It is most likely picking up on the blue clothing.
Next, we have a correct prediction, but with a not-very-confident probability of 69% (oh yeah!).
Finally, we try it on me. Again, correct prediction but not very confident.
We can see there are some limitations to the model we have created. There are two major barriers:
- Imperfect data to train the model.
- A CNN trained from scratch.
The imperfect data is self-explanatory. The issue with the second is that our model is fresh: it has to learn edges, colours, and patterns from scratch. Instead, we can use transfer learning, where a pre-trained model is used as the starting point. A pre-trained model is already attuned to the basics of what is in an image, so our data is better utilised for our actual objective: eye colour.
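The idea can be shown in miniature with synthetic numbers (everything below is made up for illustration): freeze a stand-in ‘pretrained’ feature extractor and fit only a new output head for our task:

```python
import numpy as np

rng = np.random.default_rng(1)

# W_frozen stands in for a pretrained network that already extracts
# useful features (edges, textures); we keep it fixed.
W_frozen = rng.standard_normal((16, 8))

def features(x):
    return np.maximum(0.0, W_frozen @ x)     # frozen feature extractor (ReLU)

X = rng.random((200, 8))                     # stand-ins for images
y = (X[:, 0] > 0.5).astype(float)            # stand-ins for labels

F = np.array([features(x) for x in X])
w_head, *_ = np.linalg.lstsq(F, y, rcond=None)  # train only the small head

train_accuracy = ((F @ w_head > 0.5) == (y > 0.5)).mean()
print(train_accuracy)
```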
Ethics of AI
AI is exciting!
But we found there are a few limitations. One was with the data used to train the model: unclean data produces unclean results. The second is that the model we used began life very naive. We can instead use a pre-trained model that already knows the basics of an image, like edges and textures, so all it has to learn is eye colour.
Determining eye colour, though a fun introductory project, has very low consequences. In other words, if the model gets it wrong, it’s humorous and we can move on with our lives. What happens when the consequences carry more weight? Flying a plane full of people? Orchestrating a network of motor vehicles? Diagnosing or discharging a patient? Now the stakes are serious. Life or death.
Pedalling back to our eye colour model, an immediate problem was with our data. Apart from the data set being relatively small, the data was far from ideal.
Some of the photos had makeup, and some labelled as ‘blue eyes’ showed both brown eyes and blue eyes. It also seems that more images of fair-skinned people have blue eyes, while brown eyes are represented across a multitude of skin colours. When training a model, biases present in the data are amplified.
Changing gears to eye care and health. Currently, in Aotearoa, individuals with diabetes are enrolled in a diabetic screening service. In New Zealand, over a quarter of a million people have diabetes, a number estimated to rise, and each needs retinal images taken and graded at least every two years. Already stretched health services will be pushed beyond their limits. Not to mention, district health boards are almost half a billion dollars in debt.
Grading retinal images usually requires specially trained individuals, of whom there are too few. The task involves pattern recognition on images. Just like the AI model we trained to classify blue eyes and brown eyes, we can do the same with retinal images.
There already exist data sets of graded retinal images, which can be used to train a model. Such a model could then grade new retinal images, reducing the burden on healthcare.
This is the idea on paper.
But as we discovered, it’s not easy training a model to do what we want.
Even in health data, there is bias. A large proportion of Pacific Peoples have diabetic eye disease. There is a chance the model would correlate a pigmented fundus with diabetic eye disease rather than the actual clinical signs.
Furthermore, the data needs to be diverse. For example, a model trained only on European eyes won’t be appropriate for the New Zealand population, given our ethnic diversity. The model will likely perform better on one ethnic group (usually the majority) and poorly on minority groups.
This can result in further health disparity when AI should be trying to plug the leaks in healthcare systems.
The topic of ethics in AI is very deep. We must understand that AI is a powerful tool. Neither good nor bad. It is how we implement this tool that is important.
Diverse teams are an excellent way to combat bias in developing AI models and products. Members can provide different insights from differing backgrounds.
Algorithms allow us to process data faster and more accurately. They are excellent for simple and repetitive tasks, like sorting names in a list. However, many repetitive tasks cannot be easily programmed into a computer.
The next stage is utilising deep learning, a subset of machine learning loosely modelled on the human brain.
This is what we can use to recognise images, as in the case of diabetic retinal screening.
We can see that AI has great potential, but we must ensure that we approach it with ethics in mind. AI is a powerful tool. A tool on its own has no moral weight, but it can be used for good or evil.