We Don’t Need to Worry About Pneumonia Anymore
I remember the day so vividly, yet so hazily at the same time. I couldn’t breathe. Even strangers walking past asked if I was okay. Leaning on the rails at a Chinese subway station, I could barely keep my balance until, eventually, I fell unconscious.
Three hours later, I woke up in a hospital with an IV in my arm, and I was okay. I already had a history of pneumonia (thanks, asthma, I appreciate it). However, many people aren’t as fortunate as I was to be so close to a high-quality hospital that could quickly diagnose the problem.
In 2020, the World Health Organization reported that pneumonia is the single largest cause of death in children under the age of 5, accounting for 18% of all child deaths. To put that percentage in terms of impact: every year, roughly 1.4 million children die of pneumonia. To make matters worse, many doctors don’t have the necessary tools to diagnose pneumonia, and even when they do, their diagnoses often simply aren’t accurate enough.
Although there are a variety of tests, from blood tests to pulse oximetry to sputum tests, the most reliable is the chest X-ray. However, a whopping two-thirds of the planet lacks access to basic radiology services.
This is especially unfair in developing countries. To put this into context, the city of Boston, which has a population of 680,000 people, is home to 126 radiologists. Kenya, which has 43 million people, has only 200 radiologists. Something about that ratio seems a bit off…
For a disease that is so common amongst all ages, it’s crazy that much of the world seems so unprepared. It’s time we put an end to it.
Tools and Technologies
- Google Colab — Colab is a free, online cloud-based Jupyter Notebook that allows you to type Python code and train machine learning and deep learning models right in your browser.
- TensorFlow — TensorFlow is an open-source software library for machine learning. It comes pre-installed in Google Colab.
- Keras — Keras is an open-source neural network library.
- Convolutional Neural Networks — CNNs are a particular type of neural network that is specifically tailored to be able to process image data, which at the end of the day, is just a massive matrix of numbers.
- Your Curiosity — Learning to create machine learning models is a very iterative process. You will consistently run into new obstacles, but your curiosity will push you through 👀.
Plan of Attack
It’s usually best to have some kind of plan. Here’s ours:
- Collecting the dataset
- Importing libraries
- Splitting the data
- Building our CNN
- Training and Testing
- Predicting
Step 1 — Collect The Data
Before we start building our network, we need to find a dataset that our network will be able to train and test on. I found such a dataset on Kaggle. It contains 5,863 chest X-ray images, which are divided into 3 categories: train, test, and validation. Here’s an example of a chest X-ray scan:
Each of these categories has 2 subfolders which separate Normal X-ray images and Opacity (pneumonia) X-ray images.
Because the images are already separated for us, we can use this to our advantage in our code by automatically generating appropriate labels based on the folder names. And with that segue, let’s get into the code.
Step 2 — Importing Libraries
To utilize the full power of these Python libraries, we first need to import them into our code. Notice that I enabled a GPU runtime for TensorFlow; you’ll want this to make training quicker.
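The exact imports depend on your setup, but a minimal sketch for a Colab notebook could look something like this (the aliases and the GPU check are my assumptions, not necessarily the author’s exact cells):

```python
import numpy as np                       # array handling, useful later for single-image predictions
import tensorflow as tf                  # core deep learning framework
from tensorflow import keras             # high-level neural network API
from tensorflow.keras import layers      # Conv2D, MaxPooling2D, Flatten, Dropout, Dense, etc.
from tensorflow.keras.preprocessing.image import ImageDataGenerator  # image loading and rescaling

# Confirm that Colab's GPU runtime is visible to TensorFlow.
print(tf.config.list_physical_devices('GPU'))
```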
Step 3 — Splitting the Data
Image preprocessing is an essential part of creating any CNN, so the first step is converting the images’ RGB coefficients into something that our computer can cope with computationally. By dividing all of the coefficients by 255, we’re essentially changing the scale from 0–255 to 0–1; this makes the values far easier for our computer to work with and makes it simple to turn the network’s output into a prediction based on a certain threshold.
Thankfully, our dataset is already split into normal folders and pneumonia folders. This means that all we have to do is point to their directories and flow them into their respective generators.
train.flow_from_directory makes things even better, automatically generating labels based on the name of the folder each image is in. As you can see, our computer found 4,192 images across the 2 classes in the train set and 1,040 images across the 2 classes in the test set.
P.S. I actually chose to use the validation set as my test set because it had more data samples to use, but this doesn’t change anything about the CNN itself.
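For reference, here’s a rough sketch of what this step can look like with Keras’ ImageDataGenerator; the directory paths, target size, and batch size are assumed placeholders rather than the original notebook’s values:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale every pixel from the 0-255 range down to 0-1, as described above.
train = ImageDataGenerator(rescale=1./255)
test = ImageDataGenerator(rescale=1./255)

# The folder paths are placeholders; point them at wherever the Kaggle dataset is unzipped.
training_set = train.flow_from_directory(
    'chest_xray/train',         # contains the normal and opacity (pneumonia) subfolders
    target_size=(150, 150),     # assumed image size
    batch_size=32,
    class_mode='binary')        # normal vs. pneumonia becomes a single 0/1 label

test_set = test.flow_from_directory(
    'chest_xray/val',           # the validation folder, used here as the test set
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')
```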
Step 4 — Building our CNN
There are several important concepts that allow Convolutional Neural Networks to adequately simplify and comprehend image data.
1. Convolution is the idea of applying filters to the input image and sliding them across the entire input, thus transforming the image and allowing our network to discern certain features.
Let’s look at this image as an example. Notice the input data on the left, the weighted kernel (filter) in the middle, and the convolved feature on the right. Without hanging off the edge of the input data, the weighted kernel can position itself in exactly 9 different positions, hence the simplified, convolved feature on the right having 9 spots.
The top-left value of the kernel is multiplied by the top-left value of the highlighted portion of the input data, then we add that to the top-middle times the top-middle, plus the top-right times the top-right, and so on. The math is shown on the right of the image, and so 4 is placed into its appropriate position in the returned matrix.
So how does the weighted kernel actually return a feature? It does so by applying weights to the pixels, essentially determining their level of importance.
When looking at the input matrix on the left, a human could quite clearly recognize that all of the pixels with the value of 10 next to the pixels with the value of 0 would create a vertical line, but that’s because we can observe them all at once. For our computer, this would be computationally expensive and inefficient, and thus we apply this weighted kernel to our matrix.
The idea behind it is that if all of the pixels are the same, the kernel will simply return 0, because the values times 1 plus the values times -1 cancel each other out. However, when these values aren’t canceled out, the kernel returns a convolved feature like our matrix on the right, which quite clearly depicts a vertical edge in a simpler form that our computer will have an easier time understanding.
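To make that arithmetic concrete, here’s a tiny NumPy sketch of a vertical-edge kernel sliding over a toy input; the specific numbers are illustrative and not taken from the figure:

```python
import numpy as np

# A toy input: left half bright (10), right half dark (0), i.e. a vertical edge.
image = np.array([
    [10, 10, 10, 0, 0, 0],
    [10, 10, 10, 0, 0, 0],
    [10, 10, 10, 0, 0, 0],
    [10, 10, 10, 0, 0, 0],
    [10, 10, 10, 0, 0, 0],
    [10, 10, 10, 0, 0, 0],
], dtype=float)

# A 3x3 vertical-edge kernel: +1 on the left column, -1 on the right column.
kernel = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
], dtype=float)

# Naive convolution: slide the kernel over every valid position and sum the products.
out_h = image.shape[0] - kernel.shape[0] + 1
out_w = image.shape[1] - kernel.shape[1] + 1
feature = np.zeros((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        patch = image[i:i+3, j:j+3]
        feature[i, j] = np.sum(patch * kernel)

print(feature)
# Uniform regions cancel out to 0; only the columns straddling the edge light up (30),
# which is exactly the vertical edge the convolved feature reveals.
```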
2. Pooling builds on that final idea of convolution, further simplifying our input data. Essentially, pooling is the simplification of regions of pixels; this allows us to significantly reduce the size of our input.
Think about it this way. Pixels are so small that pixels next to each other are very likely to be quite similar. Thus, we can sacrifice a tiny bit of our network’s accuracy for much greater efficiency by generalizing areas of pixels into one value, which essentially represents the whole area.
One way of doing this is Max-Pooling, where the value kept for each region is simply the highest pixel value within that region.
3. Flattening is the process of taking our simplified, extracted feature maps and unrolling them into a single one-dimensional vector that can be fed into a traditional, fully connected neural network. We can see exactly where the flattening process takes place in the example below.
Notice: In this example CNN, the user wants to classify an input image into multiple categories (car, truck, van, bicycle, etc.), and thus they appropriately use the Softmax Activation Function, which assigns a probability to each output class so the network can choose the one with the highest probability. In our situation of determining whether an X-ray scan is normal or shows pneumonia, we will use the Sigmoid Activation Function instead.
4. The final concept is Dropout, which randomly deactivates a fraction of the network’s neurons during training so that our network does not become too dependent on our training data to the point where it loses accuracy on general data. This prevents overfitting.
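Putting those four ideas together, a minimal Keras sketch of such a network might look like the following; the layer sizes, input shape, and dropout rate are assumptions rather than the author’s exact architecture:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    # Convolution + pooling blocks extract and then simplify features from the scans.
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    # Flattening turns the 2D feature maps into one long vector for the dense layers.
    layers.Flatten(),
    layers.Dense(128, activation='relu'),

    # Dropout randomly deactivates neurons during training to help prevent overfitting.
    layers.Dropout(0.5),

    # A single sigmoid output: values near 0 mean normal, values near 1 mean pneumonia.
    layers.Dense(1, activation='sigmoid'),
])

model.summary()
```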
Step 5 — Training and Testing
Now that the bulk of the heavy lifting is done, our model is ready to train and test itself. As we can see, it becomes more accurate after each epoch, tapering off at about 96% accuracy in both training and testing!
Right now, we’ve struck a good balance between underfitting and overfitting, so it’s best to leave our network here. After all, 96% is pretty spectacular!
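For reference, compiling and fitting a model like this might look like the sketch below; the optimizer, loss, and epoch count are assumptions rather than the author’s exact settings:

```python
# `model`, `training_set`, and `test_set` come from the earlier sketches.
# Binary cross-entropy pairs naturally with the single sigmoid output.
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train on the training generator and validate on the re-purposed validation set.
history = model.fit(training_set,
                    epochs=10,               # assumed epoch count
                    validation_data=test_set)

# Report the final accuracy on the held-out set.
loss, accuracy = model.evaluate(test_set)
print(f'Test accuracy: {accuracy:.2%}')
```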
Step 6 — Predicting
As a bonus step, I was curious to see how our network would cope with a single input and whether it would predict its state correctly. Sure enough, it did. I took an image it had never seen before, resized it to the correct dimensions, fed it in, and our network returned a 1, meaning it predicts that the patient has pneumonia. Correctamundo!
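A single-image prediction along those lines might look like this sketch; the file path is a placeholder, and the 0.5 threshold simply matches the sigmoid output described above:

```python
import numpy as np
from tensorflow.keras.preprocessing import image

# `model` is the trained network from the earlier sketches; the file path is a placeholder.
img = image.load_img('unseen_xray.jpeg', target_size=(150, 150))
img_array = image.img_to_array(img) / 255.0      # apply the same 0-1 rescaling as training
img_array = np.expand_dims(img_array, axis=0)    # add a batch dimension: (1, 150, 150, 3)

# The sigmoid output is a probability; above 0.5 we call it pneumonia.
prediction = model.predict(img_array)[0][0]
print('Pneumonia' if prediction > 0.5 else 'Normal')
```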
A Bright Future for Machine Learning
Projects like these aren’t meant to remain conceptual, however. They can be applied to actually save lives.
In the United Kingdom, GE Healthcare recently developed and introduced a new AI that rapidly analyzes chest X-ray scans and can detect and alert radiologists to up to 8 abnormalities, including pneumonia indicative of COVID-19, tuberculosis, fibrosis, and others.
Moreover, it showed a 34% reduction in reading time per case while simultaneously maintaining accuracy levels of 97–99%. Incredible!
I’m optimistic because this is only the beginning — the beginning of a world where clinicians can help patients regardless of location, cost, status, or any other preventative factor.