
Build a simple Neural Network for Breast Cancer Detection using Tensorflow.js

There's more and more research being done on detecting all types of cancer at an early stage, thus increasing the probability of survival, especially as artificial intelligence and machine learning know-how constantly evolves.

Since I've been passionate about machine learning for a while, I decided to bring my own contribution to this research and learn to train my own neural network detection model. The twist was to build it using Tensorflow with JavaScript, not with Python. We're also using React to manage the state and display the data we get back from the model.

The Beginning: Breast Cancer Dataset

For this tutorial, I chose to work with a breast cancer dataset. My aim was to create a neural network for breast cancer detection, starting from filtering the dataset and ending with delivering predictions. Along the way, we observe which features are most helpful in predicting the type of cancer, with the main goal being to classify whether the cancer is malignant or benign.

This dataset was also my first machine learning project, back in the final semester of 2014, when libraries like Tensorflow.js, PyTorch, or Keras didn't exist yet.

For a long time, I used only Matlab and R, with basic machine learning algorithms like Random Forest, Support Vector Machines, Decision Trees, and Naive Bayes, as far as I remember. During that time, I also started to understand the basics of machine learning and how strongly it can impact the medical field, especially since there are millions of real data points from medical centers, allowing us to create high-accuracy neural network models.

Breast cancer is one of the most common cancers among women worldwide. Early diagnosis of this type of cancer can improve the rate of survival significantly. The current method for detecting breast cancer is a mammogram, an X-ray of the breast tissue that is used for diagnosis.

While researching further, I discovered a very well-documented breast cancer project in Python using Keras, which helped me better understand the dataset and how to use it. This project can be found here.

All along, I was guided by Adriana Birlutiu, PhD and assistant professor in machine learning; her research in the field can be found here.

Dataset filtering and preparation

Since a neural network requires input data for training and making predictions, I used the dataset from the UCI Machine Learning Repository (here). It consists of data taken from patients with solid breast masses; it's basically a collection of biopsies and their given diagnoses.

Data attributes

In this dataset, there are 30 values in vector format taken from a single image. There are 30 attributes rather than 10 because each base feature is recorded three times: its MEAN, SE (standard error), and WORST value (e.g. mean radius, SE radius, worst radius).

The only unused attribute is the ID number, which we will of course remove from the dataset.

Main target: Diagnosis (M = malignant, B = benign); we will replace these values with M = 1 and B = 0.

Most important attributes for each cell:

  1. radius (mean of distances from the center to points on the perimeter);
  2. texture (standard deviation of gray-scale values);
  3. perimeter;
  4. area;
  5. smoothness (local variation in radius lengths);
  6. compactness (perimeter² / area − 1.0);
  7. concavity (severity of concave portions of the contour);
  8. concave points (number of concave portions of the contour);
  9. symmetry;
  10. fractal dimension (“coastline approximation” − 1).

Because we need to train our network model with supervised learning, we split the dataset into two vector variables, xTrain and yTrain.

We also need test data - xTest and yTest - to check the model, reduce bias in our predictions and measure accuracy. We will use the test data to provide an unbiased evaluation of the final model; this data is never seen by our model because we do not use it during the training process.

Let's picture the process: we create a new vector variable by splitting five samples from xTrain, then replace three of them with random values. Those three carry 'bad' values, so we can tell whether our network model predicts the right or wrong type of cancer on the test data.

The expected result is two correct predictions and three wrong ones on the test data. The two correct predictions should come back as malignant and benign.

Let’s define the variables:

  • xTrain = Most important attributes for each cell;
  • yTrain = Main target with values 1/0 (M=1, B=0);
  • xTest = five real samples from xTrain, two kept intact (one malignant, one benign) and three updated with random values;
  • yTest = five target vectors, two with the right values (malignant, benign) and three with intentionally wrong values (a quick sketch of this split follows below).
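
A minimal sketch of how such a split might look (the variable handling below is illustrative, not the project's actual code):

// Illustrative sketch: take five real samples from the training data as test data.
const xTest = data.xTrain.slice(0, 5).map(row => [...row]);
const yTest = data.yTrain.slice(0, 5).map(row => [...row]);

// Corrupt three of the five samples with random feature values and flip their labels,
// so we expect two correct and three wrong predictions.
for (let i = 2; i < 5; i++) {
    xTest[i] = xTest[i].map(() => Math.random());
    yTest[i] = [1 - yTest[i][0]];
}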

Network model architecture for Breast Cancer Detection

Network architecture requires a lot of attention and dedication if you want to optimize your training model, minimize error and maximize accuracy using the right algorithms and optimizers.

Usually, there is no one-size-fits-all architecture: you can build your neural network layers in countless ways, and each arrangement can perform differently.

Let's start at the beginning: at its core, we have a simple classifier that operates on the input data. Our network model is simply a network of layers connected by weights and biases; these weights determine how strongly one node activates another.

Generally, there are 3 different types of layers in a neural network:

  1. Input Layer: all the inputs are fed into the model through this layer
  2. Hidden Layers: a network can have more than one hidden layer; they process the inputs received from the input layer
  3. Output Layer: the final result after processing is made available at the output layer

Once the data is filtered for this model, we will have 4 fully connected (Dense) layers, using the Relu (Rectified Linear Unit) and Sigmoid activation functions:

  1. Input Layer with an input dimension of 30 (because each input sample has 30 values), 16 nodes, and Relu activation;
  2. Hidden Layer with 8 nodes and Relu activation;
  3. Hidden Layer with 8 nodes and Relu activation;
  4. Output Layer with 1 node and sigmoid activation, a natural choice because the target values are between 0 and 1.

Training model

I wanted to create something different and learn how to use Tensorflow with JavaScript, not Python.

Using Tensorflow.js to create and train your models is amazing, but there are pros and cons - one of them is training speed. In the browser it depends on WebGL, and on the backend it runs on the CPU (Node.js). In both cases, training can take a long time - even with little data - and there's the additional disadvantage that many machine learning libraries (especially those for data manipulation) don't have fully mature JavaScript equivalents yet.

Migrating from Python to JavaScript is a bit difficult because of syntax differences and the way the model code has to be translated.

Some of the functions below are shown only for explanation; you can find the full codebase of the project here.

1. Install Tensorflow.js using npm

npm install @tensorflow/tfjs

2. Import Tensorflow.js in your component

import * as tf from "@tensorflow/tfjs";

Before we start working with TensorFlow, we need to filter the data first, so we will create a constant variable called data (you can find the code for this function at my GitHub link).

const data = filterData(dataset);
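
filterData itself isn't shown in this post (it lives in the GitHub repo), but a minimal sketch of what it might do could look like this, assuming each row carries a diagnosis field and its 30 numeric features (the field names are hypothetical):

// Hypothetical sketch of filterData, not the actual implementation from the repo:
// drop the ID, map the diagnosis to 1/0 and keep the 30 numeric features.
function filterData(dataset) {
    const xTrain = dataset.map(row => row.features);                    // 30 numbers per sample
    const yTrain = dataset.map(row => [row.diagnosis === "M" ? 1 : 0]); // M = 1, B = 0
    return { xTrain, yTrain };
}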

Switching from Python to Tensorflow.js is a really different approach: this time, you have to define your tensors yourself, including the dimensions of your dataset (find more about this topic here).

const xTensorTrain = tf.tensor2d(data.xTrain, [
    data.xTrain.length,
    data.xTrain[0].length
]);
const yTensorTrain = tf.tensor2d(data.yTrain, [
    data.yTrain.length,
    data.yTrain[0].length
]);

You always have to convert your data arrays into tensors; depending on your dataset, that can be a 2D, 3D, or 4D tensor. If you are not familiar with tensors, you can learn more about the subject here.
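
For instance, a plain nested array becomes a tensor of the matching rank (the numbers below are just placeholders):

// Rank-2 tensor: 2 samples x 3 features each, shape [2, 3].
const t2d = tf.tensor2d([[1, 2, 3], [4, 5, 6]]);
// Rank-3 tensor: 2 tiny 2x2 "images", shape [2, 2, 2].
const t3d = tf.tensor3d([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]);
t2d.print();
t3d.print();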

3. Building the model

For this approach, we use the Keras-style Layers API to create and build layers, connect nodes, and define units, input dimensions, and activation functions for each layer.

const model = tf.sequential();


4. Epochs and batches  

An epoch is a hyper-parameter that defines the number of times the learning algorithm passes over the entire dataset. During one epoch, every training sample gets a chance to update the model's parameters.

A batch is a hyper-parameter that defines the number of samples to work through before updating the internal model parameters. Importantly, the batch size must be greater than (or equal to) one and less than (or equal to) the size of the dataset. For this dataset, I set batchSize = 1 (stochastic gradient descent) and at least 50 epochs to get good accuracy; with a batch size of 1, every epoch performs as many parameter updates as there are training samples.

const batchSize = parseInt(this.state.batchIterations) || 1;
const trainEpochs = parseInt(this.state.epochs) || 1;

5. Building layers

Now we build the neural network model: we define the layers and set the desired values inside them - the units and the activation functions. This lets you design your network model in practically unlimited ways.

Let’s start with the input layer.

The input layer is responsible for receiving the input data. We create a constant variable for each layer, and each layer has its own values and activation function.

Based on the dataset and its features, the input layer starts with an input dimension of 30, because each input sample has 30 values. Why do we use activation functions in our example? They determine the output of each node in the network; at the very end, the sigmoid output maps the result to a value between 0 and 1.

Activation functions can be linear or non-linear. Using the Relu activation, all negative values become zero immediately, which helps a lot with training progress. What are the units? Units are the number of neurons in each layer - simply the nodes that connect directly to the nodes of the next layer.
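
To make the Relu behaviour concrete, here is a tiny, illustrative example (the values are made up):

// Relu keeps positive values and turns every negative value into zero.
tf.tensor1d([-2, -0.5, 0, 1.5, 3]).relu().print();
// Output: [0, 0, 0, 1.5, 3]

With that in mind, here is the input layer: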

const inputLayer = tf.layers.dense({
    units: 16,
    inputDim: 30,
    activation: "relu"
});

What are the hidden layers and how do they work?

The hidden layer sits in between the input layer and the output layer; each of its neurons calculates the weighted sum of its inputs, adds a bias, and applies an activation function.

const hiddenLayer = tf.layers.dense({
    units: 8,
    activation: "relu"
});

At this point, we get to the output layer.

This layer is responsible for producing the final result: it takes the inputs passed on from the hidden layers, performs calculations via its neurons, and then computes the output.

We should always have exactly one output layer in a neural network architecture. This is where I get to the reason why I used a sigmoid function: to produce useful outputs we need the right activation, and the sigmoid is a logistic function that squashes any input into a value between 0.0 and 1.0 - large positive inputs come out close to 1.0, and large negative inputs come out close to 0.0.
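
For reference, the sigmoid is σ(x) = 1 / (1 + e⁻ˣ); here is a tiny illustration with made-up values:

// Large negative inputs approach 0, large positive inputs approach 1, and 0 maps to 0.5.
tf.tensor1d([-4, 0, 4]).sigmoid().print();
// Output: approximately [0.018, 0.5, 0.982]

The output layer itself is then defined as: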

const outputLayer = tf.layers.dense({
    units: 1,
    activation: "sigmoid"
});

Finally, we assemble the model from the layers with their activations, units, and input dimensions. We also need a second hidden layer with the same configuration as the first, and then we add all the layers to the model.

// Second hidden layer with the same configuration as the first
const hiddenLayer2 = tf.layers.dense({ units: 8, activation: "relu" });

model.add(inputLayer);
model.add(hiddenLayer);
model.add(hiddenLayer2);
model.add(outputLayer);

6. Compile the neural network model

Every neural network model is designed with optimizers, loss functions, and metrics.

What are optimizers and how do they work?

Optimizers help us minimize (or sometimes maximize) the loss function; most of them use the gradient of the loss, i.e. some form of gradient descent. The learning rate plays a big role here: it is the hyper-parameter that controls how big a step the optimizer takes at each update.

We will use rmsprop, an optimization algorithm that divides the gradient by a running average of its recent magnitude. It copes well with mini-batches and with gradients of very different magnitudes, which makes it a good fit for our training dataset.
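
If you want to control the learning rate explicitly, you can pass an optimizer instance instead of the string "rmsprop" when compiling the model; the learning rate below is only an illustrative value, not one taken from the original project:

// Create an RMSProp optimizer with an explicit learning rate (illustrative value).
const optimizer = tf.train.rmsprop(0.001);

The string form used later simply picks the optimizer's default settings.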

Loss function and meanSquaredError

One of the most important ingredients in training a neural network is the loss function.

Loss functions are sometimes called error or cost functions; they measure the average loss over the training dataset, and training aims to minimize them. The cost function is essentially the function that penalizes the model's output when it deviates from the target.

In other words, error functions show how far the result produced by the network is from the expected result, indicating the magnitude of the error the model made in its prediction.

Mean squared error

This method squares the differences between the actual and the predicted values and averages them, so the result of the function is always non-negative.
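
As a quick sanity check, here is mean squared error on a couple of made-up values:

// MSE of labels [1, 0] vs predictions [0.9, 0.2]:
// ((1 - 0.9)² + (0 - 0.2)²) / 2 = (0.01 + 0.04) / 2 = 0.025
tf.losses.meanSquaredError(
    tf.tensor1d([1, 0]),      // true labels
    tf.tensor1d([0.9, 0.2])   // predictions
).print(); // ~0.025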

Metrics - their role and how they work

The metric we track is accuracy: the ratio of correct predictions to the total number of samples. Adding it to our model gives us an accuracy value at the end of every epoch, which we can use for calculations or feed into charts as training progresses.
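
Conceptually, accuracy boils down to a ratio like the one in this plain-JavaScript sketch (not part of Tensorflow.js):

// Accuracy = correct predictions / total samples, after rounding each prediction to 0 or 1.
const accuracy = (predictions, labels) =>
    predictions.filter((p, i) => Math.round(p) === labels[i]).length / labels.length;

accuracy([0.9, 0.2, 0.7], [1, 0, 0]); // 2 / 3 ≈ 0.67

With the optimizer, loss, and metric chosen, we compile the model: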

model.compile({
    optimizer: "rmsprop",
    loss: "meanSquaredError",
    metrics: ["accuracy"]
});

7. We have our model compiled, it’s time for training

First, I created a variable called valAcc to keep the accuracy of the last epoch, so we can later compute the final validation accuracy and final test accuracy as percentages.

Let's start training. Now that the model is compiled, all we need to do is feed the filtered data into it as 2D tensors and define the epochs, with a callback to get logs for every epoch iteration.

Training can take up to 30 minutes, depending on your hardware, so just sit back and relax, grab a big cup of coffee and enjoy watching the training steps.

let valAcc;
await model
    .fit(xTensorTrain, yTensorTrain, {
        batchSize: batchSize,
        epochs: trainEpochs,
        callbacks: {
            onEpochEnd: async (epoch, logs) => {
                valAcc = logs.acc;
                this.addConsoleValue(
                    `Epoch =${epoch} Loss=${logs.loss}  Acc=${logs.acc}`
                );
            }
        }
    })
    .then(async () => {
        // model.evaluate returns [loss, metric]; index 1 holds the accuracy
        const testResult = model.evaluate(xTensorTrain, yTensorTrain);
        const testAccPercent = testResult[1].dataSync()[0] * 100;
        const finalValAccPercent = valAcc * 100;
        this.addConsoleValue(
            `Final validation accuracy: ${finalValAccPercent.toFixed(1)}%; ` +
            `Final test accuracy: ${testAccPercent.toFixed(1)}%`
        );
        await model.save("localstorage://breast-cancer-model");
        this.addConsoleValue("Model saved to localStorage");
    });

Good, our model has been trained. I decided to save the model to local storage in the browser because I don't want to re-run the training every time I need results.

We got the results in the console: good final accuracy on the training dataset and good test accuracy. When training accuracy is noticeably better than test accuracy, we are looking at what is called overfitting.

8. Prediction model

Let’s now predict our results.

First, we need the variables that have never been seen by our model (xTest and yTest): the five samples I split off while filtering the data from xTrain and yTrain.

The trained model was saved in localStorage, so we can easily load it every time we need to make predictions. I also added a basic comparison between yTest - the real labels that should be predicted - and the predicted result values, both handled as arrays.

Prediction values fall between 0.0 and 1.0, so 0.5 is the natural decision threshold: if the predicted value in an iteration is bigger than 0.5, we have a malignant result, otherwise the result is benign.

const xTestData = tf.tensor2d(xTest, [xTest.length, xTest[0].length]);
const yTestData = tf.tensor2d(yTest, [yTest.length, yTest[0].length]);
const modelStorage = await tf.loadLayersModel(
    "localstorage://breast-cancer-model"
);
tf.tidy(() => {
    const output = modelStorage.predict(xTestData);
    const predictions = output.dataSync();
    let correct = 0;
    let wrong = 0;
    let total = 0;
    const yTestValues = yTestData.dataSync();
    predictions.forEach((value, index) => {
        total++;
        if (yTestValues[index] === 1) {
            const fromTest = value >= 0.5 ? "Correct" : "Wrong";
            if (fromTest === "Correct") {
                correct++;
            } else {
                wrong++;
            }
            this.addConsoleValue(
                `[${index}] = Malignant, Dataset Test = ${fromTest}`
            );
        } else {
            const fromTest = value <= 0.5 ? "Correct" : "Wrong";
            if (fromTest === "Correct") {
                correct++;
            } else {
                wrong++;
            }
            this.addConsoleValue(
                `[${index}] = Benign, Dataset Test= ${fromTest} `
            );
        }
    });
    this.addConsoleValue(
        `Correct=${correct}  Wrong = ${wrong} Total=${total}`
    );
});

The first two values are the real samples I split from the training data at the beginning; those values were never seen by the model during training, so the predictions for them are genuinely good. The other three values come out wrong because I replaced them with random values, so we got exactly the results we expected.

Conclusion

To sum up, you can use Tensorflow.js to train on all kinds of datasets, but there are quite a few disadvantages - especially slow training, plus many features and libraries from Python that are still missing. For dataset manipulation in particular, I had to split some array values manually.

I wanted to use some of the Tensorflow.js features, like loading data from CSV, but I also wanted something really simple that doesn't require much time and effort and doesn't involve a server side for loading or saving files.

For me, building a breast cancer detection neural model was a challenge, but I'm glad I finished this project. I learned a lot about using Tensorflow.js purely on the front end and about the functions used to create models.

What's next for me is experimenting with different datasets, especially object detection using a webcam or a fruit dataset. I also plan to write about it in my next blog post, so I'd be glad to get any suggestions or feedback in the comments section below or via email here.


by Alex Donea

Software Engineer
  • Cluj-Napoca
