AI in fracture detection: A technical perspective to solve the problem

5 min read · Jan 26, 2021

Globally, fractures are a common reason for hospital admission, and missed or delayed diagnoses lead to further admissions. Health systems worldwide are under pressure from a shortage of resources (radiologists, devices and technical infrastructure), which affects the quality and outcomes of care delivery.

Over the last few decades, technology has progressively assisted medical diagnosis, and with the advent of artificial intelligence (AI) assistance in care delivery it could become the rainmaker. One prominent impact of AI could be the prioritisation of cases, separating critical from non-critical ones, across all healthcare settings and medical management.

Detecting and prioritising a fracture with AI assistance could not only ease the burden on radiologists and the health system but also enable clinicians to spend more time on critical cases where a slight delay could be fatal.

However, there are a few technical aspects to be addressed in developing an AI solution.

Architecture to detect bone-fracture

An AI-enabled system that can automatically detect, localise and prioritise the region of a bone fracture faces many challenges. Some key challenges are:

Challenge 1: Detecting fractures with computer vision is difficult because fractures come in many types, are often small, vary widely in shape and are hard to see.

Solution: The problem is centred on precision, so the algorithm must be configured for two tasks. The first is to decide whether a fracture is present in the given image. The second, if a fracture is present, is to locate it with a bounding box.
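
The two tasks can be read off a detector's output in a few lines. This is only a minimal sketch, assuming the detector returns a list of boxes with confidence scores; the `Detection` type, the `triage` function and the 0.5 score threshold are hypothetical names and values chosen for illustration, not part of the system described in this article.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Detection:
    box: Box
    score: float  # detector confidence in [0, 1]


def triage(detections: List[Detection],
           score_threshold: float = 0.5) -> Tuple[bool, List[Box]]:
    """Task 1: is a fracture present? Task 2: where is it?

    The threshold is an assumed value; a real system would tune it
    on validation data.
    """
    kept = [d for d in detections if d.score >= score_threshold]
    kept.sort(key=lambda d: d.score, reverse=True)  # most confident first
    return len(kept) > 0, [d.box for d in kept]


# Hypothetical detector output for one X-ray image.
example = [
    Detection((120.0, 80.0, 160.0, 130.0), 0.91),
    Detection((300.0, 210.0, 320.0, 240.0), 0.22),
]
present, boxes = triage(example)
print(present, boxes)
```

Keeping the presence decision separate from the box list makes it easy to use the first for prioritising cases and the second for showing the clinician where to look.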

Challenge 2: The Region Proposal Network (RPN) has to handle objects of very different scales in real-life scenarios.

Solution: This can be tackled by training a set of RPNs for different scales. Each RPN takes a different convolution layer, or set of layers, as input, so its receptive field is of a different size. This significantly improves the detection of both small and large objects.
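
To see why tapping different layers gives different receptive fields, the receptive field of a stack of convolution and pooling layers can be computed with the usual recurrence. The sketch below is illustrative only: the toy backbone (3x3 convolutions, each followed by a 2x2 stride-2 pooling layer) is an assumption and does not describe the network used in this article.

```python
from typing import List, Tuple


def receptive_field(layers: List[Tuple[int, int]]) -> int:
    """Receptive field (in input pixels) of one output unit.

    `layers` lists (kernel_size, stride) from input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical toy backbone block: 3x3 conv (stride 1) + 2x2 pool (stride 2).
block = [(3, 1), (2, 2)]

shallow = receptive_field(block * 2)  # RPN attached after 2 blocks
deep = receptive_field(block * 4)     # RPN attached after 4 blocks
print(shallow, deep)  # 10 46
```

An RPN attached to the shallow layer sees a 10-pixel window and suits small objects, while one attached to the deeper layer sees a 46-pixel window and suits larger ones.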

Challenge 3: Which architecture can serve the best job?

AI is a science where persistence and continuous experimentation give results.

- Single Shot MultiBox Detector (SSD) is fastest with a MobileNet base model, but it works on low-resolution input images. As a result, SSD-based algorithms struggle to detect small objects such as bone fractures.

- The most accurate model is Faster R-CNN with the more complex Inception ResNet-based architecture and 300 proposals per image.

- The sweet spot, where precision and speed are balanced, is Faster R-CNN with a ResNet architecture and 100 proposals. Because it can process higher-resolution images, it is easier for the algorithm to find abnormalities, which matters when the fractures are small.

- The ResNet module is also beneficial for deep feature extraction: its residual connections let the number of layers grow without the loss of accuracy that very deep plain networks suffer.
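
The effect of input resolution on small fractures can be made concrete with a little arithmetic. The numbers below are assumptions chosen for illustration (a 40-pixel fracture on a 2000-pixel-wide radiograph, a feature-map stride of 16, and input widths of 300 and 1000 pixels); they are not measurements from the system described here.

```python
def size_on_feature_map(object_px: float, image_px: float,
                        input_px: float, stride: int) -> float:
    """Width of an object, in feature-map cells, after the image is
    resized to `input_px` and downsampled by the backbone `stride`."""
    resized = object_px * input_px / image_px
    return resized / stride


# Assumed example: 40 px fracture on a 2000 px wide X-ray, stride 16.
low_res = size_on_feature_map(40, 2000, input_px=300, stride=16)
high_res = size_on_feature_map(40, 2000, input_px=1000, stride=16)
print(low_res, high_res)  # 0.375 1.25
```

Under these assumptions the fracture covers less than half a feature-map cell at the low resolution but more than one cell at the higher resolution, which is one way to see why a detector that accepts higher-resolution input has an easier job with small fractures.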

Architectural Workflow — fracture detection system

A computer vision algorithm based on Faster R-CNN with a ResNet backbone, built with the TensorFlow and Keras frameworks, is used to classify and localise the fractured area within the X-ray image. Several base architectures were tried in the experimental phase, including VGG, MobileNet, Inception v2 and Inception v3, before Inception ResNet v2 was chosen as the base architecture for Faster R-CNN because it had the highest mean Average Precision (mAP) among the architectures tried. The final architecture can detect even minor abnormalities in the processed image, which indicates that a deep neural network can provide an effective way to identify fractures in X-ray images precisely.

It is a state-of-the-art detection network that consists of three main parts. The first part is a classification network used as the backbone to generate the feature map (a deep convolutional neural network, Inception ResNet v2, is used here as the base architecture). The second part is a Region Proposal Network (RPN), which generates region proposals. The third is a regressor, which determines the precise location of each object and its class. In this architecture, fine localisation information is encoded in the channels of the convolutional feature response. The RPN slides a small window over the feature map and applies a small network that classifies each position as a fractured pattern (object) or non-fractured and regresses bounding box locations. The position of the sliding window provides coarse localisation with reference to the image, and box regression provides finer localisation with reference to the sliding window. At each sliding-window position a set of k object proposals is defined, each with a different size and aspect ratio. These reference proposals are called anchors, and they improve the handling of objects of different sizes and aspect ratios. During training, the RPN treats an anchor as positive if its intersection over union (IoU) with a ground truth box is larger than 0.7, or if it has the highest IoU with a ground truth box.
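
The anchor mechanism can be sketched as follows. This is a simplified illustration, not the implementation used in this article: the scales, aspect ratios and example boxes are assumed values, the helper names are hypothetical, and every anchor that is not positive is simply labelled 0, whereas the original Faster R-CNN training scheme also applies a lower IoU threshold and ignores anchors that fall in between.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def make_anchors(cx: float, cy: float, scales: Sequence[float],
                 ratios: Sequence[float]) -> List[Box]:
    """k = len(scales) * len(ratios) anchors centred on one window position."""
    anchors = []
    for s in scales:
        for r in ratios:  # r = width / height, area stays close to s * s
            w, h = s * math.sqrt(r), s / math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def label_anchors(anchors: List[Box], gt: Box,
                  pos_thresh: float = 0.7) -> List[int]:
    """1 if IoU > pos_thresh or the anchor has the highest IoU, else 0."""
    ious = [iou(a, gt) for a in anchors]
    best = max(range(len(anchors)), key=lambda i: ious[i])
    return [1 if (v > pos_thresh or i == best) else 0
            for i, v in enumerate(ious)]


# Assumed example: 3 scales x 3 ratios = 9 anchors at position (100, 100),
# and one ground truth fracture box of 60 x 60 pixels.
anchors = make_anchors(100, 100, scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0))
ground_truth = (70.0, 70.0, 130.0, 130.0)
labels = label_anchors(anchors, ground_truth)
print(labels)
```

In this example only the square anchor of scale 64 overlaps the ground truth box by more than 0.7, so it is the single positive anchor.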

About the loss function: Faster R-CNN is optimised with a multi-task loss function that combines the losses of classification and bounding box regression. The regression output gives the predicted bounding box, and the regression loss measures its offset from the true bounding box. The RPN, including box regression with reference to the anchors, is trained with backpropagation and stochastic gradient descent (SGD); new layers are initialised from zero-mean Gaussian distributions and all other layers are initialised from a network pre-trained for classification.

$$L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \, \frac{1}{N_{reg}} \sum_i p_i^* \, L_{reg}(t_i, t_i^*)$$

The index i runs over the anchors in a mini-batch, and p_i is the predicted probability that anchor i is an object. The ground truth label p_i* is 1 if the anchor is positive and 0 if it is negative. t_i is a vector of the four parameterised coordinates of the predicted bounding box, and t_i* is the corresponding ground truth vector for a positive anchor. L_cls is the classification log loss over two classes (object vs. no object), and L_reg is the regression loss, which the factor p_i* activates only for positive anchors. In the standard Faster R-CNN formulation, the two sums are normalised by N_cls and N_reg, and λ balances the two terms.
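
The loss can be evaluated by hand for a tiny mini-batch. The sketch below is illustrative only: L_reg is taken to be the smooth L1 loss, as in the original Faster R-CNN paper (this article does not state which regression loss was used), the normalisers N_cls and N_reg default to the mini-batch size as a simplifying assumption, and the function names and example values are hypothetical.

```python
import math
from typing import Optional, Sequence


def log_loss(p: float, p_star: int, eps: float = 1e-12) -> float:
    """L_cls: binary log loss for one anchor (object vs. no object)."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -math.log(p) if p_star == 1 else -math.log(1.0 - p)


def smooth_l1(t: Sequence[float], t_star: Sequence[float]) -> float:
    """L_reg (assumed): smooth L1 summed over the four box coordinates."""
    total = 0.0
    for a, b in zip(t, t_star):
        d = abs(a - b)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def rpn_loss(p, p_star, t, t_star, lam: float = 1.0,
             n_cls: Optional[int] = None,
             n_reg: Optional[int] = None) -> float:
    """Multi-task loss: classification term plus lambda times the
    regression term, where p_star switches regression off for negatives."""
    n_cls = n_cls or len(p)  # assumption: normalise by mini-batch size
    n_reg = n_reg or len(p)  # assumption: same normaliser for simplicity
    cls = sum(log_loss(pi, ps) for pi, ps in zip(p, p_star)) / n_cls
    reg = sum(ps * smooth_l1(ti, ts)
              for ps, ti, ts in zip(p_star, t, t_star)) / n_reg
    return cls + lam * reg


# Hypothetical mini-batch of two anchors: one positive, one negative.
p = [0.9, 0.2]
p_star = [1, 0]
t = [(0.1, 0.0, 0.0, 0.0), (5.0, 5.0, 5.0, 5.0)]
t_star = [(0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)]
loss = rpn_loss(p, p_star, t, t_star)
print(round(loss, 4))
```

The second anchor has a large box offset, but because it is a negative anchor (p_i* = 0) it contributes nothing to the regression term.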

What lies ahead

By 2030, AI will change the way health systems across the planet operate and deliver care to patients, covering all specialities. Technology will assist health systems to focus on “predictive, preventive and curative” care with better outcomes.

[A write-up by iDoc.ai (https://idoc.ai/) AI researcher — Sunny Dhankar ]

Written by Rahul Tarar