Frédérique Clément (INRIA), Raphäel Corre (CNRS), Céline Guigon (INSERM), François Caud, Benjamin Habert, Alexandre Gramfort (DATAIA, Univ. Paris-Saclay)
This challenge was done with the support of DATAIA, in collaboration with INRIA, CNRS, INSERM and INRAE.
The goal of this challenge is to automatically detect and classify ovarian follicles on histological slices of mammalian ovaries.
The ovary is a unique instance of a dynamic, permanently remodeling endocrine organ in adulthood. The ovarian function is supported by spheroid, multilayered, and multiphasic structures, the ovarian follicles, which shelter the oocyte (female germ cell) and secrete a variety of hormones and growth factors. The ovary is endowed with a pool of follicles established early in life, which gets progressively exhausted through follicle development or death. Understanding the dynamics of ovarian follicle populations is critical to characterize the reproductive physiological status of females, from birth (or even prenatal life) to reproductive senescence.
Accurate estimation of the number of ovarian follicles at various stages of development is of key importance in the field of reproductive biology, for basic research, pharmacological and toxicological studies, as well as clinical management of fertility. The associated societal challenges relate to physiological ovarian aging (decrease in fertility with age, menopause), pathological aging (premature ovarian failure) and toxic-induced aging (endocrine disruptors, anticancer treatments).
In vivo, only the terminal stages of follicles, hence the tip of the iceberg, can be monitored through ultrasonographic imaging. To detect all follicles, invasive approaches relying on histology are needed: ovaries are fixed, serially sectioned, stained with proper dyes, and manually analyzed by light microscopy. Such counting is a complex, tedious, operator-dependent and, above all, very time-consuming procedure. To save time, only some slices sampled from a whole ovary are examined, which adds to the experimental noise and further degrades the reliability of the measurements.
Experimentalists expect a lot from improvements to the classical counting procedure, and deep-learning-based approaches to follicle counting could bring a considerable breakthrough to the field of reproductive biology.
We will distinguish here 4 categories of follicles, from smallest to largest: Primordial, Primary, Secondary, Tertiary. One difficulty lies in the great disparity of size between follicles. Another is that most pre-trained classifiers are trained on everyday objects, not biological tissues.
The data consist of 34 images of histological sections taken from 6 mouse ovaries: 29 sections in the train set and 5 sections in the test set. Each section has been annotated with ground-truth follicle locations and categories. Bounding box coordinates and class labels are stored in a csv file named labels.csv, one for the train set and one for the test set. A Negative class has also been created, with bounding boxes of various sizes at locations where there is no positive example of a follicle from the 4 retained categories.
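For reference, once the download script below has been run, the data folder should look roughly like this (a sketch; the exact file list is shown for illustration):

data/
    train/
        labels.csv
        D-1M01-2.jpg ... D-1M05-6.jpg   (29 section images)
    test/
        labels.csv
        D-1M06-1.jpg ... D-1M06-5.jpg   (5 section images)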
# These modules and libraries must be imported to properly run the notebook
import os
import numpy as np
import pandas as pd
from PIL import Image
import matplotlib.pyplot as plt
import tensorflow as tf
import sklearn
from operator import itemgetter
Image.MAX_IMAGE_PIXELS = None  # disable PIL's decompression-bomb check: the section images are very large
First, uncomment the following line to download the data using the download_data.py script. It will create a data folder containing the train and test data.
#!python download_data.py
To get a feel for what the data look like, let's visualize an image of a section and the corresponding annotations.
First we need to be able to read and extract bounding box coordinates and class names from the csv file.
# Display labels.csv as a pandas dataframe for train set
train_labels = pd.read_csv(os.path.join("data", "train", "labels.csv"))
train_labels.head(10)
| | Unnamed: 0 | filename | width | height | class | label | xmin | ymin | xmax | ymax |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | D-1M01-2.jpg | 2194 | 2112 | Primary | 2 | 1512 | 1187 | 1812 | 1373 |
| 1 | 1 | D-1M01-2.jpg | 2194 | 2112 | Primordial | 1 | 202 | 1095 | 270 | 1160 |
| 2 | 2 | D-1M01-2.jpg | 2194 | 2112 | Secondary | 3 | 362 | 1058 | 737 | 1404 |
| 3 | 3 | D-1M01-2.jpg | 2194 | 2112 | Secondary | 3 | 535 | 275 | 888 | 609 |
| 4 | 4 | D-1M01-2.jpg | 2194 | 2112 | Secondary | 3 | 977 | 307 | 1491 | 843 |
| 5 | 5 | D-1M01-3.jpg | 2935 | 2852 | Negative | 0 | 430 | 630 | 830 | 1046 |
| 6 | 6 | D-1M01-3.jpg | 2935 | 2852 | Negative | 0 | 455 | 1549 | 867 | 2002 |
| 7 | 7 | D-1M01-3.jpg | 2935 | 2852 | Negative | 0 | 640 | 519 | 785 | 600 |
| 8 | 8 | D-1M01-3.jpg | 2935 | 2852 | Negative | 0 | 1203 | 2599 | 1306 | 2677 |
| 9 | 9 | D-1M01-3.jpg | 2935 | 2852 | Negative | 0 | 1441 | 2693 | 1534 | 2746 |
This function extracts box coordinates and class names for the true locations (ground-truth annotations):
def load_true_locations(path, image_filename):
"""Return list of {bbox, class, proba}."""
df = pd.read_csv(path)
df = df.loc[df["filename"] == image_filename]
locations = []
for _, row in df.iterrows():
loc_dict = {
"bbox": (row['xmin'], row['ymin'], row['xmax'], row['ymax']),
"class": row["class"],
"proba": 1
}
locations.append(loc_dict)
return locations
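For example, assuming the data has been downloaded, we can read back the first annotation of the first train image (the output matches the first row of labels.csv shown above):

locations = load_true_locations(os.path.join("data", "train", "labels.csv"), "D-1M01-2.jpg")
print(locations[0])
# {'bbox': (1512, 1187, 1812, 1373), 'class': 'Primary', 'proba': 1}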
Here is a function that displays the image and the bounding boxes:
def display_section_and_locations(section, locations):
linewidth=3
class_to_color = {
"Negative": "grey",
"Primordial": "blue",
"Primary": "red",
"Secondary": "green",
"Tertiary": "purple"
}
plt.figure(figsize=(20, 20))
plt.axis("off")
plt.imshow(section)
ax = plt.gca()
for location in locations:
box = location["bbox"]
class_ = location["class"]
proba = location["proba"]
color = class_to_color[class_]
text = "{}: {:.2f}".format(class_, proba)
linestyle = "-" if proba == 1 else "--"
x1, y1, x2, y2 = box
w, h = x2 - x1, y2 - y1
patch = plt.Rectangle(
[x1, y1], w, h, fill=False, edgecolor=color, linewidth=linewidth, linestyle=linestyle
)
ax.add_patch(patch)
ax.text(
x1,
y1,
text,
bbox={"facecolor": color, "alpha": 1},
clip_box=ax.clipbox,
clip_on=True,
)
plt.show()
this_folder = os.path.abspath("")
DATA_FOLDER = os.path.join(this_folder, "data")
MODELS_FOLDER = os.path.join(this_folder, "models")
IMAGE_TO_ANALYSE = 'D-1M02-2.jpg'
CLASSES = ['Negative', 'Primordial', 'Primary', 'Secondary', 'Tertiary']
CLASS_TO_INDEX = {
"Negative": 0,
"Primordial": 1,
"Primary": 2,
"Secondary": 3,
"Tertiary": 4,
}
INDEX_TO_CLASS = {value: key for key, value in CLASS_TO_INDEX.items()}
#section = Image.open(os.path.join(DATA_FOLDER, "train", IMAGE_TO_ANALYSE))
section = plt.imread(os.path.join(DATA_FOLDER, "train", IMAGE_TO_ANALYSE))
true_locations = load_true_locations(os.path.join(DATA_FOLDER, "train", 'labels.csv'), IMAGE_TO_ANALYSE)
Let's visualize one section from the training set with all the annotated true boxes:
display_section_and_locations(section, true_locations)
del(section) # to free some memory space
Feel free to change IMAGE_TO_ANALYSE to display other examples.
Now, what is the size distribution of all the ground-truth bounding boxes in the train set? First we make a list of all these locations, then we count boxes by class and visualize histograms of bounding-box width per class.
def all_true_locations(path):
labels = pd.read_csv(path)
all_locations = []
for _, row in labels.iterrows():
loc_dict = {
"bbox": (row['xmin'], row['ymin'], row['xmax'], row['ymax']),
"class": row["class"],
"proba": 1
}
all_locations.append(loc_dict)
return all_locations
# List of all locations in train set
all_locations_train = all_true_locations(os.path.join(DATA_FOLDER, "train", 'labels.csv'))
len(all_locations_train)
652
# Count of boxes per class
print('Number of boxes per class in the train set:')
for class_ in CLASSES:
box_count = [1 for location in all_locations_train if location['class']==class_]
print(f'{class_}: {len(box_count)}')
Number of boxes per class in the train set:
Negative: 328
Primordial: 108
Primary: 75
Secondary: 97
Tertiary: 44
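Note the strong class imbalance above (328 Negative boxes vs. only 44 Tertiary). The baseline below ignores it, but a simple mitigation would be to weight classes inversely to their frequency when fitting the classifier later on. A minimal sketch, assuming the standard "balanced" weighting (not used by the baseline):

# Sketch: class weights inversely proportional to class frequency
counts = train_labels["class"].value_counts()
class_weight = {
    CLASS_TO_INDEX[name]: len(train_labels) / (len(CLASS_TO_INDEX) * count)
    for name, count in counts.items()
}
# later: model.fit(X_for_classifier, y_for_classifier, epochs=100, class_weight=class_weight)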
# Histograms of bboxes width
plt.figure(figsize=(15,15))
for i, class_ in enumerate(CLASSES):
width_list = [location['bbox'][2]-location['bbox'][0]
for location in all_locations_train if location['class']==class_
]
ax = plt.subplot(5, 1, i+1)
plt.hist(width_list)
    plt.title(f'Width of boxes for {class_}')
plt.tight_layout()
plt.show()
We clearly see that the annotated follicles are ordered by size: Primordial < Primary < Secondary < Tertiary.
The strategy chosen for the baseline algorithm is random window cropping followed by multiclass classification of each extracted window. First, let's build and train the classifier. We take a model pretrained on ImageNet and freeze its weights, then train only the last fully-connected layer on the training data.
# Image shape for classifier
IMG_SHAPE = (224, 224, 3)
# Building of a classification model
base_model = tf.keras.applications.MobileNetV2(
input_shape=IMG_SHAPE, include_top=False, weights="imagenet"
)
base_model.trainable = False
inputs = tf.keras.Input(shape=IMG_SHAPE)
preprocess_input = tf.keras.applications.mobilenet_v2.preprocess_input
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
prediction_layer = tf.keras.layers.Dense(5, activation="softmax")
x = preprocess_input(inputs)
x = base_model(x, training=False)
x = global_average_layer(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = prediction_layer(x)
model = tf.keras.Model(inputs, outputs)
model.compile(
optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
loss=tf.keras.losses.SparseCategoricalCrossentropy(),
metrics=["sparse_categorical_accuracy"],
)
model.summary()
Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_4 (InputLayer)         [(None, 224, 224, 3)]     0
_________________________________________________________________
tf.math.truediv_1 (TFOpLambd (None, 224, 224, 3)       0
_________________________________________________________________
tf.math.subtract_1 (TFOpLamb (None, 224, 224, 3)       0
_________________________________________________________________
mobilenetv2_1.00_224 (Functi (None, 7, 7, 1280)        2257984
_________________________________________________________________
global_average_pooling2d_1 ( (None, 1280)              0
_________________________________________________________________
dropout_1 (Dropout)          (None, 1280)              0
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 6405
=================================================================
Total params: 2,264,389
Trainable params: 6,405
Non-trainable params: 2,257,984
_________________________________________________________________
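To go beyond this frozen-backbone baseline, a common refinement is to unfreeze the top of the base model after the initial training and fine-tune it with a much lower learning rate. A minimal sketch (the number of unfrozen layers is an arbitrary choice, not part of the baseline):

# Sketch: optional fine-tuning after the initial training
base_model.trainable = True
for layer in base_model.layers[:-20]:  # keep all but the last ~20 layers frozen
    layer.trainable = False
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),  # low LR to preserve pretrained weights
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=["sparse_categorical_accuracy"],
)
# model.fit(X_for_classifier, y_for_classifier, epochs=10)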
# Our model takes as input a tensor (M, 224, 224, 3) and,
# for each image, a class encoded as an integer.
# Consequently we need to build these images from the files on disk.
X_image_paths = [os.path.join(DATA_FOLDER, "train", filename)
for filename in train_labels["filename"].unique()
]
y_true_locations = [load_true_locations(os.path.join(DATA_FOLDER, "train", 'labels.csv'), filename)
for filename in train_labels["filename"].unique()
]
thumbnails = []
expected_predictions = []
for filepath, locations in zip(X_image_paths, y_true_locations):
print(f"reading {filepath}")
image = Image.open(filepath)
for loc in locations:
class_, bbox = loc["class"], loc["bbox"]
prediction = CLASS_TO_INDEX[class_]
expected_predictions.append(prediction)
thumbnail = image.crop(bbox)
thumbnail = thumbnail.resize((224, 224))
thumbnail = np.asarray(thumbnail)
thumbnails.append(thumbnail)
X_for_classifier = np.array(thumbnails)
y_for_classifier = np.array(expected_predictions)
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M01-2.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M01-3.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M01-4.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M01-5.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M01-6.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M02-1.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M02-2.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M02-3.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M02-4.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M02-5.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M02-6.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M03-1.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M03-2.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M03-3.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M03-4.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M03-5.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M03-6.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M04-1.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M04-2.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M04-3.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M04-4.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M04-5.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M04-6.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M05-1.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M05-2.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M05-3.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M05-4.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M05-5.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/train/D-1M05-6.jpg
X_for_classifier.shape
(652, 224, 224, 3)
# Uncomment this line to fit the classifier
#model.fit(X_for_classifier, y_for_classifier, epochs=100)
# Uncomment this cell to Load a previously saved model
#MODEL_NAME_FOR_PREDICTION = "classifier3"
#model = tf.keras.models.load_model(os.path.join(MODELS_FOLDER, MODEL_NAME_FOR_PREDICTION))
def predict_image(image, model):
    """Return (predicted class index, probability) for a single image crop."""
    image = Image.fromarray(image)
    image = image.resize((224, 224))  # the classifier expects 224x224 inputs
    image = np.array(image)
    image = tf.reshape(image, (1, 224, 224, 3))  # add a batch dimension
    # no manual preprocessing needed: mobilenet_v2.preprocess_input is part of the model graph
    pred = model.predict(image)
    return np.argmax(pred), np.max(pred)
We load a test image and visualize it with its ground-truth annotations:
IMAGE_TO_ANALYSE = 'D-1M06-1.jpg'
section = plt.imread(os.path.join(DATA_FOLDER, 'test', IMAGE_TO_ANALYSE))
true_locations = load_true_locations(os.path.join(DATA_FOLDER, "test", 'labels.csv'), IMAGE_TO_ANALYSE)
display_section_and_locations(section, true_locations)
Let's make some predictions on cropped images:
window = section[4350:5350, 3625:4625]
plt.imshow(window)
pred = predict_image(window, model)
pred
(4, 0.9675914)
The classifier predicts, with 97% confidence, that this follicle is Tertiary (index 4).
window = section[250:950, 4750:5400]
plt.imshow(window)
pred = predict_image(window, model)
pred
(0, 0.8082304)
Here it predicted the Negative class (index 0) with 81% confidence, instead of a Secondary follicle.
window = section[1000:2000, 1500:2500]
plt.imshow(window)
pred = predict_image(window, model)
pred
(4, 0.9458151)
Finally, it predicted a Tertiary follicle (index 4) with 95% confidence.
del(section)
We will load all test data to evaluate the classifier.
test_labels = pd.read_csv(os.path.join("data", "test", "labels.csv"))
test_labels.head(10)
| | Unnamed: 0 | filename | width | height | class | label | xmin | ymin | xmax | ymax |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | D-1M06-1.jpg | 8227 | 5540 | Negative | 0 | 130 | 2642 | 1921 | 4493 |
| 1 | 1 | D-1M06-1.jpg | 8227 | 5540 | Negative | 0 | 1954 | 3272 | 3913 | 5259 |
| 2 | 2 | D-1M06-1.jpg | 8227 | 5540 | Negative | 0 | 2472 | 324 | 4085 | 1742 |
| 3 | 3 | D-1M06-1.jpg | 8227 | 5540 | Negative | 0 | 5167 | 3613 | 6553 | 4955 |
| 4 | 4 | D-1M06-1.jpg | 8227 | 5540 | Negative | 0 | 6480 | 2841 | 7706 | 3820 |
| 5 | 5 | D-1M06-1.jpg | 8227 | 5540 | Primordial | 1 | 542 | 2594 | 609 | 2711 |
| 6 | 6 | D-1M06-1.jpg | 8227 | 5540 | Primordial | 1 | 1282 | 1771 | 1338 | 1877 |
| 7 | 7 | D-1M06-1.jpg | 8227 | 5540 | Primordial | 1 | 1612 | 4489 | 1708 | 4546 |
| 8 | 8 | D-1M06-1.jpg | 8227 | 5540 | Primordial | 1 | 3770 | 176 | 3882 | 252 |
| 9 | 9 | D-1M06-1.jpg | 8227 | 5540 | Primordial | 1 | 3896 | 5261 | 3985 | 5300 |
Xtest_image_paths = [os.path.join(DATA_FOLDER, "test", filename)
for filename in test_labels["filename"].unique()
]
ytest_true_locations = [load_true_locations(os.path.join(DATA_FOLDER, "test", 'labels.csv'), filename)
for filename in test_labels["filename"].unique()
]
thumbnails = []
expected_predictions = []
for filepath, locations in zip(Xtest_image_paths, ytest_true_locations):
print(f"reading {filepath}")
image = Image.open(filepath)
for loc in locations:
class_, bbox = loc["class"], loc["bbox"]
prediction = CLASS_TO_INDEX[class_]
expected_predictions.append(prediction)
thumbnail = image.crop(bbox)
thumbnail = thumbnail.resize((224, 224))
thumbnail = np.asarray(thumbnail)
thumbnails.append(thumbnail)
Xtest_for_classifier = np.array(thumbnails)
ytest_for_classifier = np.array(expected_predictions)
reading /home/frcaud/ramp-kits/follicles_detection/data/test/D-1M06-1.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/test/D-1M06-2.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/test/D-1M06-3.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/test/D-1M06-4.jpg
reading /home/frcaud/ramp-kits/follicles_detection/data/test/D-1M06-5.jpg
We use model.evaluate on the test data to evaluate our model:
loss, accuracy = model.evaluate(Xtest_for_classifier, ytest_for_classifier)
print('Test accuracy :', accuracy)
3/3 [==============================] - 1s 172ms/step - loss: 0.5264 - sparse_categorical_accuracy: 0.7857
Test accuracy : 0.7857142686843872
We can save the model with this line:
# save model
#model.save(os.path.join(MODELS_FOLDER, "classifier4"))
Here we make predictions on the whole test set:
preds = model.predict(Xtest_for_classifier)
preds = np.argmax(preds, axis=1)
preds
array([0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 2, 0, 1, 3, 3, 0, 0, 0, 0, 0, 0, 0, 2, 1, 1, 2, 1, 1, 1, 1, 1, 4, 0, 0, 2, 0, 0, 4, 1, 3, 3, 0, 0, 0, 0, 0, 1, 0, 0, 1, 3, 3, 3])
ytest_for_classifier
array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 3, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 1, 1, 1, 3, 3, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 1, 1, 1, 1, 1, 3, 0, 0, 0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 0, 2, 1, 1, 1, 1, 3, 3, 3])
Confusion matrix:
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
con_mat = confusion_matrix(ytest_for_classifier, preds)
disp = ConfusionMatrixDisplay(confusion_matrix=con_mat)
disp.plot()
plt.show()
Classification report:
from sklearn.metrics import classification_report
print(classification_report(ytest_for_classifier, preds, target_names=["Negative", "Primordial", "Primary", "Secondary", "Tertiary"]))
              precision    recall  f1-score   support

    Negative       0.81      0.94      0.87        32
  Primordial       0.81      0.68      0.74        19
     Primary       0.60      0.43      0.50         7
   Secondary       0.86      0.75      0.80         8
    Tertiary       0.60      0.75      0.67         4

    accuracy                           0.79        70
   macro avg       0.74      0.71      0.72        70
weighted avg       0.78      0.79      0.78        70
The strategy is to generate random windows on the test images and pass them through the classifier. We can choose several mean window sizes, and each window's size is drawn from a normal distribution around its mean.
First we create a list of random bounding boxes of certain sizes: one bbox = (xmin, ymin, xmax, ymax).
def generate_random_windows_for_image(image, window_sizes, num_windows):
"""create list of bounding boxes of varying sizes
Parameters
----------
image : np.array
window_sizes : list of int
        example: [200, 1000, 2000]
        sizes of windows to use
    num_windows : list of int
        example: [1000, 100, 100]
        how many boxes of each window_size should be created
"""
assert len(window_sizes) == len(num_windows)
image_height, image_width, _ = image.shape
all_boxes = []
for size, n_boxes in zip(window_sizes, num_windows):
mean = size
std = 0.15 * size
for _ in range(n_boxes):
width = np.random.normal(mean, std)
x1 = np.random.randint(0, image_width)
y1 = np.random.randint(0, image_height)
bbox = (x1, y1, x1 + width, y1 + width)
all_boxes.append(bbox)
return all_boxes
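Note that windows drawn near the right or bottom edge can extend beyond the image. This is harmless here because tf.image.crop_and_resize (used below) fills out-of-range pixels with the extrapolation value, but if you prefer boxes that stay inside the image, a minimal clipping helper could look like this (a sketch, not used by the baseline):

def clip_boxes_to_image(boxes, image_width, image_height):
    """Clip (xmin, ymin, xmax, ymax) boxes to the image bounds."""
    return [
        (max(0, x1), max(0, y1), min(image_width, x2), min(image_height, y2))
        for x1, y1, x2, y2 in boxes
    ]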
Then we build a tensor of cropped images, all 224x224x3 in size:
def build_cropped_images(image, boxes, crop_size):
"""Crop subimages in large image and resize them to a single size.
Parameters
----------
image : np.array of shape (height, width, depth)
boxes : list of tuple
each element in the list is (xmin, ymin, xmax, ymax)
crop_size : tuple(2)
size of the returned cropped images
ex: (224, 224)
Returns
-------
cropped_images : tensor
example shape (N_boxes, 224, 224, 3)
"""
height, width, _ = image.shape
images_tensor = [tf.convert_to_tensor(image)]
boxes_for_tf = [
(y1 / height, x1 / width, y2 / height, x2 / width) for x1, y1, x2, y2 in boxes
]
box_indices = [0] * len(boxes_for_tf)
cropped_images = tf.image.crop_and_resize(
images_tensor,
boxes_for_tf,
box_indices,
crop_size,
method="nearest",
extrapolation_value=0,
name=None,
)
return cropped_images
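A quick shape check with a dummy image (toy values):

dummy_image = np.zeros((1000, 800, 3), dtype=np.uint8)  # (height, width, depth)
boxes = [(0, 0, 224, 224), (100, 100, 500, 500)]
crops = build_cropped_images(dummy_image, boxes, crop_size=(224, 224))
print(crops.shape)  # (2, 224, 224, 3)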
This function creates a list of locations {"class": ..., "proba": ..., "bbox": ...} from the probabilities and boxes obtained at prediction time:
def convert_probas_to_locations(probas, boxes):
"""
create list of locations: list of dict
location: dict(class, proba, bbox)
"""
top_index, top_proba = np.argmax(probas, axis=1), np.max(probas, axis=1)
predicted_locations = []
for index, proba, box in zip(top_index, top_proba, boxes):
if index != 0:
predicted_locations.append(
{"class": INDEX_TO_CLASS[index], "proba": proba, "bbox": box}
)
return predicted_locations
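A toy example (hypothetical probabilities) of what this conversion produces; note that boxes classified as Negative (index 0) are dropped:

probas = np.array([
    [0.90, 0.05, 0.02, 0.02, 0.01],  # argmax = 0 (Negative) -> dropped
    [0.10, 0.10, 0.70, 0.05, 0.05],  # argmax = 2 (Primary)  -> kept
])
boxes = [(0, 0, 100, 100), (200, 200, 400, 400)]
print(convert_probas_to_locations(probas, boxes))
# [{'class': 'Primary', 'proba': 0.7, 'bbox': (200, 200, 400, 400)}]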
The next function takes an image of a section, a list of box sizes, and the number of boxes to generate for each size, and returns a list of predicted locations:
def predict_single_image(image_path, boxes_sizes=[1500, 500, 150], boxes_num=[200, 500, 2000]):
image = plt.imread(image_path)
boxes = generate_random_windows_for_image(image, boxes_sizes, boxes_num)
cropped_images = build_cropped_images(
image, boxes, crop_size=IMG_SHAPE[0:2]
)
predicted_probas = model.predict(cropped_images)
predicted_locations = convert_probas_to_locations(predicted_probas, boxes)
return predicted_locations
IMAGE_TO_ANALYSE = 'D-1M06-5.jpg'
image_path = os.path.join(DATA_FOLDER, 'test', IMAGE_TO_ANALYSE)
section = plt.imread(image_path)
true_locations = load_true_locations(os.path.join(DATA_FOLDER, "test", 'labels.csv'), IMAGE_TO_ANALYSE)
display_section_and_locations(section, true_locations)
predicted_locations = predict_single_image(image_path, boxes_sizes=[600,500], boxes_num=[2000,2000])
2021-12-02 19:43:59.660728: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 2408448000 exceeds 10% of free system memory.
Then we filter those predictions with a probability threshold:
def filter_predictions(predicted_locations, proba_threshold=0.8):
def should_keep_prediction(class_, proba):
if class_ == "Negative":
return False
if proba < proba_threshold:
return False
return True
selected_locations = [
prediction
for prediction in predicted_locations
if should_keep_prediction(prediction["class"], prediction["proba"])
]
return selected_locations
selected_locations = filter_predictions(predicted_locations, proba_threshold=0.80)  # the threshold can be varied between 0 and 1
print(len(selected_locations))
13
display_section_and_locations(section, selected_locations)
def compute_iou(boxes1, boxes2):
"""Computes pairwise IOU matrix for given two sets of boxes
Arguments:
        boxes1: A tensor with shape `(N, 4)` representing bounding boxes
            where each box is of the format `[xmin, ymin, xmax, ymax]`.
        boxes2: A tensor with shape `(M, 4)` representing bounding boxes
            where each box is of the format `[xmin, ymin, xmax, ymax]`.
Returns:
pairwise IOU matrix with shape `(N, M)`, where the value at ith row
jth column holds the IOU between ith box and jth box from
boxes1 and boxes2 respectively.
"""
lu = np.maximum(boxes1[:, None, :2], boxes2[:, :2])
rd = np.minimum(boxes1[:, None, 2:], boxes2[:, 2:])
intersection = np.maximum(0.0, rd - lu)
intersection_area = intersection[:, :, 0] * intersection[:, :, 1]
boxes1_area = (boxes1[:,2] - boxes1[:, 0]) * (boxes1[:,3] - boxes1[:, 1])
boxes2_area = (boxes2[:,2] - boxes2[:, 0]) * (boxes2[:,3] - boxes2[:, 1])
union_area = np.maximum(
boxes1_area[:, None] + boxes2_area - intersection_area, 1e-8
)
return np.clip(intersection_area / union_area, 0.0, 1.0)
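As a quick sanity check of compute_iou with toy boxes: two 2×2 boxes offset by 1 pixel intersect over an area of 1, with a union of 4 + 4 - 1 = 7, so the IoU should be 1/7 ≈ 0.143:

boxes1 = np.array([[0, 0, 2, 2]])
boxes2 = np.array([[1, 1, 3, 3]])
print(compute_iou(boxes1, boxes2))  # [[0.14285714]]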
def NMS(selected_locations, iou_threshold=0.4):
selected_locations_NMS = []
selected_locations_sorted = list(sorted(selected_locations, key=itemgetter('proba'), reverse=True))
    while len(selected_locations_sorted) != 0:
best_box = selected_locations_sorted.pop(0)
selected_locations_NMS.append(best_box)
best_box_coords = np.array(best_box["bbox"]).reshape(1,-1)
other_boxes_coords = np.array([location['bbox'] for location in selected_locations_sorted]).reshape(-1,4)
ious = compute_iou(best_box_coords, other_boxes_coords)
for i, iou in reversed(list(enumerate(ious[0]))):
if iou > iou_threshold:
selected_locations_sorted.pop(i)
return selected_locations_NMS
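A toy sanity check of the greedy suppression (hypothetical boxes, not from the dataset): the highest-probability box is kept, any box overlapping it above the IoU threshold is discarded, and the procedure repeats on the remainder:

toy_locations = [
    {"class": "Primary", "proba": 0.9, "bbox": (0, 0, 100, 100)},
    {"class": "Primary", "proba": 0.7, "bbox": (10, 10, 110, 110)},   # IoU ≈ 0.68 with the first box
    {"class": "Primary", "proba": 0.8, "bbox": (500, 500, 600, 600)},  # no overlap
]
kept = NMS(toy_locations, iou_threshold=0.4)
print([loc["proba"] for loc in kept])  # [0.9, 0.8]: the 0.7 box is suppressed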
selected_locations_NMS = NMS(selected_locations, iou_threshold=0.05)
len(selected_locations_NMS)
4
selected_locations_NMS
[{'class': 'Secondary', 'proba': 0.95085704, 'bbox': (1035, 1542, 1545.5596415035188, 2052.559641503519)},
 {'class': 'Secondary', 'proba': 0.9482151, 'bbox': (6198, 2411, 6795.5972059921005, 3008.5972059921)},
 {'class': 'Secondary', 'proba': 0.9461027, 'bbox': (2040, 540, 2456.0433814012827, 956.0433814012828)},
 {'class': 'Secondary', 'proba': 0.83203113, 'bbox': (179, 3565, 678.4266293253669, 4064.426629325367)}]
display_section_and_locations(section, selected_locations_NMS)
del(section)
def find_matching_bbox(bbox, list_of_bboxes, iou_threshold):
"""
    Return (index, success):
    index = index of the bbox with the highest iou
    success = whether that iou is greater than iou_threshold
"""
ious = compute_iou(np.array([bbox]), np.array(list_of_bboxes))[0, :]
index, maximum = np.argmax(ious), np.max(ious)
return index, maximum > iou_threshold
def compute_precision_recall(true_locations, predicted_locations, iou_threshold=0.3):
classes = [
"Primordial",
"Primary",
"Secondary",
"Tertiary",
]
precisions = {}
recalls = {}
thresholds = {}
for predicted_class in classes:
true_boxes = [
location["bbox"]
for location in true_locations
if location["class"] == predicted_class
]
if not true_boxes:
continue
pred_boxes = [
(location["bbox"], location["proba"])
for location in sorted(predicted_locations, key=lambda loc: loc["proba"], reverse=True)
if location["class"] == predicted_class
]
precision = []
recall = []
threshold = []
n_positive_detections = 0
n_true_detected = 0
n_true_to_detect = len(true_boxes)
for i, (pred_bbox, proba) in enumerate(pred_boxes):
if len(true_boxes) > 0:
index, success = find_matching_bbox(pred_bbox, true_boxes, iou_threshold)
if success:
true_boxes.pop(index)
n_positive_detections += 1
n_true_detected += 1
threshold.append(proba)
        precision.append(n_positive_detections / (i + 1))
recall.append(n_true_detected / n_true_to_detect)
precisions[predicted_class] = precision
recalls[predicted_class] = recall
thresholds[predicted_class] = threshold
return precisions, recalls, thresholds
from sklearn.metrics import PrecisionRecallDisplay
def display_metrics(precisions, recalls):
classes = [
"Primordial",
"Primary",
"Secondary",
"Tertiary",
]
colors = ["navy", "turquoise", "darkorange", "cornflowerblue", "teal"]
_, ax = plt.subplots(figsize=(7, 8))
APs = {}
for class_to_predict, color in zip(classes, colors):
try:
precision = [1] + precisions[class_to_predict]
recall = [0] + recalls[class_to_predict]
average_precision = 0
for i in range(1, len(precision)):
average_precision += precision[i] * (recall[i] - recall[i-1])
APs[class_to_predict] = average_precision
display = PrecisionRecallDisplay(
recall=recall,
precision=precision,
average_precision=average_precision,
# linestyle="--"
)
display.plot(ax=ax, name=f"Precision-recall for class {class_to_predict}", color=color, marker="o")
except KeyError:
pass
plt.show(ax)
return APs
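For reference, display_metrics approximates each class's average precision with the rectangle rule over the precision-recall points (prepending a point at precision 1, recall 0):

AP = sum_i P_i * (R_i - R_{i-1})

and the final score computed below is the mean of these APs over the four follicle classes (mAP).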
precisions, recalls, thresholds = compute_precision_recall(true_locations=true_locations, predicted_locations=selected_locations_NMS, iou_threshold=0.1)
precisions, recalls, thresholds
({'Primordial': [], 'Primary': [], 'Secondary': [1.0, 1.0, 1.0, 0.75]},
 {'Primordial': [], 'Primary': [], 'Secondary': [0.3333333333333333, 0.6666666666666666, 1.0, 1.0]},
 {'Primordial': [], 'Primary': [], 'Secondary': [0.95085704, 0.9482151, 0.9461027, 0.83203113]})
APs = display_metrics(precisions, recalls)
print(APs)
{'Primordial': 0, 'Primary': 0, 'Secondary': 1.0}
IMAGES_TO_ANALYSE = ['D-1M06-1.jpg', 'D-1M06-2.jpg', 'D-1M06-3.jpg', 'D-1M06-4.jpg', 'D-1M06-5.jpg']
#for IMAGE_TO_ANALYSE in IMAGES_TO_ANALYSE:
# section = Image.open(os.path.join(DATA_FOLDER, 'test', IMAGE_TO_ANALYSE))
# true_locations = load_true_locations(os.path.join(DATA_FOLDER, "test", 'labels_resized.csv'), IMAGE_TO_ANALYSE)
# display_section_and_locations(section, true_locations)
true_locations_test = []
predicted_locations_test = []
for IMAGE_TO_ANALYSE in IMAGES_TO_ANALYSE:
print(f"predict image {IMAGE_TO_ANALYSE}")
# load image
#section = plt.imread(os.path.join(DATA_FOLDER, 'test', IMAGE_TO_ANALYSE))
# adding true locations to list for all test images
true_locations_test += load_true_locations(os.path.join(DATA_FOLDER, "test", 'labels.csv'), IMAGE_TO_ANALYSE)
# predicted locations for each image
predicted_locations = predict_single_image(os.path.join(DATA_FOLDER, 'test', IMAGE_TO_ANALYSE), boxes_sizes=[1000,750,500], boxes_num = [1000,1000,1000])
selected_locations = filter_predictions(predicted_locations, proba_threshold=0.80)
selected_locations_NMS = NMS(selected_locations, iou_threshold=0.2)
# adding predicted locations to list for all test images
predicted_locations_test += selected_locations_NMS
predict image D-1M06-1.jpg
2021-12-02 19:48:49.537842: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 1806336000 exceeds 10% of free system memory.
predict image D-1M06-2.jpg
2021-12-02 19:49:24.649793: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 1806336000 exceeds 10% of free system memory.
predict image D-1M06-3.jpg
2021-12-02 19:50:01.561389: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 1806336000 exceeds 10% of free system memory.
predict image D-1M06-4.jpg
predict image D-1M06-5.jpg
len(true_locations_test)
70
len(predicted_locations_test)
17
precisions, recalls, thresholds = compute_precision_recall(true_locations=true_locations_test, predicted_locations=predicted_locations_test, iou_threshold=0.3)
precisions, recalls, thresholds
({'Primordial': [], 'Primary': [], 'Secondary': [1.0, 1.0, 0.6666666666666666, 0.75, 0.6, 0.5, 0.5714285714285714, 0.5, 0.4444444444444444, 0.4, 0.36363636363636365, 0.3333333333333333, 0.3076923076923077], 'Tertiary': [1.0, 1.0, 0.6666666666666666, 0.75]},
 {'Primordial': [], 'Primary': [], 'Secondary': [0.125, 0.25, 0.25, 0.375, 0.375, 0.375, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], 'Tertiary': [0.25, 0.5, 0.5, 0.75]},
 {'Primordial': [], 'Primary': [], 'Secondary': [0.9715337, 0.9687936, 0.96105874, 0.95649874, 0.9516881, 0.94534045, 0.9225212, 0.91210365, 0.8728383, 0.86458933, 0.852749, 0.83463573, 0.83233446], 'Tertiary': [0.94692034, 0.9313819, 0.8714523, 0.81594306]})
APs = display_metrics(precisions, recalls)
print(APs)
{'Primordial': 0, 'Primary': 0, 'Secondary': 0.4151785714285714, 'Tertiary': 0.6875}
# mean Average Precision
classes = [
"Primordial",
"Primary",
"Secondary",
"Tertiary",
]
APs_list = [APs[cl] for cl in classes]
APs_list = np.array(APs_list)
mAP = APs_list.mean()
print(f"mAP: {mAP}")
mAP: 0.27566964285714285
Your submission should at least beat this mAP score.
You can test any submission locally by running:
ramp-test --submission <submission folder>
If you want to quickly check that there are no obvious code errors, use the --quick-test
flag to run on only a small subset of the data.
ramp-test --submission <submission folder> --quick-test
See the online documentation for more details.