Admission exercises

This file contains various exercises for the admission.

Debugging exercises

These exercises revolve around using the debugger to find bugs in code. They range from incorrect data types and implementation errors to bugs found in real tutorials on the internet.

scientific_programming.fruit_id.id_to_fruit(fruit_id, fruits)

This method returns the fruit name by getting the string at a specific index of the set.

Parameters
  • fruit_id (int) – The id of the fruit to get

  • fruits (Set[str]) – The set of fruits to choose the id from

Return type

str

Returns

The string corresponding to the index fruit_id

This method is part of a series of debugging exercises. Each Python method of this series contains a bug that needs to be found.

1   The method does not return the fruit at the correct index. Why is the returned result wrong?
2   How could this be fixed?

This example demonstrates the issue: name1, name3 and name4 are expected to correspond to the strings at the indices 1, 3, and 4: ‘orange’, ‘kiwi’ and ‘strawberry’.

>>> name1 = id_to_fruit(1, {"apple", "orange", "melon", "kiwi", "strawberry"})
>>> name3 = id_to_fruit(3, {"apple", "orange", "melon", "kiwi", "strawberry"})
>>> name4 = id_to_fruit(4, {"apple", "orange", "melon", "kiwi", "strawberry"})
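The expectation stated above only holds for an ordered container. As a minimal sketch of one direction to explore, the snippet below indexes into a list instead of a set. The helper name id_to_fruit_from_list is hypothetical and is not part of scientific_programming.

```python
# Hypothetical helper for illustration only; not the exercise's own function.
def id_to_fruit_from_list(fruit_id, fruits):
    """Return the fruit name stored at position fruit_id of an ordered list."""
    return fruits[fruit_id]


fruits_list = ["apple", "orange", "melon", "kiwi", "strawberry"]

# A list keeps insertion order, so the indices match the written order.
name1 = id_to_fruit_from_list(1, fruits_list)
name3 = id_to_fruit_from_list(3, fruits_list)
name4 = id_to_fruit_from_list(4, fruits_list)
print(name1, name3, name4)

# A set has no defined order: iterating it may visit the same strings
# in a different sequence than the one they were written in.
print(list(set(fruits_list)))
```

The second print is left without an expected output on purpose, because the iteration order of a set of strings is not guaranteed to match the order in the source code.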

scientific_programming.swap.swap(coords)

This method will flip the x and y coordinates in the coords array.

Parameters

coords (ndarray) –

A numpy array of bounding box coordinates with shape [n,5] in format:

[[x11, y11, x12, y12, classid1],
 [x21, y21, x22, y22, classid2],
 ...
 [xn1, yn1, xn2, yn2, classidn]]

Returns

The new numpy array where the x and y coordinates are flipped.

This method is part of a series of debugging exercises. Each Python method of this series contains a bug that needs to be found.

1   Can you spot the obvious error?
2   After fixing the obvious error, the result is still wrong. How can this be fixed?

>>> import numpy as np
>>> coords = np.array([[10, 5, 15, 6, 0],
...                    [11, 3, 13, 6, 0],
...                    [5, 3, 13, 6, 1],
...                    [4, 4, 13, 6, 1],
...                    [6, 5, 13, 16, 1]])
>>> swapped_coords = swap(coords)

The example demonstrates the issue. The returned swapped_coords are expected to have swapped x and y coordinates in each of the rows.
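To make the expected result concrete, here is a minimal sketch of one way to swap the columns, assuming the [x1, y1, x2, y2, classid] layout described above. The function name swap_columns is hypothetical. It is not the exercise's swap function and not necessarily the intended fix.

```python
import numpy as np


def swap_columns(coords):
    """Return a new array with x and y swapped for both corners of each box."""
    # Integer-array indexing builds a copy, so the input array is not
    # modified and no column is overwritten before it has been read.
    return coords[:, [1, 0, 3, 2, 4]]


coords = np.array([[10, 5, 15, 6, 0],
                   [11, 3, 13, 6, 0]])
swapped = swap_columns(coords)
print(swapped)
```

Selecting all columns in one indexing step avoids assigning through views of the same array, which is a common source of surprising results when swapping NumPy columns in place.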

scientific_programming.plot_data.plot_data(csv_file_path)

This code plots the precision-recall curve based on data from a .csv file, where precision is on the x-axis and recall is on the y-axis. It is not important right now what precision and recall mean.

Parameters

csv_file_path (str) – The CSV file containing the data to plot.

This method is part of a series of debugging exercises. Each Python method of this series contains a bug that needs to be found.

1   For some reason the plot is not showing correctly. Can you find out what is going wrong?
2   How could this be fixed?

This example demonstrates the issue. It first generates some data in CSV format and then plots it using the plot_data method. If you manually check the coordinates and then check the plot, they do not correspond.

>>> import csv
>>> f = open("data_file.csv", "w", newline="")
>>> w = csv.writer(f)
>>> _ = w.writerow(["precision", "recall"])
>>> w.writerows([[0.013,0.951],
...              [0.376,0.851],
...              [0.441,0.839],
...              [0.570,0.758],
...              [0.635,0.674],
...              [0.721,0.604],
...              [0.837,0.531],
...              [0.860,0.453],
...              [0.962,0.348],
...              [0.982,0.273],
...              [1.0,0.0]])
>>> f.close()
>>> plot_data('data_file.csv')
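The manual check mentioned above can be sketched as follows. This is only an illustration of reading the file back and inspecting the values. It writes a shortened copy of the data to a temporary directory (an assumption made so the snippet is self-contained) and does not reproduce how plot_data itself reads the file.

```python
import csv
import os
import tempfile

# A shortened copy of the example data, written to a temporary file.
rows = [[0.013, 0.951], [0.376, 0.851], [1.0, 0.0]]
path = os.path.join(tempfile.mkdtemp(), "data_file.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["precision", "recall"])
    writer.writerows(rows)

# The csv module returns every field as a string, so the values are
# converted to float explicitly before they are compared or plotted.
with open(path, newline="") as f:
    reader = csv.DictReader(f)
    points = [(float(r["precision"]), float(r["recall"])) for r in reader]

for precision, recall in points:
    print(precision, recall)
```

Comparing these pairs with the points drawn in the plot is one way to see where the two stop corresponding.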

scientific_programming.gan.train_gan(batch_size=32, num_epochs=100, device='cpu')

The method trains a Generative Adversarial Network and is based on: https://realpython.com/generative-adversarial-networks/

The Generator network tries to generate convincing images of handwritten digits. The Discriminator needs to detect whether an image was created by the Generator or is a real image from a known dataset (MNIST). If both the Generator and the Discriminator are optimized, the Generator is able to create images that are difficult to distinguish from real images. This is the goal of a GAN.

On a first attempt, this code produces the expected results after about 50 epochs.

Parameters
  • batch_size (int) – The number of images in one training batch.

  • num_epochs (int) – The number of epochs to train the gan.

  • device (str) – The computing device to use. If CUDA is installed and working, cuda:0 is chosen; otherwise ‘cpu’ is chosen. Note: Training a GAN on the CPU is very slow.

This method is part of a series of debugging exercises. Each Python method of this series contains a bug that needs to be found.

It contains at least two bugs: one structural bug and one cosmetic bug. Both bugs are from the original tutorial.

1   Changing the batch_size from 32 to 64 triggers the structural bug.
2   Can you also spot the cosmetic bug?

Note: A thorough understanding of GANs is not necessary to fix this bug.

Change the batch size to 64 to trigger the bug with message: ValueError: “Using a target size (torch.Size([128, 1])) that is different to the input size (torch.Size([96, 1])) is deprecated. Please ensure they have the same size.”

>>> train_gan(batch_size=32, num_epochs=100)
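The sizes in the error message above can be reproduced with simple arithmetic. The sketch below rests on several assumptions that the documentation does not confirm: that the standard MNIST training split of 60,000 images is used, that each step concatenates one batch of real images with batch_size generated images, and that the target labels are always built from batch_size. The function name last_step_sizes is hypothetical.

```python
MNIST_TRAIN_SIZE = 60_000  # assumption: standard MNIST training split


def last_step_sizes(dataset_size, batch_size):
    """Return (input_rows, target_rows) for the final batch of an epoch."""
    remainder = dataset_size % batch_size
    # The final batch is smaller when the dataset size is not a multiple
    # of batch_size (assumption: incomplete batches are not dropped).
    real = remainder if remainder else batch_size
    generated = batch_size         # assumption: always batch_size samples
    input_rows = real + generated  # real and generated samples concatenated
    target_rows = 2 * batch_size   # assumption: labels sized from batch_size
    return input_rows, target_rows


print(last_step_sizes(MNIST_TRAIN_SIZE, 32))  # (64, 64)
print(last_step_sizes(MNIST_TRAIN_SIZE, 64))  # (96, 128)
```

Under these assumptions, a batch size of 32 divides the dataset evenly, while 64 leaves a final batch of 32 real images, which matches the 96 and 128 reported in the message.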
scientific_programming.load_tests(loader, tests, ignore)