Generate MNIST digit sequence images from MNIST dataset
The goal of this project is to write a program that can generate images representing sequences of numbers, for data augmentation purposes. These images would be used to train classifiers and generative deep learning models. These examples of generated images are helpful in inspecting the characteristics of the generated images and in inspecting the trained models behaviours.
This project uses MNIST dataset images to generate the image sequence. The MNIST dataset contains 60000 training images and 10000 test images of handwritten digits.
Though this project focuses only on generating sequence images from MNIST dataset, the code has been written in a way to extend it easily to any general-purpose image sequence generation tasks.
As mentioned above, image sequences have to be generated from the MNIST handwritten digits images and spacing them accordingly.
The exact requirements are as follows:
To generate an image of a sequence, the digits have to be stacked horizontally and the spacing between them should follow a uniform distribution over a range determined by two user specified numbers.
The numerical values of the digits themselves are provided by the user and each digit in the generated sequence is then chosen randomly from one of its representations in the MNIST dataset. The width of the output image in pixels is specified by the user.
DESIGN.md
in the docs/
directory.The project can either be installed as a pip package (API) OR can be run directly using the client script (client_script.py
) in the repository.
There are two ways to install this project as a package. They are:
.zip
), unzip it and then running the following command pip install .
from the project’s root directory. Another option is to run the command pip install image-sequence-0.0.1.zip
directly without unzipping it.pip install git+<REPO_HTTPS_LINK>
. For private repos, you may need to add the private key and then pip install using the repo’s SSH link. Please google for more information.Once installed, the package can be imported into any 3rd party programs (like an API) as shown below:
Usage examples:
Simple python script to generate digit sequence from DEFAULT configuration
from image_sequence.generator import DigitSequence
"""
Arguments:
digits: A list-like containing the numerical values of the digits from which the sequence will be generated (for example [3, 5, 0]).
spacing_range: a (minimum, maximum) pair (tuple), representing the min and max spacing between digits. Unit should be pixel.
image_width: specifies the width of the image in pixels.
"""
digits, spacing_range, image_width = range(10), (0, 20), 390
generator = DigitSequence()
gen_image = generator.generate_numbers_sequence(digits, spacing_range, image_width)
# Saves the generated image in the current directory
saved_filepath = generator.save_image(gen_image, digits)
Simple python script to generate digit sequence from CUSTOM configuration
from image_sequence.generator import DigitSequence
"""
Arguments:
digits: A list-like containing the numerical values of the digits from which the sequence will be generated (for example [3, 5, 0]).
spacing_range: a (minimum, maximum) pair (tuple), representing the min and max spacing between digits. Unit should be pixel.
image_width: specifies the width of the image in pixels.
"""
digits, spacing_range, image_width = range(10), (0, 20), 390
config_filepath = "./api_examples/example_config.json"
generator = DigitSequence(config_filepath)
gen_image = generator.generate_numbers_sequence(digits, spacing_range, image_width)
# Saves the generated image in the current directory
saved_filepath = generator.save_image(gen_image, digits)
The output image will have height of 28 pixels and width of image_width
(from input) and is saved in the directory the code was executed by default (but can give other paths as argument).
An example script (api_example.py
) and an example config file (example_config.json
) are added in the api_examples
directory for reference.
NOTE: Running any script (which imports the package) from the project root directory might import the
image_sequence
module in the project directory rather than the installed package. The results will be the same but just that the code in the module would be used and there would no point installing the package then.
Package dependencies:
The package is supported for Python>=3.5
.
The following modules are required to run the packages. These dependencies are automatically installed during the package installation. So, there is no need for the user to manually install these modules.
numpy
Pillow
imgaug
The project requires the Pillow
package to save the generated image and the imgaug
module (link) for digit images transformation. Also, the program uses cv2
(installed as a dependency for imgaug
) for image normalization. Please look at the configuration section (or technical docs) for more information regarding image transformation.
client_script.py
file in the repository can be used to generate image sequences. pip install -r requirements.txt
from root directory installs all required dependencies.python client_script.py <SEQUENCE> <MIN_SPACING> <MAX_SPACING> <IMAGE_WIDTH> --config(-c) <CONFIG_FILEPATH>
where:The command python client_script.py -h
can be run for additional information.
Usage examples:
Default config example:
Running the following command
python client_script.py 0,1 0 30 86
will save an image like this:
Custom config example:
python client_script.py 0,1 0 30 86 -c ./api_examples/example_config.json
between
, the program will raise an error since between
spacing does not make sense if the list has just 1 digit. This is valid for both fixed
and variable
spacing types.fixed
and the available pixels for adding spaces (obtained by subtracting space occupied by the image from image_width) cannot be divided equally among all the available spacings, then the program will generate an error. fixed
spacing case, even if the available pixels can be divided equally but the value doesnt lie within the spacing range, the program will raise an exception.spacing_range
and image_width
.(num_digits-1)
and (num_digits+1)
, then that image_width input will support both edge
and between
spacing types.variable
spacing types since the spacings might be huge between digits and after resizing to the image_width, the digits in the sequence might be squeezed.The project generates sequence images based on the provided configuration. These configurations can be written onto a json file and provided as input to the code. If no configuration file is provided, a default configuration (explained below) is used.
The possible config options are:
image
transform
:bool
true
, a set of transformations are applied to every digit image.spacing
spacing
attribute deals with options related to the spacings between digits in the image sequence.|| DIGIT_1 || DIGIT_2 ||
where ||
represent the possible spacing gaps in the image sequence. Hence, given n
digits, there can be a maximum of n+1
spacings (n-1
spacings inbetween digits and 2 edge spacings) in the final image sequence. Also, the spacing gaps can be of fixed length or varying lengths. The subattributes try to config the spacings depending on the given information.type
:str
fixed
, variable
subtype
:str
edge
, between
type
attribute. Given a spacing type, this attribute tells whether to consider the spacings just between images or even the edge spacings as well.output_format
:str
png
, jpg
, jpeg
jpg
format might reduce the final image_width (in the saved image) since jpg
uses lossless compression algorithm.If no config file is provided, the following configuration (default config) will be used:
{
"image": {
"transform": false
},
"spacing": {
"type": "variable",
"subtype": "between"
},
"output_format": "png"
}
NOTE: The problem statement specifies to sample space widths BETWEEN digits from a uniform distribution. Since the variable
spacing type, between
spacing subtype and false
transform satisfies this requirement, this is set as the default configuration.
There are several tests written to test the module.
Test dependencies:
The tox
module is used to run tests in various environments. Currently, the project is configured to be tested in py35
and py36
environments.
Running the tests:
The test requirements can be installed by running the command pip install -r requirements_test.txt
.
Once installed, the tests can be run using the command
tox
from the project’s root directory.
Note: tox
is very helpful when the project scales and coupled with continuous integration and deployment.
As mentioned in the summary, though this project focuses only on generating digit sequence from MNIST images, the project can be easily extended to generate any image sequence given the dataset and their labels (eg. ImageNet dataset).