Deep learning training data contents—ArcGIS Pro

Available with Image Analyst license.

Deep learning training data created in ArcGIS Pro with the Export Training Data For Deep Learning tool typically contains the following folders and files:

Images folder—Contains the image chips that were extracted from the source imagery and exported by the Export Training Data For Deep Learning tool.
Labels folder—Contains the corresponding label for each image chip. Labels indicate the specific features or objects present in the image chip, such as buildings, roads, or trees.
esri_accumulated_stats.json file—Contains statistical information about the training data.
esri_model_definition.emd file—Contains information about the exported training data, in the Esri model definition (.emd) format.
map.txt file—Lists the corresponding image chips and their respective labels to ensure the deep learning model can accurately associate each image with its correct label during training.
stats.txt file—Contains statistical information about the training data. It typically includes details such as images, features, features per image, classes, and class-specific statistics.
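As a quick sanity check, the list above can be compared against an exported folder. The following is a minimal sketch, not part of ArcGIS Pro: the function name `missing_entries` is hypothetical, and the on-disk capitalization of the two folder names is an assumption, so names are compared case-insensitively.

```python
from pathlib import Path

# Entry names taken from the list above. The exact capitalization of the
# two folders on disk is an assumption, so comparison is case-insensitive.
EXPECTED_ENTRIES = [
    "images",
    "labels",
    "esri_accumulated_stats.json",
    "esri_model_definition.emd",
    "map.txt",
    "stats.txt",
]


def missing_entries(training_dir):
    """Return the expected entries that are not present in training_dir."""
    present = {p.name.lower() for p in Path(training_dir).iterdir()}
    return [name for name in EXPECTED_ENTRIES if name not in present]
```

Because the export only typically contains these entries, a nonempty result is a prompt to inspect the folder rather than proof that the export failed.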

Folder structure of training data

Esri accumulated statistics file

The esri_accumulated_stats.json file contains statistical information about the training data that was exported. The file includes the following key parameters:

Version—The version number of the file.
NumBands—The total number of spectral bands in the input images.
TileSizeX—The X dimension for the image chips.
TileSizeY—The Y dimension for the image chips.
NumClasses—The total number of object categories or classes.
NumTiles—The total number of image chips.
OutputFeatures—Specifies whether the model will be configured to output features or pixels. If the parameter is set to true, it will output features. If the parameter is set to false, it will output pixels.
MetaDataMode—The metadata format that is used for the labels. For example, for an object detection task, the type can be PASCAL_VOC_rectangles or KITTI_rectangles. For a list of available formats, see the Metadata Format parameter within the Export Training Data For Deep Learning tool.
MinCellSize—The minimum pixel size of the input raster and the spatial reference information.
MaxCellSize—The maximum pixel size of the input raster and the spatial reference information.
Classes—The list of classes, including their value, name, and color.
FeatureStats—The statistics about the features.
- NumImagesTotal—The total number of image chips.
- NumFeaturesTotal—The total number of features.
- NumImagesPerClass—The number of images per class.
- NumFeaturesPerClass—The number of features per class.
- NumFeaturesPerImage—The statistical information about the distribution of features per image, such as minimum, maximum, mean, sum, and count.
- FeatureAreaPerClass—The statistical information about the size of features per class, such as minimum, maximum, mean, sum, and count.
InputRastersProps—Information about the input raster, such as the raster count, the sensor name, and the band names.
- RasterCount—The number of input rasters.
- SensorName—The sensor name for the input raster.
- BandNames—The band names for the input raster.
BandStatsState—The statistical information about each band in the input raster, such as minimum, maximum, mean, and standard deviation.

This file is primarily for internal use. Modifying this file manually is not recommended and may lead to unexpected results.
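Reading the file is safe even though editing it is discouraged. The sketch below opens it read-only and returns a handful of the parameters listed above. It assumes those parameters are top-level keys of the JSON document; the function name `summarize_accumulated_stats` is hypothetical.

```python
import json


def summarize_accumulated_stats(path):
    """Read esri_accumulated_stats.json and return a few of its values.

    The file is opened read-only and is never written back.
    """
    with open(path, encoding="utf-8") as f:
        stats = json.load(f)
    # Key names follow the parameter list above. Treating them as top-level
    # keys is an assumption; .get() returns None for any key that is absent.
    keys = [
        "Version",
        "NumBands",
        "TileSizeX",
        "TileSizeY",
        "NumClasses",
        "NumTiles",
        "MetaDataMode",
    ]
    return {key: stats.get(key) for key in keys}
```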

Esri model definition file

The Esri model definition (.emd) file contains information about the exported training data. The file includes the following key parameters:

ImageHeight—The height dimension of the image chips.
ImageWidth—The width dimension of the image chips.
MetaDataMode—The metadata format that is used for the labels. For example, for an object detection task, the type can be PASCAL_VOC_rectangles or KITTI_rectangles. For a list of available formats, see the Metadata Format parameter within the Export Training Data For Deep Learning tool.
BlackenAroundFeature—Specifies whether the pixels around each object or feature in each image chip will be masked out. The possible values are true or false.
IsMultidimensional—Specifies whether the input data is multidimensional or time aware. The possible values are true or false.
CropTileMode—Specifies whether the exported tiles are cropped so that they are all the same size.
- Fixed size—The exported tiles are the same size and will center on the feature. This is the default.
- Bounding box—The exported tiles are cropped so that the bounding geometry surrounds only the feature in the tile.
MinCellSize—The minimum pixel size of the input raster and the spatial reference information.
MaxCellSize—The maximum pixel size of the input raster and the spatial reference information.
ImageSpaceUsed—The type of reference system used to create training data. The options are MAP_SPACE or PIXEL_SPACE.
Classes—The list of object categories or classes. Each class has the following information:
- Value—The unique numerical identifier for the class.
- Name—The name of the class.
- Color—The color code used to visualize the class in the output.
InputRastersProps—Information about the input raster, such as the raster count, the sensor name, and the band names.
- RasterCount—The number of input rasters.
- SensorName—The sensor name for the input raster.
- BandNames—The band names for the input raster.
AllTilesStats—The statistical information about each image chip, such as minimum, maximum, mean, and standard deviation.

Older esri_model_definition.emd files may include additional optional parameters such as Framework, ModelConfiguration, ModelType, ModelFile, Description, ExtractBands, DataRange, ModelPadding, BatchSize, PerProcessGPUMemoryFraction, or WellKnownBandNames.
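An .emd file is JSON text, so the class list can be inspected with a standard JSON parser. This is a minimal sketch that assumes `Classes` is a list of objects carrying the `Value`, `Name`, and `Color` keys described above; the function name `read_emd_classes` is hypothetical.

```python
import json


def read_emd_classes(emd_path):
    """Return (value, name, color) tuples for each class in an .emd file."""
    with open(emd_path, encoding="utf-8") as f:
        emd = json.load(f)
    # Assumes "Classes" is a list of objects with Value, Name, and Color
    # keys, as the parameter list above suggests. Missing keys yield None.
    return [
        (cls.get("Value"), cls.get("Name"), cls.get("Color"))
        for cls in emd.get("Classes", [])
    ]
```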

Map text file

The map.txt file lists the corresponding image chips and their respective labels, to ensure the deep learning model can accurately associate each image with its correct label during training.
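The pairing can be read back with a few lines of Python. The sketch below assumes each nonblank line holds an image path and a label path separated by whitespace; that layout is an assumption, not something this topic specifies, and the function name `read_map_file` is hypothetical.

```python
def read_map_file(map_path):
    """Return a list of (image_path, label_path) pairs from map.txt."""
    pairs = []
    with open(map_path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Assumed layout: image path, whitespace, label path.
            # Image paths that contain spaces would need a different split.
            parts = line.split(maxsplit=1)
            if len(parts) != 2:
                raise ValueError(f"line {line_number}: expected two paths")
            pairs.append((parts[0], parts[1]))
    return pairs
```

Checking that every returned path exists on disk is a simple way to confirm that no chip or label went missing after the export was copied or moved.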

Sample map.txt file

Statistics file

The stats.txt file contains statistical information about the training data. It typically includes details such as images, features, features per image, classes, and class-specific statistics:

images—Information about the image chips, such as the total number of image chips, the number of bands, and the dimensions information.
features—The total number of features in the images.
features per image—The statistical information about the distribution of features per image, such as the minimum, mean, and maximum values.
classes—The total number of different object categories or classes.
Class-specific statistics—Information for each class, such as the class name, class value, the number of images, the number of features, the minimum size, the mean size, and the maximum size of the objects belonging to that class.

Sample stats.txt file
