This work presents a modular software pipeline and end-to-end workflow for video-based animal re-identification, which assigns consistent individual IDs by clustering multiview spatio-temporal tracks with minimal human intervention. Starting from raw video, the system detects and tracks animals, scores and selects informative left/right views, computes embeddings, clusters annotations by viewpoint, and then links clusters across time and varying perspectives using spatio-temporal continuity. Automated consistency checks resolve remaining ambiguities. Preliminary experiments demonstrate near-perfect identification accuracy with very limited manual verification. The workflow is designed to be generalizable across species. Currently, trained models support Grevy’s and Plains zebras, with plans to expand to a broader range of species.
- Software
- Animal-Ecology
- Animal Re-Identification (re-id): The process of determining if an animal has been seen before by matching it against a database of images with known identity labels. The paper addresses this problem in the context of long video sequences.
- Multiview Spatio-Temporal Track Clustering: A novel framework introduced for animal re-identification. It works by clustering tracked animal detections from different viewpoints (multiview) and across time (spatio-temporal) to correctly identify individuals.
- Identifiable Annotation (IA): An annotation, or detected animal image, that contains sufficient distinguishing information for reliable individual identification. For Grévy's zebras, an IA must show both the hip and chevron patterns on either the left or right side.
- Human-in-the-loop: The involvement of human decisions to confirm animal identities when the automated system is uncertain or to correct algorithmic errors.
The following is a simplified hierarchy of the file structure used in this repository.
VAREID
├── algo
| ├── detection/
| ├── frame_sampling/
| ├── ia_classification/
| ├── import/
| ├── lca/
| ├── miew_id/
| ├── postprocessing/
| ├── species_identification/
| └── viewpoint_classification/
├── drivers/
├── models/
├── tools/
├── libraries
| ├── db/
| ├── io/
| ├── logging/
| ├── ui/
| └── constants.py
├── config.yaml
├── environment.yaml
└── snakefile.smk
The repository can generally be split into four groups of code:
Algorithm components are the individual steps of the pipeline, such as detection or species classification. They are contained in VAREID/algo/[component_name]/ in separate directories. In some specific cases, two components may share the same directory, such as video_detector.py and image_detector.py. Their only dependencies within this repository are library functions. Every component should have an executable script to run that step of the pipeline.
For more information on each algorithm component, please view the README files in each of their corresponding directories. Their arguments are also documented via the argparse python library.
The pipeline's workflow is built using Snakemake. The workflow is defined in snakefile.smk, which calls the driver scripts.
The snakefile reads in a configfile structured like config.yaml. To build a configfile, please follow the notations found in the example config.yaml.
Driver scripts serve as connectors between the pipeline and the algorithm components. They handle determining conditional arguments passed to algorithm components (such as flags or variations in parameters based on image vs. video mode), setting up logging, building the command, and executing the algorithm component. Every algorithm component must have a driver script associated with it.
The libraries contain all utility functions used throughout the pipeline. They cover database operations (e.g. image tables and directories), IO (image/video importing, loading/saving data, etc.), logging, UI, and more.
In addition to the above structure, there are a few more important directories and files to note.
All models are stored in the VAREID/models/ directory. This primarily includes the .pth files for the viewpoint classifier and IA classifier models. It also includes the verifier probabilities used by LCA, but this is being phased out.
The VAREID/tools/ directory contains some prototype tools that provide convenience and extra functionality to users. visualize.py is a script that draws and labels specific annotations. extrapolate_ggr_gps.py extrapolates GPS data for images missing it, which is functionality specific to images taken by the same camera with timestamp data.
environment.yaml defines the Python environment requirements for this repository. Use this file with a package manager like conda to build an environment. More on this in the How-To section.
The following is a flowchart describing the workflow of the pipeline, along with the associated driver script for each stage.
flowchart LR
subgraph s1["data_video == False"]
n15["<b>Miew-Id</b><br>mid_driver.py"]
n20["<b>LCA</b><br>lca_driver.py"]
end
subgraph s2["data_video == True"]
n16["<b>Frame Sampling</b><br>fs_driver.py"]
n17["<b>Miew-Id</b><br>mid_driver.py"]
n18["<b>LCA</b><br>lca_driver.py"]
end
subgraph s3["data_video == False"]
n21["<b>Image Importer</b><br>import_image_detector.py"]
n25["<b>Image Detector</b><br>dt_image_driver.py"]
end
subgraph s4["data_video == True"]
n22["<b>Video Importer<br></b>import_video_detector.py"]
n24["<b>Video Detector<br></b>dt_video_driver.py"]
end
n6["<b>Species Classifier</b><br>si_driver.py"] --> n7["<b>Viewpoint Classifier<br></b>vc_driver.py"]
n7 --> n8["<b>IA Classifier<br></b>iac_driver.py"]
n8 --> n10["<b>IA Filterer</b><br>iaf_driver.py"]
n10 --> n16 & n15
n16 --> n17
n16 -- annotations --> n18
n17 -- embeddings --> n18
n15 -- embeddings --> n20
n10 -- annotations --> n20
n22 --> n24
n21 --> n25
n24 --> n6
n25 --> n6
n18 --> n26["<b>Postprocessing</b><br>post_driver.py"]
n16@{ shape: rect}
n21@{ shape: rect}
n25@{ shape: rect}
n7@{ shape: rect}
n8@{ shape: rect}
n26@{ shape: rect}
style n26 stroke-width:4px,stroke-dasharray: 5
One important detail to note immediately is that postprocessing is external to the pipeline's workflow! This step, as explained below, requires human interaction and thus is not run automatically by the pipeline. It is run separately and for video data only.
The pipeline's input is any recursive directory structure. For image mode, the pipeline will read in ALL images within the provided directory and its child directories. For video mode, we will read all videos. When running the pipeline on videos, each video must have a matching (same file name) SRT file located in the same directory. In other words, the absolute paths to the video and SRT file only differ by their file extension. Each entry of the SRT file should be formatted like the following:
1
00:00:00,000 --> 00:00:00,033
<font size="36">SrtCnt : 1, DiffTime : 33ms
2023-01-19 11:48:31,795,565
[iso : 100] [shutter : 1/2000.0] [fnum : 280] [ev : 0] [ct : 4823] [color_md : default] [focal_len : 224] [latitude: 0.386694] [longitude: 36.893198] [altitude: 23.900000]
</font>
Importing's main goal is generating the image_data.json or video_data.json file describing each image (or frame for videos) in terms of metadata, including the absolute path to the image. For videos, this also includes splitting and saving the video into frames as well as parsing an SRT file to assign timestamps to frames.
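As a rough illustration of the SRT parsing described here, the sketch below extracts the wall-clock timestamp and GPS fields from a single entry in the format shown earlier. It is not the repository's importer. The function name `parse_srt_entry` is hypothetical, and reading the two trailing groups of the timestamp line as milliseconds and microseconds is an assumption.

```python
import re
from datetime import datetime

# One entry in the documented format (values copied from the example above).
ENTRY = """1
00:00:00,000 --> 00:00:00,033
<font size="36">SrtCnt : 1, DiffTime : 33ms
2023-01-19 11:48:31,795,565
[iso : 100] [shutter : 1/2000.0] [fnum : 280] [ev : 0] [ct : 4823] [color_md : default] [focal_len : 224] [latitude: 0.386694] [longitude: 36.893198] [altitude: 23.900000]
</font>"""

# Matches the wall-clock line, e.g. "2023-01-19 11:48:31,795,565".
TIMESTAMP_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}),(\d{3})$", re.MULTILINE
)
# Matches bracketed "key : value" pairs such as "[latitude: 0.386694]".
FIELD_RE = re.compile(r"\[(\w+)\s*:\s*([^\]]+)\]")


def parse_srt_entry(entry: str) -> dict:
    """Extract the timestamp and GPS fields from one SRT entry."""
    match = TIMESTAMP_RE.search(entry)
    if match is None:
        raise ValueError("no timestamp line found in SRT entry")
    base, millis, micros = match.groups()
    # Assumption: the trailing groups are milliseconds, then microseconds.
    timestamp = datetime.strptime(base, "%Y-%m-%d %H:%M:%S").replace(
        microsecond=int(millis) * 1000 + int(micros)
    )
    fields = {key: value.strip() for key, value in FIELD_RE.findall(entry)}
    return {
        "timestamp": timestamp,
        "latitude": float(fields["latitude"]),
        "longitude": float(fields["longitude"]),
        "altitude": float(fields["altitude"]),
    }


parsed = parse_srt_entry(ENTRY)
print(parsed["timestamp"].isoformat(), parsed["latitude"], parsed["longitude"])
```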
Detection uses YOLO to create detections for all images in the json files from above. Video detection also generates tracking IDs for each detection. The detections are saved as annotations.
The species of each annotation is predicted with BioCLIP. For now, the possible labels are Grévy's zebra, Plains zebra, or neither.
The viewpoint of each annotation is generated. The viewpoint is a combination of the following classifiers: [up, front, back, left, right].
Each annotation is assessed for its quality and its ability to be identified. It is assigned a score and, based on a threshold, a boolean marking whether it is identifiable.
This step filters out all annotations that were marked as not identifiable and simplifies the viewpoint to left or right.
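The filtering and viewpoint simplification can be sketched as follows. This is a minimal illustration, not the repository's IA filterer. The field names (`viewpoint`, `ia_score`), the threshold value, and the choice to drop annotations with no clear left or right side are all assumptions.

```python
from typing import Optional

IA_THRESHOLD = 0.5  # hypothetical value; the real threshold lives in the component config


def simplify_viewpoint(viewpoint: str) -> Optional[str]:
    """Reduce a combined viewpoint such as 'backleft' to 'left' or 'right'."""
    has_left = "left" in viewpoint
    has_right = "right" in viewpoint
    if has_left == has_right:  # neither side, or (unexpectedly) both
        return None
    return "left" if has_left else "right"


def filter_identifiable(annotations, threshold=IA_THRESHOLD):
    """Keep annotations that pass the IA threshold and show a left or right side."""
    kept = []
    for annot in annotations:
        side = simplify_viewpoint(annot["viewpoint"])
        if annot["ia_score"] >= threshold and side is not None:
            kept.append({**annot, "viewpoint": side})
    return kept


annotations = [
    {"id": 1, "viewpoint": "backleft", "ia_score": 0.9},
    {"id": 2, "viewpoint": "right", "ia_score": 0.2},   # below threshold
    {"id": 3, "viewpoint": "front", "ia_score": 0.8},   # no left/right side
    {"id": 4, "viewpoint": "upright", "ia_score": 0.7},
]
kept = filter_identifiable(annotations)
print([(a["id"], a["viewpoint"]) for a in kept])
```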
This is a video-only process. This step further filters annotations by performing non-maximum suppression over sets of consecutive tracking IDs, maximizing the score from IA classification.
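One possible reading of this step is temporal non-maximum suppression within each track: the best-scoring annotation suppresses weaker ones from nearby frames. The sketch below follows that reading and is not the repository's frame_sampling.py. The field names (`tracking_id`, `frame`, `ia_score`) and the window size are assumptions.

```python
from collections import defaultdict


def temporal_nms(annotations, window=30):
    """Per track, keep only annotations that are the best within `window` frames."""
    by_track = defaultdict(list)
    for annot in annotations:
        by_track[annot["tracking_id"]].append(annot)

    kept = []
    for track in by_track.values():
        selected = []
        # Visit the best-scoring annotations first so they suppress weaker neighbours.
        for annot in sorted(track, key=lambda a: a["ia_score"], reverse=True):
            if all(abs(annot["frame"] - s["frame"]) >= window for s in selected):
                selected.append(annot)
        kept.extend(selected)
    return sorted(kept, key=lambda a: (a["tracking_id"], a["frame"]))


annotations = [
    {"tracking_id": 1, "frame": 0, "ia_score": 0.6},
    {"tracking_id": 1, "frame": 10, "ia_score": 0.9},
    {"tracking_id": 1, "frame": 20, "ia_score": 0.7},
    {"tracking_id": 1, "frame": 50, "ia_score": 0.5},
    {"tracking_id": 2, "frame": 5, "ia_score": 0.4},
]
sampled = temporal_nms(annotations, window=30)
print([(a["tracking_id"], a["frame"]) for a in sampled])
```

Sorting by score before checking distances is what makes this a suppression scheme rather than a fixed-stride subsample: a strong annotation always survives, and only weaker ones close to it are dropped.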
This step generates embeddings for all remaining annotations.
This step clusters the annotations by their embeddings and assigns cluster ids.
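The pipeline uses LCA for this step, which is more involved than can be shown in a few lines. As a much simpler stand-in (explicitly not LCA), the sketch below links annotations whose embeddings exceed a cosine similarity threshold and labels the resulting connected components. The threshold and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_by_threshold(embeddings, threshold=0.8):
    """Stand-in clustering: connected components over similar pairs (union-find)."""
    parent = list(range(len(embeddings)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine_similarity(embeddings[i], embeddings[j]) >= threshold:
                parent[find(j)] = find(i)

    # Relabel roots as consecutive cluster ids in order of first appearance.
    labels = {}
    cluster_ids = []
    for i in range(len(embeddings)):
        cluster_ids.append(labels.setdefault(find(i), len(labels)))
    return cluster_ids


embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
cluster_ids = cluster_by_threshold(embeddings)
print(cluster_ids)
```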
This step applies final consistency checks, resolves cluster overlaps, handles manual verification when needed, assigns final unique IDs, and integrates non-identifiable annotations via tracking links.
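To illustrate what integrating non-identifiable annotations via tracking links could look like, the sketch below copies a final ID to every annotation that shares a tracking ID with an identified one. This is an assumption about the mechanism, not the repository's postprocessing code. The field names (`tracking_id`, `final_id`) are hypothetical, and it assumes each track has already been resolved to a single individual.

```python
def propagate_ids(identified, unidentified):
    """Give unidentified annotations the final ID of their track, if one exists."""
    track_to_id = {}
    for annot in identified:
        # Assumption: conflicts are already resolved, so one ID per track.
        track_to_id.setdefault(annot["tracking_id"], annot["final_id"])
    return [
        {**annot, "final_id": track_to_id.get(annot["tracking_id"])}
        for annot in unidentified
    ]


identified = [
    {"tracking_id": 7, "final_id": "zebra_A"},
    {"tracking_id": 9, "final_id": "zebra_B"},
]
unidentified = [
    {"tracking_id": 7, "frame": 12},
    {"tracking_id": 8, "frame": 30},  # track never matched to an individual
]
linked = propagate_ids(identified, unidentified)
print([a["final_id"] for a in linked])
```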
This section walks through how to use this repository and its features. It is split into sections based on the types of tasks you're looking to accomplish.
This pipeline must be run in a Linux-based conda environment. You'll need to set up a Python environment that meets the requirements laid out in environment.yaml. There are several package managers built around conda, including the more efficient reimplementation mamba. Pick your favorite and use its documentation to set up an environment. To set up the environment, run a command similar to the following:
From the parent directory...
| Package Manager | Command |
|---|---|
| conda (Miniconda/Anaconda) | conda env create -n [env name] -f environment.yaml |
| mamba | mamba create -n [env name] -f environment.yaml |
The choice of what package manager to use is up to you.
The commands to activate your environment are as follows:
| Package Manager | Command |
|---|---|
| conda (Miniconda/Anaconda) | conda activate [env name] |
| mamba | mamba activate [env name] |
Please follow the instructions provided by comments in config.yaml. You can directly edit and use this file if you wish, but we highly recommend filling out a copy. This way, you can save the configs for each experiment and refer to them later (or run several experiments at once).
Unless you'd like to customize the exact output filenames and directories, the following config fields matter the most:
The following fields are required:
- data_dir_out: The output directory to save to.
- data_dir_in: The input directory to read from.
- data_video: A boolean (True/False) specifying whether to process video data (True) or image data (False).
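Before launching a run, it can help to check that the required fields are filled in. The sketch below assumes the configfile has already been loaded into a dict (for instance with a YAML parser). The helper `check_required_fields` is hypothetical and not part of the repository.

```python
REQUIRED_FIELDS = ("data_dir_in", "data_dir_out", "data_video")


def check_required_fields(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for field in REQUIRED_FIELDS:
        if config.get(field) in (None, ""):
            problems.append(f"missing required field: {field}")
    value = config.get("data_video")
    if value is not None and not isinstance(value, bool):
        problems.append("data_video must be a boolean (True/False)")
    return problems


good = {"data_dir_in": "data/in", "data_dir_out": "data/out", "data_video": False}
bad = {"data_dir_in": "data/in", "data_video": "yes"}
print(check_required_fields(good))
print(check_required_fields(bad))
```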
The following fields are optional and either have default (recommended) values already in the configfile or are blank (fully optional):
- dt_gt_file and dt_filtered_out_file: If you're running image data with ground truth data, you can find and filter detections by IoU (Intersection over Union) with the ground truth detections.
- fs_stage1_out_file: If supplied, frame sampling saves an additional output after its first stage.
- lca_separate_viewpoints: Specifies whether to split and save annotation files by viewpoint or to save them all together. In video mode, this MUST be True!
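For readers unfamiliar with IoU, the sketch below computes it for two axis-aligned boxes and uses it to keep only detections that overlap some ground truth box. The `(x1, y1, x2, y2)` box format, the 0.5 cutoff, and the function names are assumptions. The pipeline's own detection filtering may use a different format or threshold.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_by_ground_truth(detections, gt_boxes, min_iou=0.5):
    """Keep detections whose best IoU with any ground truth box meets min_iou."""
    return [d for d in detections if any(iou(d, g) >= min_iou for g in gt_boxes)]


detections = [(0, 0, 10, 10), (100, 100, 110, 110)]
gt_boxes = [(1, 1, 10, 10)]
matched = filter_by_ground_truth(detections, gt_boxes)
print(matched)
```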
To run the pipeline, you'll execute snakefile.smk. Remember: the pipeline does NOT run postprocessing. This is run separately.
Please run the snakefile from the parent directory in this repository. For more information on how to run a snakefile (e.g. available flags), please view the Snakemake Docs. The most important flags you'll need to specify are as follows:
| Flag | Function |
|---|---|
| -s | The path to the snakefile, which should be snakefile.smk |
| --cores | The number of CPU cores you'd like to run on. |
| --configfile | The path to the configfile you're using. Defaults to config.yaml if not provided. |
Putting it together, your command will look like the following:
snakemake -s snakefile.smk --cores 1 --configfile path/to/your_config.yaml
Note that your configfile can be supplied by any filepath, relative or absolute.
As long as you use separate config files between executions, it is safe to run several processes simultaneously.
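If you keep one configfile per experiment, the commands can be generated programmatically. The sketch below only builds and prints the commands using the flags documented above. The config paths are hypothetical, and actually launching the processes (for example with `subprocess`) is left commented out.

```python
import shlex


def snakemake_command(configfile, cores=1):
    """Build the documented snakemake invocation for one configfile."""
    return [
        "snakemake",
        "-s", "snakefile.smk",
        "--cores", str(cores),
        "--configfile", configfile,
    ]


# Hypothetical per-experiment configfiles.
configs = ["configs/experiment_a.yaml", "configs/experiment_b.yaml"]
commands = [snakemake_command(path) for path in configs]

for command in commands:
    print(shlex.join(command))
    # To launch for real, run from the repository's parent directory:
    # subprocess.Popen(command)
```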
Sometimes you won't be able to execute the snakefile and you'll get an error telling you to unlock the DAG (the workflow represented as a directed acyclic graph). This may happen if the process stops unexpectedly (such as timing out on an HPC cluster) and no error is reported back to the snakefile. To solve this, run a command similar to the following:
snakemake -s snakefile.smk --unlock
Sometimes you don't want to run the full pipeline but rather just a specific algorithm step. There are two ways to do this:
We recommend executing specific algorithm components using their corresponding driver script in VAREID/drivers/ for the simplicity of user input and consistent logging with a pipeline execution. We highly recommend staying consistent with the formatting standards laid out by config.yaml! This way, it's extremely easy to switch between executing stages via the pipeline and separately.
Driver scripts require a configfile structured like config.yaml. Once again, your configfile can be supplied by any filepath, relative or absolute.
Since the pipeline was installed as a module, you can easily execute the driver script through this module. No matter what directory you execute from, the path to the driver script will be the same (and relative to VAREID).
python -m VAREID.drivers.[driver_script] --config_path path/to/your_config.yaml
Notice that we didn't include the .py extension on the driver. This is because we're referencing it as a module. Think of this like an import statement, import VAREID.drivers.[driver_script], but you're executing it as a script.
If you don't have a full configfile filled out or would rather not rely on it, you can directly execute each algorithm component using its executable script. Each algorithm component has a separate set of parameters documented with argparse. Please follow these parameters for your desired component and supply the necessary paths, flags, etc.
Here is an example on how to run frame_sampling.py:
python -m VAREID.algo.frame_sampling.frame_sampling \
path/to/ia_filtered_annots.json \
path/to/fs_annots.json \
--json_stage1 path/to/stage1_fs_annots.json
Postprocessing is not run by the pipeline because it requires human interaction to resolve conflicts. To run postprocessing, you can use a driver (see Running the Postprocessing Step above). This driver runs the postprocessing script, waits for a SQLite database file to be created, and then opens a GUI. The GUI checks the database file until conflicts are posted. Your job is to resolve these.
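The "wait for the database file" behaviour is a plain polling loop. The sketch below shows the general pattern only and is not the driver's actual code. The function name, file name, and timing values are hypothetical.

```python
import tempfile
import time
from pathlib import Path


def wait_for_file(path, poll_seconds=5.0, timeout=60.0):
    """Poll until `path` exists; return False if the timeout elapses first."""
    deadline = time.monotonic() + timeout
    while not path.exists():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "conflicts.db"  # hypothetical file name
    found_before = wait_for_file(db_path, poll_seconds=0.05, timeout=0.2)
    db_path.touch()  # stand-in for the postprocessing script creating the file
    found_after = wait_for_file(db_path, poll_seconds=0.05, timeout=0.2)

print(found_before, found_after)
```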
To run the postprocessing driver, use the following:
python -m VAREID.drivers.post_driver --config_path path/to/your_config.yaml
Wait for a prompt to open a web browser. This is the GUI. Once opened, you'll see a screen similar to the following:
Use the GUI to resolve all conflicts. It will constantly refresh to check whether conflicts have been saved to the database file. Once all conflicts are resolved, the postprocessing script will end and automatically close the GUI.
When working with large datasets with many conflicts to resolve, you may have to stop filling out conflicts and come back later. All conflicts and their resolution status are saved to the database file, which is not reset on a new call to post_driver.py. Thus, you can simply rerun the driver and pick up where you left off.
If your output formatting is inconsistent with the pipeline, you'll need to manually execute two scripts found in VAREID/algo/postprocessing/: postprocessing.py and gui.py. Please check their argparse parameters for more details.
You will need to execute postprocessing.py first and wait until it blocks on user input. For the database (GUI) method, this will look like the following:
Still waiting for cluster pair 1 - 0 - Checking again in 5 seconds...
At this point, start up gui.py.
Currently, this pipeline works best on Grévy's zebras and Plains zebras. However, it can be used for other species.
Every algorithm component in the pipeline has a configfile with some internal variables controlling how the script runs. These parameters can be coefficients, flags, labels, etc. YOU SHOULD RARELY NEED TO CHANGE THESE PARAMETERS!
All config files are found in VAREID/algo within their corresponding algorithm component subfolders. Please refer to any comments in these files before making changes.
To modify the species labels the pipeline will classify into, modify the following fields:
- custom_labels in species_identifier_config.yaml: This is a list of the species to classify annotations into. This is CLIP-based, so you can format the names any way you wish.
- filtered_classes in viewpoint_classifier_config.yaml: This is a list of the species to generate viewpoint classifications for. This should match custom_labels (the first item).
Please see the documentation in these scripts, which is usually done via argparse.
- National Science Foundation (NSF) funded AI institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE) (OAC 2112606).
- Imageomics Institute (A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning) is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under Award (OAC 2118240).
- Support from Rensselaer Polytechnic Institute (RPI).
- Support from Finnish Cultural Foundation.
- Resources from Ohio Supercomputer Center made it possible to train and test algorithmic components.
Ankit K. Upadhyay, Ekaterina Nepovinnykh, S. M. Rayeed, Aidan Westphal, Lawrence Miao, Julian Bain, Jaeseok Kang, Tuomas Eerola, Heikki Kälviäinen, Charles V. Stewart. Animal Re-Identification via Multiview Spatio-Temporal Track Clustering. Rensselaer Polytechnic Institute, LUT University, Brno University of Technology, CV4Animals, CVPR 2025.
