Usage
This script downloads immunofluorescent (IF) labeled images from the Human Protein Atlas (HPA). The tool requires an output directory to write results to and either a TSV file in CM4AI RO-Crate format, or a CSV file listing the IF images to download and a CSV file of unique samples.
In a project
To use cellmaps_imagedownloader in a project:
import cellmaps_imagedownloader
On the command line
For information invoke cellmaps_imagedownloadercmd.py -h
Usage
cellmaps_imagedownloadercmd.py OUTPUT_DIRECTORY [--provenance PROVENANCE_PATH] [OPTIONS]
Arguments
outdir
The directory where the output will be written to.
Required
--provenance PROVENANCE_PATH
Path to file containing provenance information about input files in JSON format.
Optional, but one of the `samples`, `cm4ai_table`, `protein_list` or `cell_line` parameters is required
--samples SAMPLES_PATH
CSV file with the list of IF images to download. The file follows a specific format with columns such as filename, if_plate_id, position, sample, locations, antibody, ensembl_ids, and gene_names.
--protein_list
List of proteins for which HPA images will be downloaded, with each protein on a new line.
--cell_line
Cell line for which HPA images will be downloaded. See available cell lines at https://www.proteinatlas.org/humanproteome/cell+line.
--cm4ai_table CM4AI_TABLE_PATH
Path to TSV file in CM4AI RO-Crate directory.
Optional
--unique UNIQUE_PATH
(Deprecated: using the --samples flag alone is enough) CSV file of unique samples. The file should have columns like antibody, ensembl_ids, gene_names, atlas_name, locations, and n_location.
--proteinatlasxml
URL or path to a proteinatlas.xml or proteinatlas.xml.gz file.
--fake_images
If set, the first image of each color is downloaded, and subsequent images are copies of those images. If the --cm4ai_table flag is set, the --fake_images flag is ignored.
--poolsize
If using the multiprocessing image downloader, this sets the number of concurrent downloads to run.
--imgsuffix
Suffix for images to download (default is .jpg).
--skip_existing
If set, skips the download if the image already exists and has a size greater than 0 bytes.
--skip_failed
If set, ignores images that failed to download after retries.
--logconf
Path to the Python logging configuration file.
--skip_logging
If set, certain log files will not be created.
--verbose, -v
Increases verbosity of logger to standard error for log messages.
--version
Shows the current version of the tool.
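Because the --samples file is expected to follow a specific column layout, it can help to check the header before starting a download. The following is a minimal sketch that is not part of cellmaps_imagedownloader: the helper name missing_sample_columns is hypothetical, and the column list is taken only from the --samples description above, which says "such as" and so may not be exhaustive.

```python
import csv

# Columns named in the --samples description (assumption: the tool may
# accept or require additional columns not listed in this document).
EXPECTED_COLUMNS = [
    "filename", "if_plate_id", "position", "sample",
    "locations", "antibody", "ensembl_ids", "gene_names",
]


def missing_sample_columns(csv_path):
    """Return the expected column names that are absent from the CSV header."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        # fieldnames is None for an empty file, so fall back to an empty list.
        header = reader.fieldnames or []
    return [col for col in EXPECTED_COLUMNS if col not in header]
```

An empty return value means every column listed above is present in the header; it does not validate the contents of the rows.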
Example usage
The example file can be downloaded from cm4ai.org. Go to Products -> Data, log in, and download the file for IF images with the desired treatment, then unpack the tar.gz (tar -xzvf filename.tar.gz).
cellmaps_imagedownloadercmd.py ./cellmaps_imagedownloader_outdir --cm4ai_table path/to/downloaded/unpacked/dir --provenance examples/provenance.json
Alternatively, use the files in the example directory in the repository:
samples file: CSV file with list of IF images to download (see sample samples file in examples folder)
unique file: CSV file of unique samples (see sample unique file in examples folder)
provenance: file containing provenance information about input files in JSON format (see sample provenance file in examples folder)
cellmaps_imagedownloadercmd.py ./cellmaps_imagedownloader_outdir --samples examples/samples.csv --unique examples/unique.csv --provenance examples/provenance.json
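When the command is launched from another Python script, the argument list can be assembled programmatically. The sketch below is an illustration under assumptions, not the package's API: build_command is a hypothetical helper, it uses only the flags documented above, and it enforces the stated rule that one of samples, cm4ai_table, protein_list or cell_line must be given. Whether the tool accepts several of these at once is not stated here, so the helper does not forbid it.

```python
import shlex

# Input-source flags listed in the Arguments section.
INPUT_FLAGS = ("samples", "cm4ai_table", "protein_list", "cell_line")


def build_command(outdir, provenance=None, **inputs):
    """Build an argument list for cellmaps_imagedownloadercmd.py (hypothetical helper)."""
    unknown = set(inputs) - set(INPUT_FLAGS)
    if unknown:
        raise ValueError(f"unsupported input options: {sorted(unknown)}")
    given = {name: inputs[name] for name in INPUT_FLAGS if inputs.get(name)}
    if not given:
        raise ValueError("one of " + ", ".join(INPUT_FLAGS) + " is required")
    cmd = ["cellmaps_imagedownloadercmd.py", str(outdir)]
    for name, value in given.items():
        cmd += [f"--{name}", str(value)]
    if provenance:
        cmd += ["--provenance", str(provenance)]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "./cellmaps_imagedownloader_outdir",
        provenance="examples/provenance.json",
        samples="examples/samples.csv",
    )
    # Print a shell-quoted form; the list itself can be passed to subprocess.run.
    print(shlex.join(cmd))
```

Returning a list rather than a single string avoids shell-quoting problems with paths that contain spaces when the result is handed to subprocess.run(cmd, check=True).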
Via Docker
Example usage
Coming soon...