Data Troubleshooting

This page covers common issues that arise when writing or converting data for use with Vitessce.

AnnData-Zarr paths

When an AnnData object is written to a Zarr store (e.g., via adata.write_zarr), the columns and keys in the original object (e.g., adata.obs["leiden"] or adata.obsm["X_umap"]) become relative POSIX-style paths (e.g., obs/leiden and obsm/X_umap) in the Zarr store.
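The mapping above can be sketched as a small helper. The function name zarr_path is hypothetical and is not part of AnnData or Vitessce; it only illustrates how an attribute name and a key combine into the relative path inside the store.

```python
import posixpath


def zarr_path(attribute, key):
    """Hypothetical helper: build the relative POSIX-style path at which
    adata.<attribute>[key] is expected inside an AnnData-Zarr store."""
    return posixpath.join(attribute, key)


print(zarr_path("obs", "leiden"))   # obs/leiden
print(zarr_path("obsm", "X_umap"))  # obsm/X_umap
```

These relative paths are the values to use wherever a Vitessce configuration asks for a path into the AnnData-Zarr store.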

AnnData-Zarr obsFeatureMatrix chunking strategy

A benefit of the Zarr format is that arrays can be chunked and stored in small pieces. Vitessce takes advantage of this chunking to load only the subset of the obsFeatureMatrix that each visualization requires.

For instance, if a gene is selected to color the points in the scatterplot or spatial views, we only load the chunks containing the gene of interest.

However, a poor chunking strategy (e.g., each chunk containing too many genes) can reduce the efficiency of this approach and result in too much data being requested when a gene is selected.

A chunks argument can be passed to the AnnData write_zarr method to resolve this:

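# ...
VAR_CHUNK_SIZE = 10  # number of genes (var) per chunk; keep this small
# Each chunk spans all observations (rows) but only VAR_CHUNK_SIZE genes (columns).
adata.write_zarr(out_path, chunks=(adata.shape[0], VAR_CHUNK_SIZE))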
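To see why the number of genes per chunk matters, the rough estimate below computes how much uncompressed data must be fetched to read a single gene column. It is a back-of-the-envelope sketch, not Vitessce code: the function name, the matrix size, and the 4-byte (float32) item size are assumptions chosen for illustration.

```python
import math


def bytes_read_for_one_gene(n_obs, chunks, itemsize=4):
    """Estimate the uncompressed bytes fetched to read one gene column.

    Every chunk that intersects the column must be loaded in full,
    so the cost grows with the number of genes stored per chunk.
    """
    chunk_obs, chunk_var = chunks
    n_chunks = math.ceil(n_obs / chunk_obs)  # chunks stacked along the obs axis
    return n_chunks * chunk_obs * chunk_var * itemsize


n_obs = 100_000  # hypothetical number of cells
narrow = bytes_read_for_one_gene(n_obs, (n_obs, 10))     # 10 genes per chunk
wide = bytes_read_for_one_gene(n_obs, (n_obs, 1_000))    # 1,000 genes per chunk

print(narrow)  # 4000000
print(wide)    # 400000000
```

Under these assumptions, selecting one gene costs about 4 MB with 10 genes per chunk and about 400 MB with 1,000 genes per chunk, before compression.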

Zarr dtypes

Vitessce uses Zarr.js to load Zarr data. Zarr.js currently supports only a subset of NumPy data types, so ensure that the types used in the arrays and data frames of your AnnData store are supported. If they are not, cast them using the astype method of the NumPy array or pandas Series/DataFrame. In addition to the Zarr.js data types, Vitessce supports loading AnnData string columns with vlen-utf8 or |O types.

To automatically do this casting for AnnData objects, the vitessce Python package provides the optimize_adata function:

from vitessce.data_utils import optimize_adata
# ...
adata = optimize_adata(adata)
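For cases where a manual cast is preferred, the sketch below narrows an integer array with NumPy's astype. It assumes, purely for illustration, that a 64-bit integer array needs to be stored as 32-bit; consult the Zarr.js documentation for the data types that are actually supported. The helper name downcast_int64 is hypothetical.

```python
import numpy as np


def downcast_int64(arr):
    """Cast an int64 array to int32, refusing if any value would overflow."""
    if arr.dtype != np.int64:
        return arr  # nothing to do for other dtypes
    limits = np.iinfo(np.int32)
    if arr.size and (arr.min() < limits.min or arr.max() > limits.max):
        raise OverflowError("values do not fit in int32")
    return arr.astype(np.int32)


counts = np.array([0, 5, 12], dtype=np.int64)
counts32 = downcast_int64(counts)
print(counts32.dtype)  # int32
```

Checking the value range before casting avoids silent wrap-around, which astype would otherwise allow.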
# ...

OME-NGFF

Supported versions

Vitessce currently supports up to OME-NGFF spec v0.4.

Supported features

Vitessce supports OME-NGFF images saved as Zarr stores and a subset of OME-NGFF features via the image.ome-zarr file type. The following table lists the support for different OME-NGFF features:

Feature                                          | Supported by Vitessce
Downsampling along Z axis                        | N
omero field                                      | Y
multiscales with a scaling factor other than 2   | N
URL (not only S3)                                | Y
3D view                                          | Y
labels                                           | N
HCS plate                                        | N

To compare Vitessce to other OME-NGFF clients, see the table listing the OME-NGFF features supported by other clients. We welcome feature requests or pull requests to add support for the remaining features to Vitessce.

Metadata requirements

The omero metadata field must be present, and it must contain both the omero.channels and omero.rdefs fields. Vitessce uses this metadata to determine the initial rendering settings.
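A quick check for these requirements could look like the sketch below. It operates on a plain dictionary of image-group attributes, such as the parsed contents of the group's .zattrs file. The function name missing_omero_fields is hypothetical and is not part of Vitessce or ome-zarr-py.

```python
def missing_omero_fields(attrs):
    """Return the required omero fields that are absent from a dict
    of OME-NGFF image-group attributes."""
    omero = attrs.get("omero")
    if omero is None:
        return ["omero"]
    return [f"omero.{key}" for key in ("channels", "rdefs") if key not in omero]


attrs = {"omero": {"channels": []}}  # example attributes lacking rdefs
print(missing_omero_fields(attrs))  # ['omero.rdefs']
```

An empty list means that all of the fields named above are present. The sketch does not validate their contents.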

Z-axis chunking

Vitessce does not yet support chunking along the Z axis. When writing OME-Zarr stores, you may need to specify a chunks argument manually such that the Z axis only has 1 chunk.

The following example writes an image to a Zarr store using ome-zarr-py (ome-zarr==0.2.1):

import zarr
import numpy as np
from tifffile import imread
from ome_zarr import writer

my_image = imread("my_image.tif")
my_image = np.transpose(my_image, axes=(1, 0, 3, 2))  # zcxy to czyx

z_root = zarr.open_group("my_image.zarr", mode="w")

default_window = {
    "start": 0,
    "min": 0,
    "max": 65_535,  # may need to change depending on the numpy dtype of the my_image array
    "end": 65_535,  # may need to change depending on the numpy dtype of the my_image array
}

writer.write_image(
    image=my_image,
    group=z_root,
    axes="czyx",
    omero={
        "name": "My image",
        "version": "0.3",
        "rdefs": {},
        "channels": [
            {
                "label": f"Channel {i}",
                "color": "FFFFFF",  # may want to use a different color for each channel
                "window": default_window,
            }
            for i in range(my_image.shape[0])  # one entry per channel (axis 0 is c)
        ],
    },
    chunks=(1, 1, 256, 256),
)
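The comments in the example note that the window bounds depend on the array's dtype. One way to derive them, rather than hard-coding 65_535, is sketched below using np.iinfo. The helper name default_window_for is hypothetical, and the sketch assumes an integer image dtype; floating-point images would need bounds chosen from the data instead.

```python
import numpy as np


def default_window_for(dtype):
    """Build an omero channel window spanning the full range of an integer dtype."""
    dtype = np.dtype(dtype)
    if not np.issubdtype(dtype, np.integer):
        raise TypeError("this sketch only handles integer dtypes")
    limits = np.iinfo(dtype)
    return {
        "start": int(limits.min),
        "min": int(limits.min),
        "max": int(limits.max),
        "end": int(limits.max),
    }


print(default_window_for(np.uint16))
# {'start': 0, 'min': 0, 'max': 65535, 'end': 65535}
```

In the example above, default_window_for(my_image.dtype) could stand in for the hard-coded default_window dictionary.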