Curate AnnData based on the CELLxGENE schemaΒΆ

This guide shows how to curate an AnnData object with the help of laminlabs/cellxgene and the CELLxGENE schema v5.1.0.

Load your instance to register the curated AnnData:

# !pip install 'lamindb[bionty,jupyter]' cellxgene-lamin cellxgene-schema
!lamin init --storage ./test-cellxgene-curate --schema bionty
Hide code cell output
❗ Full backed capabilities are not available for this version of anndata, please install anndata>=0.9.1.
πŸ’‘ connected lamindb: testuser1/test-cellxgene-curate
import lamindb as ln
import bionty as bt
import cellxgene_lamin as cxg
πŸ’‘ connected lamindb: testuser1/test-cellxgene-curate
❗ Full backed capabilities are not available for this version of anndata, please install anndata>=0.9.1.

Let’s start with an AnnData object that we’d like to inspect and curate:

adata = cxg.datasets.anndata_human_immune_cells(populate_registries=True)
adata.write_h5ad("anndata_human_immune_cells.h5ad")
adata
Hide code cell output
AnnData object with n_obs Γ— n_vars = 1626 Γ— 36503
    obs: 'donor', 'tissue', 'cell_type', 'assay', 'sex_ontology_term_id', 'organism'
    var: 'feature_is_filtered'
    uns: 'default_embedding'
    obsm: 'X_umap'
!cellxgene-schema validate anndata_human_immune_cells.h5ad
Hide code cell output
Loading dependencies
Loading validator modules
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: Add labels error: Column 'cell_type' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'assay' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'organism' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'tissue' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: 'title' in 'uns' is not present.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'raw.var'.
ERROR: Dataframe 'obs' is missing column 'cell_type_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'assay_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'disease_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'organism_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'tissue_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'self_reported_ethnicity_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'development_stage_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'is_primary_data'.
ERROR: Dataframe 'obs' is missing column 'donor_id'.
ERROR: Dataframe 'obs' is missing column 'suspension_type'.
ERROR: Dataframe 'obs' is missing column 'tissue_type'.
Validation complete in 0:00:00.312469 with status is_valid=False

Validate and curate metadataΒΆ

Validate the AnnData object:

curate = cxg.Curate(adata, organism="human")
Hide code cell output
βœ… added 1 record with Feature.name for columns: 'sex_ontology_term_id'
βœ… added 4 records from laminlabs/cellxgene with Feature.name for columns: 'assay', 'cell_type', 'tissue', 'organism'
❗ 1 non-validated categories are not saved in Feature.name: ['donor']!
      β†’ to lookup categories, use lookup().columns
      β†’ to save, run add_new_from_columns
βœ… added 230 records from laminlabs/cellxgene with Gene.ensembl_gene_id for var_index: 'ENSG00000112096', 'ENSG00000182230', 'ENSG00000203441', 'ENSG00000203812', 'ENSG00000204092', 'ENSG00000214970', 'ENSG00000215067', 'ENSG00000215271', 'ENSG00000221995', 'ENSG00000223458', 'ENSG00000223797', 'ENSG00000224167', 'ENSG00000224247', 'ENSG00000224739', 'ENSG00000224745', 'ENSG00000225205', 'ENSG00000225932', 'ENSG00000226277', 'ENSG00000226362', 'ENSG00000226377', ...

Let’s fix the β€œdonor_id” column name:

adata.obs.rename(columns={"donor": "donor_id"}, inplace=True)
validated = curate.validate()
❌ missing required obs columns development_stage, disease, self_reported_ethnicity, suspension_type, tissue_type
πŸ’‘ consider initializing a Curate object like 'Curate(adata, defaults=cxg.CellxGeneFields.OBS_FIELD_DEFAULTS)'to automatically add these columns with default values.

For the missing columns, we can pass default values suggested from CELLxGENE:

cxg.CellxGeneFields.OBS_FIELD_DEFAULTS
Hide code cell output
{'assay': 'unknown',
 'cell_type': 'native_cell',
 'development_stage': 'unknown',
 'disease': 'normal',
 'donor_id': 'na',
 'self_reported_ethnicity': 'unknown',
 'sex': 'unknown',
 'suspension_type': 'cell',
 'tissue_type': 'tissue',
 'organism': 'unknown'}
curate = cxg.Curate(adata, defaults=cxg.CellxGeneFields.OBS_FIELD_DEFAULTS, organism="human")
Hide code cell output
πŸ’‘ added defaults to the AnnData object: {'development_stage': 'unknown', 'disease': 'normal', 'self_reported_ethnicity': 'unknown', 'suspension_type': 'cell', 'tissue_type': 'tissue'}
βœ… added 6 records from laminlabs/cellxgene with Feature.name for columns: 'development_stage', 'disease', 'donor_id', 'self_reported_ethnicity', 'tissue_type', 'suspension_type'
validated = curate.validate()
validated
Hide code cell output
πŸ’‘ validating metadata using registries of instance laminlabs/cellxgene
βœ… var_index is validated against Gene.ensembl_gene_id
πŸ’‘ mapping assay on ExperimentalFactor.name
❗    found 3 terms validated terms: ["10x 3' v3", "10x 5' v2", "10x 5' v1"]
      β†’ save terms via .add_validated_from('assay')
βœ… assay is validated against ExperimentalFactor.name
βœ… cell_type is validated against CellType.name
πŸ’‘ mapping development_stage on DevelopmentalStage.name
❗    found 1 terms validated terms: ['unknown']
      β†’ save terms via .add_validated_from('development_stage')
βœ… development_stage is validated against DevelopmentalStage.name
πŸ’‘ mapping disease on Disease.name
❗    found 1 terms validated terms: ['normal']
      β†’ save terms via .add_validated_from('disease')
βœ… disease is validated against Disease.name
πŸ’‘ mapping donor_id on ULabel.name
❗    12 terms are not validated: 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1'
      β†’ save terms via .add_new_from('donor_id')
πŸ’‘ mapping self_reported_ethnicity on Ethnicity.name
❗    found 1 terms validated terms: ['unknown']
      β†’ save terms via .add_validated_from('self_reported_ethnicity')
βœ… self_reported_ethnicity is validated against Ethnicity.name
πŸ’‘ mapping sex_ontology_term_id on Phenotype.ontology_id
❗    found 1 terms validated terms: ['PATO:0000384']
      β†’ save terms via .add_validated_from('sex_ontology_term_id')
βœ… sex_ontology_term_id is validated against Phenotype.ontology_id
πŸ’‘ mapping suspension_type on ULabel.name
❗    found 1 terms validated terms: ['cell']
      β†’ save terms via .add_validated_from('suspension_type')
βœ… suspension_type is validated against ULabel.name
πŸ’‘ mapping tissue on Tissue.name
❗    found 16 terms validated terms: ['blood', 'thoracic lymph node', 'spleen', 'mesenteric lymph node', 'lamina propria', 'liver', 'jejunal epithelium', 'omentum', 'bone marrow', 'ileum', 'caecum', 'thymus', 'skeletal muscle tissue', 'duodenum', 'sigmoid colon', 'transverse colon']
      β†’ save terms via .add_validated_from('tissue')
❗    1 terms is not validated: 'lungg'
      β†’ save terms via .add_new_from('tissue')
πŸ’‘ mapping tissue_type on ULabel.name
❗    found 1 terms validated terms: ['tissue']
      β†’ save terms via .add_validated_from('tissue_type')
βœ… tissue_type is validated against ULabel.name
βœ… organism is validated against Organism.name
False

Register new metadata labelsΒΆ

Following the suggestions above to register genes and labels that aren’t present in the current instance:

(Note that our instance is rather empty. Once you filled up the registries, registering new labels won’t be frequently needed)

curate.add_validated_from("all")
Hide code cell output
πŸ’‘ saving labels for 'assay'
βœ… added 3 records from laminlabs/cellxgene with ExperimentalFactor.name for assay: '10x 5' v1', '10x 5' v2', '10x 3' v3'
πŸ’‘ saving labels for 'cell_type'
πŸ’‘ saving labels for 'development_stage'
βœ… added 1 record from laminlabs/cellxgene with DevelopmentalStage.name for development_stage: 'unknown'
πŸ’‘ saving labels for 'disease'
βœ… added 1 record from laminlabs/cellxgene with Disease.name for disease: 'normal'
πŸ’‘ saving labels for 'donor_id'
❗ 12 non-validated categories are not saved in ULabel.name: ['D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1']!
      β†’ to lookup categories, use lookup().donor_id
      β†’ to save, run .add_new_from('donor_id')
πŸ’‘ saving labels for 'self_reported_ethnicity'
βœ… added 1 record from laminlabs/cellxgene with Ethnicity.name for self_reported_ethnicity: 'unknown'
πŸ’‘ saving labels for 'sex_ontology_term_id'
βœ… added 1 record from laminlabs/cellxgene with Phenotype.ontology_id for sex_ontology_term_id: 'PATO:0000384'
πŸ’‘ saving labels for 'suspension_type'
βœ… added 1 record from laminlabs/cellxgene with ULabel.name for suspension_type: 'cell'
πŸ’‘ saving labels for 'tissue'
❗ 1 non-validated categories are not saved in Tissue.name: ['lungg']!
      β†’ to lookup categories, use lookup().tissue
      β†’ to save, run .add_new_from('tissue')
βœ… added 16 records from laminlabs/cellxgene with Tissue.name for tissue: 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', 'mesenteric lymph node', 'caecum', 'omentum', 'blood', 'ileum', 'thoracic lymph node'
πŸ’‘ saving labels for 'tissue_type'
βœ… added 1 record from laminlabs/cellxgene with ULabel.name for tissue_type: 'tissue'
πŸ’‘ saving labels for 'organism'

For donors, we register the new labels:

curate.add_new_from("donor_id")
Hide code cell output
βœ… added 12 records with ULabel.name for donor_id: 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1'

An error is shown for the tissue label β€œlungg”, which is a typo, should be β€œlung”. Let’s fix it:

tissues = curate.lookup().tissue
# using a lookup object to find the correct term
tissues.lung
Hide code cell output
Tissue(uid='7Tt4iEKc', name='lung', ontology_id='UBERON:0002048', synonyms='pulmo', description='Respiration Organ That Develops As An Outpocketing Of The Esophagus.', created_by_id=1, source_id=47, updated_at='2024-01-08 15:22:49 UTC')
adata.obs["tissue"] = adata.obs["tissue"].cat.rename_categories(
    {"lungg": tissues.lung.name}
)
curate.add_validated_from("tissue")
Hide code cell output
βœ… added 1 record from laminlabs/cellxgene with Tissue.name for tissue: 'lung'

Let’s validate the object again:

validated = curate.validate()
validated
Hide code cell output
πŸ’‘ validating metadata using registries of instance laminlabs/cellxgene
βœ… var_index is validated against Gene.ensembl_gene_id
βœ… assay is validated against ExperimentalFactor.name
βœ… cell_type is validated against CellType.name
βœ… development_stage is validated against DevelopmentalStage.name
βœ… disease is validated against Disease.name
βœ… donor_id is validated against ULabel.name
βœ… self_reported_ethnicity is validated against Ethnicity.name
βœ… sex_ontology_term_id is validated against Phenotype.ontology_id
βœ… suspension_type is validated against ULabel.name
βœ… tissue is validated against Tissue.name
βœ… tissue_type is validated against ULabel.name
βœ… organism is validated against Organism.name
True
adata.obs.head()
Hide code cell output
donor_id tissue cell_type assay sex_ontology_term_id organism development_stage disease self_reported_ethnicity suspension_type tissue_type
CZINY-0109_CTGGTCTAGTCTGTAC D496-1 blood classical monocyte 10x 3' v3 PATO:0000384 human unknown normal unknown cell tissue
CZI-IA10244332+CZI-IA10244434_CCTTCGACATACTCTT 621B-1 thoracic lymph node T follicular helper cell 10x 5' v2 PATO:0000384 human unknown normal unknown cell tissue
Pan_T7935491_CTGGTCTGTACATGTC A29-1 spleen memory B cell 10x 5' v1 PATO:0000384 human unknown normal unknown cell tissue
Pan_T7980367_GGGCATCCAGGTGGAT A36-1 lung alveolar macrophage 10x 5' v1 PATO:0000384 human unknown normal unknown cell tissue
Pan_T7935494_ATCATGGTCTACCTGC A29-1 mesenteric lymph node naive thymus-derived CD4-positive, alpha-beta ... 10x 5' v1 PATO:0000384 human unknown normal unknown cell tissue

Save artifactΒΆ

artifact = curate.save_artifact(description="test h5ad file")
Hide code cell output
❗ no run & transform get linked, consider calling ln.track()
πŸ’‘ path content will be copied to default storage upon `save()` with key `None` ('.lamindb/AzZUNe7cnxpfmLIaLEZU.h5ad')
βœ… storing artifact 'AzZUNe7cnxpfmLIaLEZU' at '/home/runner/work/cellxgene-lamin/cellxgene-lamin/docs/test-cellxgene-curate/.lamindb/AzZUNe7cnxpfmLIaLEZU.h5ad'
πŸ’‘ you can auto-track these data as a run input by calling `ln.track()`
πŸ’‘ parsing feature names of X stored in slot 'var'
βœ…    36503 terms (100.00%) are validated for ensembl_gene_id
βœ…    linked: FeatureSet(uid='jjDWb1d9DkLnB8cByJ3u', n=36503, dtype='float', registry='bionty.Gene', hash='xtVNbbhs3ty63qs-rwKZHA', created_by_id=1)
πŸ’‘ parsing feature names of slot 'obs'
βœ…    11 terms (100.00%) are validated for name
βœ…    linked: FeatureSet(uid='oWMUwNpcCU3pW0Ae7uM8', n=11, registry='Feature', hash='GfO-W_OdFI0ReY7nrpjaQw', created_by_id=1)
βœ… saved 2 feature sets for slots: 'var','obs'
artifact.describe()
Hide code cell output
Artifact(uid='AzZUNe7cnxpfmLIaLEZU', description='test h5ad file', suffix='.h5ad', type='dataset', _accessor='AnnData', size=54727155, hash='XppipaxP9Y03mQBGcMpgTI', _hash_type='sha1-fl', n_observations=1626, visibility=1, _key_is_virtual=True, updated_at='2024-08-02 16:10:49 UTC')
  Provenance
    .created_by = 'testuser1'
    .storage = '/home/runner/work/cellxgene-lamin/cellxgene-lamin/docs/test-cellxgene-curate'
  Labels
    .organisms = 'human'
    .tissues = 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', ...
    .cell_types = 'classical monocyte', 'T follicular helper cell', 'memory B cell', 'alveolar macrophage', 'naive thymus-derived CD4-positive, alpha-beta T cell', 'effector memory CD8-positive, alpha-beta T cell, terminally differentiated', 'alpha-beta T cell', 'CD4-positive helper T cell', 'naive thymus-derived CD8-positive, alpha-beta T cell', 'macrophage', ...
    .diseases = 'normal'
    .phenotypes = 'male'
    .experimental_factors = '10x 5' v1', '10x 5' v2', '10x 3' v3'
    .developmental_stages = 'unknown'
    .ethnicities = 'unknown'
    .ulabels = 'cell', 'tissue', 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', ...
  Features
    'assay' = '10x 5' v1', '10x 5' v2', '10x 3' v3'
    'cell_type' = 'classical monocyte', 'T follicular helper cell', 'memory B cell', 'alveolar macrophage', 'naive thymus-derived CD4-positive, alpha-beta T cell', 'effector memory CD8-positive, alpha-beta T cell, terminally differentiated', 'alpha-beta T cell', 'CD4-positive helper T cell', 'naive thymus-derived CD8-positive, alpha-beta T cell', 'macrophage', ...
    'development_stage' = 'unknown'
    'disease' = 'normal'
    'donor_id' = 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', ...
    'organism' = 'human'
    'self_reported_ethnicity' = 'unknown'
    'sex_ontology_term_id' = 'male'
    'suspension_type' = 'cell'
    'tissue' = 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', ...
    'tissue_type' = 'tissue'
  Feature sets
    'var' = 'MIR1302-2HG', 'FAM138A', 'OR4F5', 'None', 'OR4F29', 'OR4F16', 'LINC01409', 'FAM87B', 'LINC01128', 'LINC00115', 'FAM41C'
    'obs' = 'assay', 'cell_type', 'tissue', 'organism', 'sex_ontology_term_id', 'development_stage', 'disease', 'donor_id', 'self_reported_ethnicity', 'tissue_type', 'suspension_type'

The below is optional – it mimics the way cellxgene creates collections of AnnData objects to link them to studies.

# register a new collection
title = "Cross-tissue immune cell analysis reveals tissue-specific features in humans (for test demo only)"
collection = ln.Collection(
    [artifact],  # registered artifact above, can also pass a list of artifacts
    name=title,  # title of the publication
    description="10.1126/science.abl5197",  # DOI of the publication
    reference="E-MTAB-11536",  # accession number (e.g. GSE#, E-MTAB#, etc.)
    reference_type="ArrayExpress",  # source type (e.g. GEO, ArrayExpress, SRA, etc.)
).save()
Hide code cell output
❗ no run & transform get linked, consider calling ln.track()
πŸ’‘ you can auto-track these data as a run input by calling `ln.track()`

Return an input h5ad file for cellxgene-schemaΒΆ

adata_cxg = curate.to_cellxgene_anndata(is_primary_data=True, title=title)
adata_cxg
Hide code cell output
AnnData object with n_obs Γ— n_vars = 1626 Γ— 36503
    obs: 'donor_id', 'sex_ontology_term_id', 'suspension_type', 'tissue_type', 'tissue_ontology_term_id', 'cell_type_ontology_term_id', 'assay_ontology_term_id', 'organism_ontology_term_id', 'development_stage_ontology_term_id', 'disease_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'is_primary_data'
    var: 'feature_is_filtered'
    uns: 'default_embedding', 'title', 'cxg_lamin_schema_reference', 'cxg_lamin_schema_version'
    obsm: 'X_umap'
adata_cxg.write_h5ad("anndata_human_immune_cells_cxg.h5ad")
!cellxgene-schema validate anndata_human_immune_cells_cxg.h5ad
Hide code cell output
Loading dependencies
Loading validator modules
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'raw.var'.
Validation complete in 0:00:03.199391 with status is_valid=False

Note

The Curate class is designed to validate all metadata for adherence to ontologies. It does not reimplement all rules of the cellxgene schema and we therefore recommend running the cellxgene-schema if full adherence beyond metadata is a necessity.