SBML annotator

sbmlutils provides functionality for annotating SBML models. Annotation is the process of adding metadata to the model and its components. The metadata mostly comes from biological ontologies or biological databases.

from sbmlutils.report import sbmlreport

Annotate existing model

In the first example, annotations from an Excel file are added to an existing model. The annotations listed below are applied to the model in ./annotations/demo.xml based on pattern matching.

Annotations are written for the given sbml_type for all SBML identifiers which match the given pattern.
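The matching step can be illustrated with the standard library alone. This is a minimal sketch, not the sbmlutils implementation: it assumes each pattern is a Python regular expression matched from the start of the identifier, and the helper `matching_ids` and the example identifiers are hypothetical.

```python
import re


def matching_ids(pattern, sids):
    """Return the identifiers matched by the regular expression `pattern`.

    Hypothetical helper for illustration; matching is assumed to start
    at the beginning of each identifier (re.match semantics).
    """
    regex = re.compile(pattern)
    return [sid for sid in sids if regex.match(sid)]


# Made-up parameter identifiers, not taken from demo.xml
parameter_ids = ["Km_A", "Keq_bA", "Vmax_v1", "scale_f"]

km_ids = matching_ids(r"^Km_\w+$", parameter_ids)
vmax_ids = matching_ids(r"^Vmax_\w+$", parameter_ids)
print(km_ids)
print(vmax_ids)
```

Under this assumption, anchoring a pattern with ^ and $ restricts it to whole identifiers, while an unanchored pattern also matches any identifier that merely starts with it.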

from sbmlutils.annotation.annotator import ModelAnnotator
df = ModelAnnotator.read_annotations_df("./annotations/demo_annotations.xlsx", format="xlsx")
df
    pattern       sbml_type    annotation_type  qualifier  resource         name
0   NaN           document     rdf              BQM_IS     sbo/SBO:0000293  non-spatial continuous framework
1   ^demo_\d+$    model        rdf              BQM_IS     go/GO:0008152    metabolic process
3   e             compartment  rdf              BQB_IS     sbo/SBO:0000290  physical compartment
4   e             compartment  rdf              BQB_IS     go/GO:0005615    extracellular space
5   e             compartment  rdf              BQB_IS     fma/FMA:70022    extracellular space
7   m             compartment  rdf              BQB_IS     sbo/SBO:0000290  physical compartment
8   m             compartment  rdf              BQB_IS     go/GO:0005886    plasma membrane
9   m             compartment  rdf              BQB_IS     fma/FMA:63841    plasma membrane
11  c             compartment  rdf              BQB_IS     sbo/SBO:0000290  physical compartment
12  c             compartment  rdf              BQB_IS     go/GO:0005623    cell
13  c             compartment  rdf              BQB_IS     fma/FMA:68646    cell
15  ^Km_\w+$      parameter    rdf              BQB_IS     sbo/SBO:0000027  Michaelis constant
16  ^Keq_\w+$     parameter    rdf              BQB_IS     sbo/SBO:0000281  equilibrium constant
17  ^Vmax_\w+$    parameter    rdf              BQB_IS     sbo/SBO:0000186  maximal velocity
19  ^\w{1}__A$    species      rdf              BQB_IS     sbo/SBO:0000247  simple chemical
20  ^\w{1}__B$    species      rdf              BQB_IS     sbo/SBO:0000247  simple chemical
21  ^\w{1}__C$    species      rdf              BQB_IS     sbo/SBO:0000247  simple chemical
22  ^\w{1}__\w+$  species      formula          NaN        C6H12O6          NaN
23  ^\w{1}__\w+$  species      charge           NaN        0                NaN
24  ^b\w{1}$      reaction     rdf              BQB_IS     sbo/SBO:0000185  transport reaction
25  ^v\w{1}$      reaction     rdf              BQB_IS     sbo/SBO:0000176  biochemical reaction
from sbmlutils.annotation.annotator import annotate_sbml_file

# annotate the existing SBML file and write the result to a new file
annotate_sbml_file(f_sbml="./annotations/demo.xml",
                   f_annotations="./annotations/demo_annotations.xlsx",
                   f_sbml_annotated="./annotations/demo_annotated.xml")

Annotate during model creation

In the second example, the model is annotated during model creation. Annotations are encoded as simple tuples consisting of a MIRIAM qualifier term and the identifiers.org resource part (collection/id).

The list of tuples can be provided when the object is created:

Species(sid='e__gal', compartment='ext', initialConcentration=3.0,
        substanceUnit=UNIT_KIND_MOLE, boundaryCondition=True,
        name='D-galactose', sboTerm=SBO_SIMPLE_CHEMICAL,
        annotations=[
            (BQB.IS, "bigg.metabolite/gal"),  # galactose
            (BQB.IS, "chebi/CHEBI:28061"),  # alpha-D-galactose
            (BQB.IS, "vmhmetabolite/gal"),
        ]
),

For the full example, see model_with_annotations.py.
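To make the tuple encoding concrete, the following sketch shows how the resource part of each tuple could be expanded into a full identifiers.org URI. It is an illustration under stated assumptions, not the sbmlutils code: the helper `resource_uri` is hypothetical, and plain strings stand in for the `BQB.IS` qualifier so that the example runs without sbmlutils installed.

```python
def resource_uri(resource):
    """Build an identifiers.org URI from a 'collection/id' resource part.

    Hypothetical helper for illustration only.
    """
    collection, _, identifier = resource.partition("/")
    if not collection or not identifier:
        raise ValueError(f"expected 'collection/id', got {resource!r}")
    return f"https://identifiers.org/{collection}/{identifier}"


# (qualifier, resource) tuples; the string "BQB_IS" stands in for BQB.IS
annotations = [
    ("BQB_IS", "bigg.metabolite/gal"),
    ("BQB_IS", "chebi/CHEBI:28061"),
    ("BQB_IS", "vmhmetabolite/gal"),
]

uris = [resource_uri(resource) for _, resource in annotations]
for uri in uris:
    print(uri)
```

The resulting URIs have the same form as the rdf:resource entries in the species output printed at the end of this section.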

import os
from sbmlutils.modelcreator.creator import Factory
factory = Factory(modules=['model_with_annotations'],
                  target_dir='./models')
[_, _, sbml_path] = factory.create()

# check the annotations on the species
import libsbml
doc = libsbml.readSBMLFromFile(sbml_path)  # type: libsbml.SBMLDocument
model = doc.getModel()  # type: libsbml.Model
s1 = model.getSpecies('e__gal')  # type: libsbml.Species
print(s1.toSBML())
WARNING:sbmlutils.annotation.annotator:https://en.wikipedia.org/wiki/Cytosol does not conform to http(s)://identifiers.org/collection/id

--------------------------------------------------------------------------------
/home/mkoenig/git/sbmlutils/docs_builder/notebooks/models/annotation_example_8.xml
valid                    : TRUE
check time (s)           : 0.011
--------------------------------------------------------------------------------

SBML report created: ./models/annotation_example_8.html
<species metaid="meta_e__gal" sboTerm="SBO:0000247" id="e__gal" name="D-galactose" compartment="ext" initialConcentration="3" substanceUnits="mole" hasOnlySubstanceUnits="false" boundaryCondition="true" constant="false">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#" xmlns:vCard4="http://www.w3.org/2006/vcard/ns#" xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
      <rdf:Description rdf:about="#meta_e__gal">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="https://identifiers.org/bigg.metabolite/gal"/>
            <rdf:li rdf:resource="https://identifiers.org/chebi/CHEBI:28061"/>
            <rdf:li rdf:resource="https://identifiers.org/vmhmetabolite/gal"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
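
The annotated resources can also be read back from such XML without libsbml. The sketch below uses only the Python standard library and a reduced copy of the species element above (most attributes and unused namespace declarations are omitted for brevity); it is one possible way to inspect the result, not part of sbmlutils.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Reduced copy of the species element printed above
species_xml = """<species metaid="meta_e__gal" id="e__gal">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
      <rdf:Description rdf:about="#meta_e__gal">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="https://identifiers.org/bigg.metabolite/gal"/>
            <rdf:li rdf:resource="https://identifiers.org/chebi/CHEBI:28061"/>
            <rdf:li rdf:resource="https://identifiers.org/vmhmetabolite/gal"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>"""

root = ET.fromstring(species_xml)

# Collect the rdf:resource attribute of every rdf:li element, in document order
resources = [
    li.get(f"{{{RDF_NS}}}resource")
    for li in root.iter(f"{{{RDF_NS}}}li")
]
print(resources)
```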