Mappings are defined in an XML document whose structure follows this XML Schema:
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="id" type="xs:anyURI"/>
  <xs:element name="nodepath" type="xs:string"/>
  <xs:element name="type" type="mappingType"/>
  <xs:simpleType name="mappingType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Arch2Class"/>
      <xs:enumeration value="Arch2Prop"/>
      <xs:enumeration value="Arch2Rel"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="class">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="id"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="arch">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="nodepath" minOccurs="0"/>
        <xs:element ref="valuepath" maxOccurs="unbounded" minOccurs="0"/>
        <xs:element ref="infopath" maxOccurs="unbounded" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="valuepath" type="xs:string"/>
  <xs:element name="source">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="class"/>
        <xs:element ref="arch"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="predicate">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="id"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="target">
    <xs:complexType>
      <xs:choice>
        <xs:sequence>
          <xs:element ref="class"/>
          <xs:element ref="arch"/>
        </xs:sequence>
        <xs:sequence>
          <xs:element ref="arch"/>
        </xs:sequence>
      </xs:choice>
    </xs:complexType>
  </xs:element>
  <xs:element name="infopath" type="xs:string"/>
  <xs:element name="map">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="type"/>
        <xs:element ref="class" minOccurs="0"/>
        <xs:element ref="arch" minOccurs="0"/>
        <xs:element ref="source" minOccurs="0"/>
        <xs:element ref="predicate" minOccurs="0"/>
        <xs:element ref="target" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Alignment">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="map" maxOccurs="unbounded" minOccurs="1"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
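As a rough illustration of what the schema requires, the following Python sketch checks the two constraints that are easiest to state: the root element is Alignment with at least one map child, and each map starts with a type element holding one of the three enumerated values. It is not a substitute for validating against the schema, and the helper name check_alignment is hypothetical, not part of SWIT.

```python
import xml.etree.ElementTree as ET

# Values allowed by the mappingType enumeration in the schema above.
MAPPING_TYPES = {"Arch2Class", "Arch2Prop", "Arch2Rel"}


def check_alignment(xml_text):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper: it covers only a small part of the schema.
    """
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "Alignment":
        problems.append("root element must be Alignment, found " + root.tag)
        return problems
    maps = root.findall("map")
    if not maps:
        problems.append("Alignment must contain at least one map")
    for number, rule in enumerate(maps, start=1):
        # The schema's sequence puts type first inside every map.
        first = rule[0] if len(rule) else None
        if first is None or first.tag != "type":
            problems.append("map %d must start with a type element" % number)
        elif (first.text or "").strip() not in MAPPING_TYPES:
            problems.append("map %d has unknown type %r" % (number, first.text))
    return problems


EXAMPLE = """
<Alignment>
  <map>
    <type>Arch2Class</type>
    <class><id>http://miuras.inf.um.es/ontologies/OGO.owl#Gene</id></class>
    <arch><nodepath>/orthoXML/species/database/genes/gene/@id</nodepath></arch>
  </map>
</Alignment>
"""

print(check_alignment(EXAMPLE))
```

For the example document the list of problems is empty. A full check of element order and cardinality would need a schema-aware validator.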
SWIT uses three types of basic mapping rules: entity rules, attribute rules and relation rules, which transform entities, attributes and relations respectively.
To illustrate how mapping rules are defined, we use orthology data in the OrthoXML format together with the OGO ontology. More examples are available on our Documentation page.
A basic definition of an entity rule looks like this:
<map>
  <type>Arch2Class</type>
  <class>
    <id>http://miuras.inf.um.es/ontologies/OGO.owl#Gene</id>
  </class>
  <arch>
    <nodepath>/orthoXML/species/database/genes/gene/@id</nodepath>
  </arch>
</map>
Every mapping rule starts with the map element. The type element specifies the kind of rule being defined: Arch2Class for entity rules, Arch2Prop for attribute rules and Arch2Rel for relation rules.
The entity rule is divided into two parts, class and arch. In class, the id element gives the URI of the OWL class for which new individuals will be created. In arch we specify which data from the input source is transformed into individuals of that class. For XML data, this is an XPath expression inside a nodepath element.
The previous example creates one individual of the class Gene for each gene element that has an id attribute and is found under the path /orthoXML/species/database/genes/ in the input data, as shown in the following example:
...
<genes>
  <gene id="1" geneId="WBGene00000962" protId="CE23997"/>
  <gene id="5" geneId="WBGene00006801" protId="CE43332"/>
</genes>
...

@prefix : <http://miuras.inf.um.es/ontologies/swit/OGO.owl#> .
@prefix OGO: <http://miuras.inf.um.es/ontologies/OGO.owl#> .
Individual: gene_1
    Types:
        OGO:Gene
Individual: gene_5
    Types:
        OGO:Gene
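The effect of the entity rule can be imitated with a few lines of Python. This is only a sketch of the behaviour described above, not SWIT's implementation: the input fragment is wrapped in the ancestor elements implied by the nodepath, the function name apply_entity_rule is hypothetical, and the naming scheme gene_ followed by the id value is an assumption chosen to match the output shown.

```python
import xml.etree.ElementTree as ET

# The gene fragment from the example, wrapped in the ancestors named by
# the nodepath (attributes of those ancestors are omitted here).
INPUT = """
<orthoXML>
  <species>
    <database>
      <genes>
        <gene id="1" geneId="WBGene00000962" protId="CE23997"/>
        <gene id="5" geneId="WBGene00006801" protId="CE43332"/>
      </genes>
    </database>
  </species>
</orthoXML>
"""

GENE_CLASS = "http://miuras.inf.um.es/ontologies/OGO.owl#Gene"


def apply_entity_rule(xml_text, class_uri):
    """Yield (individual_name, class_uri) for every gene element with an id."""
    root = ET.fromstring(xml_text)  # root is the orthoXML element
    # ElementTree paths are relative to the root and cannot select
    # attributes, so gene[@id] stands in for .../gene/@id.
    for gene in root.findall("./species/database/genes/gene[@id]"):
        # Assumed naming scheme, matching gene_1 and gene_5 above.
        yield "gene_" + gene.get("id"), class_uri


for name, class_uri in apply_entity_rule(INPUT, GENE_CLASS):
    print(name, "is a", class_uri)
```

Running it prints one line for gene_1 and one for gene_5, both typed with the Gene class URI.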
A basic definition of an attribute rule looks like this:
<map>
  <type>Arch2Prop</type>
  <source>
    <class>
      <id>http://miuras.inf.um.es/ontologies/OGO.owl#Gene</id>
    </class>
    <arch>
      <nodepath>/orthoXML/species/database/genes/gene/@id</nodepath>
    </arch>
  </source>
  <predicate>
    <id>http://miuras.inf.um.es/ontologies/OGO.owl#Identifier</id>
  </predicate>
  <target>
    <arch>
      <valuepath>../@geneId</valuepath>
    </arch>
  </target>
</map>
The attribute rule is divided into three parts: source, predicate and target. In source we specify a regular entity rule. In predicate we specify the URI of the OWL datatype property whose value we want to set on the new individuals. In target, the valuepath element specifies which data from the input source is transformed into values of that property. For XML data, valuepath contains an XPath expression that is always relative to the path in the source section of the mapping. Note that the XPath expression /orthoXML/species/database/genes/gene/@id/../@geneId is equivalent to /orthoXML/species/database/genes/gene/@geneId.
The previous example assigns a value to the OWL datatype property Identifier for each Gene individual created. The value is taken from the geneId attribute, which is accessible through the path /orthoXML/species/database/genes/gene/@id/../@geneId.
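The equivalence between the joined path and its simplified form can be checked mechanically. The sketch below treats the XPath expressions as plain slash-separated paths and collapses the . and .. steps with Python's posixpath module. This works only for simple location paths like the ones in this example, and the function name resolve is hypothetical.

```python
import posixpath


def resolve(source_path, relative_path):
    """Join a relative path onto the source path and collapse '.' and '..'.

    Only valid for plain element and attribute steps; predicates or
    axes containing '/' would need a real XPath parser.
    """
    return posixpath.normpath(posixpath.join(source_path, relative_path))


SOURCE = "/orthoXML/species/database/genes/gene/@id"

# Prints /orthoXML/species/database/genes/gene/@geneId
print(resolve(SOURCE, "../@geneId"))
```

The single .. step moves from the id attribute back to its gene element, from which the geneId attribute is then selected.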
...
<genes>
  <gene id="1" geneId="WBGene00000962" protId="CE23997"/>
  <gene id="5" geneId="WBGene00006801" protId="CE43332"/>
</genes>
...

@prefix : <http://miuras.inf.um.es/ontologies/swit/OGO.owl#> .
@prefix OGO: <http://miuras.inf.um.es/ontologies/OGO.owl#> .
Individual: gene_1
    Types:
        OGO:Gene
    Facts:
        OGO:Identifier "WBGene00000962"^^xsd:string
Individual: gene_5
    Types:
        OGO:Gene
    Facts:
        OGO:Identifier "WBGene00006801"^^xsd:string
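A small Python sketch can reproduce the Identifier values shown above. It is an illustration under stated assumptions rather than SWIT's actual code: the fragment is wrapped in the ancestor elements implied by the nodepath, apply_attribute_rule is a hypothetical name, and individuals are again named gene_ followed by the id value.

```python
import xml.etree.ElementTree as ET

# The gene fragment from the example, wrapped in the ancestors named by
# the nodepath (attributes of those ancestors are omitted here).
INPUT = """
<orthoXML>
  <species>
    <database>
      <genes>
        <gene id="1" geneId="WBGene00000962" protId="CE23997"/>
        <gene id="5" geneId="WBGene00006801" protId="CE43332"/>
      </genes>
    </database>
  </species>
</orthoXML>
"""


def apply_attribute_rule(xml_text):
    """Map each Gene individual name to its Identifier value."""
    root = ET.fromstring(xml_text)
    identifiers = {}
    # Source part: the same selection as the entity rule (genes with an id).
    for gene in root.findall("./species/database/genes/gene[@id]"):
        # Target part: '../@geneId' relative to the id attribute is the
        # geneId attribute of the same gene element.
        value = gene.get("geneId")
        if value is not None:
            identifiers["gene_" + gene.get("id")] = value
    return identifiers


print(apply_attribute_rule(INPUT))
```

The result maps gene_1 to WBGene00000962 and gene_5 to WBGene00006801, matching the Facts listed in the output.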
A basic definition of a relation rule looks like this:
<map>
  <type>Arch2Rel</type>
  <source>
    <class>
      <id>http://miuras.inf.um.es/ontologies/OGO.owl#Gene</id>
    </class>
    <arch>
      <nodepath>/orthoXML/species/database/genes/gene/@id</nodepath>
    </arch>
  </source>
  <predicate>
    <id>http://miuras.inf.um.es/ontologies/OGO.owl#hasResource</id>
  </predicate>
  <target>
    <class>
      <id>http://miuras.inf.um.es/ontologies/OGO.owl#Resource</id>
    </class>
    <arch>
      <nodepath>../../../.</nodepath>
    </arch>
  </target>
</map>
The relation rule is very similar to the attribute rule. It is divided into three parts: source, predicate and target. Both source and target specify a regular entity rule that defines the individuals to be linked, with the particularity that the XPath expression in the target is relative to the XPath expression in the source. Note that the XPath expression /orthoXML/species/database/genes/gene/@id/../../../. is equivalent to /orthoXML/species/database. In predicate we specify the URI of the OWL object property used to link the individuals.
The previous example links a Gene individual to a Resource individual through the hasResource property every time a gene element is found inside a database element.
...
<species name="Caenorhabditis elegans" NCBITaxId="6239">
  <database name="WormBase"
            version="Caenorhabditis-elegans_WormBase_WS199_protein-all.fa">
    <genes>
      <gene id="1" geneId="WBGene00000962" protId="CE23997"/>
      <gene id="5" geneId="WBGene00006801" protId="CE43332"/>
    </genes>
...

@prefix : <http://miuras.inf.um.es/ontologies/swit/OGO.owl#> .
@prefix OGO: <http://miuras.inf.um.es/ontologies/OGO.owl#> .
Individual: gene_1
    Types:
        OGO:Gene
    Facts:
        OGO:Identifier "WBGene00000962"^^xsd:string,
        OGO:hasResource resource_1
Individual: gene_5
    Types:
        OGO:Gene
    Facts:
        OGO:Identifier "WBGene00006801"^^xsd:string,
        OGO:hasResource resource_1
Individual: resource_1
    Types:
        OGO:Resource
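The navigation performed by the relation rule can be sketched in Python as well. Starting from the id attribute, the three .. steps lead to the gene element, the genes element and finally the database element. The code below is an illustration only: the fragment is completed with the closing tags and root element implied by the paths, apply_relation_rule is a hypothetical name, and numbering the Resource individuals in order of first appearance is an assumption made to reproduce resource_1.

```python
import xml.etree.ElementTree as ET

# The fragment from the example, closed and wrapped in the root element
# named by the nodepath.
INPUT = """
<orthoXML>
  <species name="Caenorhabditis elegans" NCBITaxId="6239">
    <database name="WormBase"
              version="Caenorhabditis-elegans_WormBase_WS199_protein-all.fa">
      <genes>
        <gene id="1" geneId="WBGene00000962" protId="CE23997"/>
        <gene id="5" geneId="WBGene00006801" protId="CE43332"/>
      </genes>
    </database>
  </species>
</orthoXML>
"""


def apply_relation_rule(xml_text):
    """Return (gene individual, property, resource individual) triples."""
    root = ET.fromstring(xml_text)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    resource_names = {}  # database element -> assumed individual name
    links = []
    for gene in root.findall("./species/database/genes/gene[@id]"):
        # From @id: '..' is gene, '../..' is genes, '../../..' is database.
        database = parent_of[parent_of[gene]]
        name = resource_names.setdefault(
            database, "resource_%d" % (len(resource_names) + 1)
        )
        links.append(("gene_" + gene.get("id"), "hasResource", name))
    return links


print(apply_relation_rule(INPUT))
```

Both genes share the same database ancestor, so both are linked to the single individual resource_1, as in the output above.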
In progress...