Semantic Web Integration Tool (SWIT)

Mappings tutorial

Mappings syntax

Mappings are defined in an XML document that conforms to the following XML Schema:

<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="id" type="xs:anyURI"/>
  <xs:element name="nodepath" type="xs:string"/>
  <xs:element name="type" type="mappingType"/>
  <xs:simpleType name="mappingType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Arch2Class"/>
      <xs:enumeration value="Arch2Prop"/>
      <xs:enumeration value="Arch2Rel"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="class">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="id"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="arch">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="nodepath" minOccurs="0"/>
        <xs:element ref="valuepath" maxOccurs="unbounded" minOccurs="0"/>
        <xs:element ref="infopath" maxOccurs="unbounded" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="valuepath" type="xs:string"/>
  <xs:element name="source">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="class"/>
        <xs:element ref="arch"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="predicate">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="id"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="target">
    <xs:complexType>
      <xs:choice>
        <xs:sequence>
          <xs:element ref="class"/>
          <xs:element ref="arch"/>
        </xs:sequence>
        <xs:sequence>
          <xs:element ref="arch"/>
        </xs:sequence>
      </xs:choice>
    </xs:complexType>
  </xs:element>
  <xs:element name="infopath" type="xs:string"/>
  <xs:element name="map">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="type"/>
        <xs:element ref="class" minOccurs="0"/>
        <xs:element ref="arch" minOccurs="0"/>
        <xs:element ref="source" minOccurs="0"/>
        <xs:element ref="predicate" minOccurs="0"/>
        <xs:element ref="target" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Alignment">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="map" maxOccurs="unbounded" minOccurs="1"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
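
As a quick illustration of the outer structure the schema describes, here is a minimal Python sketch using only the standard library. It is not part of SWIT and does not perform real XML Schema validation: it only checks that the root is Alignment, that it contains one or more map elements, and that each map starts with a type taken from the mappingType enumeration. The names MAPPING, RULE_TYPES and rule_types are hypothetical, chosen for this example.

```python
import xml.etree.ElementTree as ET

# Rule types enumerated by mappingType in the schema.
RULE_TYPES = {"Arch2Class", "Arch2Prop", "Arch2Rel"}

# Hypothetical mapping document containing a single entity rule.
MAPPING = """<Alignment>
  <map>
    <type>Arch2Class</type>
    <class>
      <id>http://miuras.inf.um.es/ontologies/OGO.owl#Gene</id>
    </class>
    <arch>
      <nodepath>/orthoXML/species/database/genes/gene/@id</nodepath>
    </arch>
  </map>
</Alignment>"""


def rule_types(xml_text):
    """Return the type of every map rule, checking the outer structure only."""
    root = ET.fromstring(xml_text)
    if root.tag != "Alignment":
        raise ValueError("root element must be Alignment")
    maps = list(root)
    if not maps or any(m.tag != "map" for m in maps):
        raise ValueError("Alignment must contain one or more map elements")
    types = []
    for m in maps:
        # The schema's sequence puts the required type element first.
        first = m[0] if len(m) else None
        if first is None or first.tag != "type":
            raise ValueError("each map must start with a type element")
        # Stripping whitespace is a leniency of this sketch, not of the schema.
        value = (first.text or "").strip()
        if value not in RULE_TYPES:
            raise ValueError(f"unknown rule type: {value!r}")
        types.append(value)
    return types


print(rule_types(MAPPING))
```

A full validation against the schema would need an XML Schema processor; this sketch is only meant to make the nesting of Alignment, map and type concrete.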

How to define mappings

SWIT uses three basic types of mapping rules: entity rules, attribute rules and relation rules, which transform entities, attributes and relations respectively.

To illustrate how mapping rules are defined, we use orthology data in the OrthoXML format together with the OGO ontology. You can find more examples on our Documentation page.

Entity rule

A basic definition of an entity rule looks like this:


<map>
	<type>Arch2Class</type>
	<class>
		<id>http://miuras.inf.um.es/ontologies/OGO.owl#Gene</id>
	</class>
	<arch>
		<nodepath>/orthoXML/species/database/genes/gene/@id</nodepath>
	</arch>
</map>
	

Every mapping rule starts with the map element. The type element specifies the kind of rule being defined: Arch2Class for entity rules, Arch2Prop for attribute rules and Arch2Rel for relation rules.

The entity rule is divided into two parts, class and arch. In class, the id element gives the URI of the OWL class for which we want to create new individuals. In arch we specify which data from the input source is going to be transformed into individuals of that OWL class. For XML data, we specify an XPath expression inside a nodepath element.

The previous example creates an individual of the class Gene for each gene element with an id attribute that is found under the path /orthoXML/species/database/genes/ in the input data, as shown in the following example:


...
<genes>
	<gene id="1" geneId="WBGene00000962" protId="CE23997"/>
	<gene id="5" geneId="WBGene00006801" protId="CE43332"/>
</genes>
...
	  
	  

@prefix : <http://miuras.inf.um.es/ontologies/swit/OGO.owl#> .
@prefix OGO: <http://miuras.inf.um.es/ontologies/OGO.owl#> .

Individual: gene_1
    Types: 
        OGO:Gene
		
Individual: gene_5
    Types: 
        OGO:Gene
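
The selection performed by the entity rule above can be sketched in Python with the standard library. This is an approximation under stated assumptions, not SWIT's implementation: ElementTree supports only a subset of XPath and cannot return attribute nodes, so the sketch selects the gene elements that carry the attribute and reads its value. The species and database wrappers around the genes fragment are a minimal completion added for this example, and the naming pattern gene_<id> is inferred from the output shown above. The names ROOT, NODEPATH and entity_individuals are hypothetical.

```python
import xml.etree.ElementTree as ET

# The genes fragment from the example, wrapped in the ancestors named by the
# nodepath (wrapper elements are an assumption made for this sketch).
ROOT = ET.fromstring("""<orthoXML>
  <species>
    <database>
      <genes>
        <gene id="1" geneId="WBGene00000962" protId="CE23997"/>
        <gene id="5" geneId="WBGene00006801" protId="CE43332"/>
      </genes>
    </database>
  </species>
</orthoXML>""")

NODEPATH = "/orthoXML/species/database/genes/gene/@id"


def entity_individuals(root, nodepath):
    """List one individual name per element matched by an attribute nodepath.

    Assumes the nodepath is an absolute path of plain element steps, with at
    least one step below the root, ending in an @attribute step.
    """
    *element_steps, last = nodepath.strip("/").split("/")
    if not last.startswith("@") or root.tag != element_steps[0]:
        return []
    attr = last[1:]
    # ElementTree paths are relative to the root element, so drop the first
    # step and ask for elements that have the attribute.
    path = "./" + "/".join(element_steps[1:]) + f"[@{attr}]"
    local_name = element_steps[-1]
    return [f"{local_name}_{el.get(attr)}" for el in root.findall(path)]


print(entity_individuals(ROOT, NODEPATH))
```

Each returned name corresponds to one individual of the class given in the rule's class part.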
	  
	  

Attribute rule

A basic definition of an attribute rule looks like this:


<map>
	<type>Arch2Prop</type>
	<source>
		<class>
			<id>http://miuras.inf.um.es/ontologies/OGO.owl#Gene</id>
		</class>
		<arch>
			<nodepath>/orthoXML/species/database/genes/gene/@id</nodepath>
		</arch>
	</source>
	<predicate>
		<id>
			http://miuras.inf.um.es/ontologies/OGO.owl#Identifier
		</id>
	</predicate>
	<target>
		<arch>
			<valuepath>../@geneId</valuepath>
		</arch>
	</target>
</map>
	

The attribute rule is divided into three parts: source, predicate and target. In source we specify a regular entity rule. In predicate we specify the URI of the OWL datatype property whose value we want to set on the new individuals. In target, the valuepath element specifies which data from the input source is going to be transformed into values of that property. For XML data, valuepath contains an XPath expression that is always relative to the path in the source section of the mapping. Note that, for gene elements that have an id attribute, the XPath expression /orthoXML/species/database/genes/gene/@id/../@geneId is equivalent to the XPath expression /orthoXML/species/database/genes/gene/@geneId.

The previous example sets the OWL datatype property Identifier for each Gene individual created. The value is taken from the geneId attribute, which is accessible through the path /orthoXML/species/database/genes/gene/@id/../@geneId.


...
<genes>
	<gene id="1" geneId="WBGene00000962" protId="CE23997"/>
	<gene id="5" geneId="WBGene00006801" protId="CE43332"/>
</genes>
...
	  
	  

@prefix : <http://miuras.inf.um.es/ontologies/swit/OGO.owl#> .
@prefix OGO: <http://miuras.inf.um.es/ontologies/OGO.owl#> .

Individual: gene_1
    Types:
        OGO:Gene
    Facts:
        OGO:Identifier  "WBGene00000962"^^xsd:string

Individual: gene_5
    Types:
        OGO:Gene
    Facts:
        OGO:Identifier  "WBGene00006801"^^xsd:string
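
To make the relative valuepath concrete, the following Python sketch shows two things. First, a purely lexical simplification of a relative path against the source nodepath, which reproduces the equivalence stated above. This shortcut is only safe for plain child and attribute steps without predicates, and it assumes the step being removed exists for every node (here, that every gene has an id attribute). Second, it reads the geneId value for each gene in the example fragment. The names resolve, GENES and identifier_facts are hypothetical, and the gene_<id> naming is inferred from the output shown above; none of this is SWIT's own code.

```python
import xml.etree.ElementTree as ET


def resolve(base, relative):
    """Lexically combine an absolute path with a relative one.

    Handles only '.', '..' and plain steps; no predicates or axes.
    """
    steps = base.strip("/").split("/")
    for step in relative.split("/"):
        if step == "..":
            steps.pop()          # go up one level
        elif step not in (".", ""):
            steps.append(step)   # go down one level
    return "/" + "/".join(steps)


# The genes fragment from the example above.
GENES = ET.fromstring("""<genes>
  <gene id="1" geneId="WBGene00000962" protId="CE23997"/>
  <gene id="5" geneId="WBGene00006801" protId="CE43332"/>
</genes>""")


def identifier_facts(genes):
    """Map each gene individual to the value selected by ../@geneId."""
    # From the id attribute, '..' is the gene element itself, so the value
    # is simply another attribute of the same element.
    return {
        f"gene_{gene.get('id')}": gene.get("geneId")
        for gene in genes.findall("./gene[@id]")
    }


print(resolve("/orthoXML/species/database/genes/gene/@id", "../@geneId"))
print(identifier_facts(GENES))
```

The same resolve helper applied to ../../../. yields /orthoXML/species/database, since the final . step leaves the path unchanged.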
	  
	  

Relation rule

A basic definition of a relation rule looks like this:


<map>
	<type>Arch2Rel</type>
	<source>
		<class>
			<id>http://miuras.inf.um.es/ontologies/OGO.owl#Gene</id>
		</class>
		<arch>
			<nodepath>/orthoXML/species/database/genes/gene/@id</nodepath>
		</arch>
	</source>
	<predicate>
		<id>
			http://miuras.inf.um.es/ontologies/OGO.owl#hasResource
		</id>
	</predicate>
	<target>
		<class>
			<id>http://miuras.inf.um.es/ontologies/OGO.owl#Resource</id>
		</class>
		<arch>
			<nodepath>../../../.</nodepath>
		</arch>
	</target>
</map>
	

The relation rule is very similar to the attribute rule. It is divided into three parts: source, predicate and target. Both source and target specify a regular entity rule defining the individuals that are to be linked, with the particularity that the XPath expression in target is relative to the XPath expression in source. Note that the XPath expression /orthoXML/species/database/genes/gene/@id/../../../. is equivalent to the XPath expression /orthoXML/species/database. In predicate we specify the URI of the OWL object property used to link the individuals.

The previous example links a Gene individual to a Resource individual through the hasResource property every time a gene element is found inside a database element.


...
<species name="Caenorhabditis elegans" NCBITaxId="6239">
	<database name="WormBase" 
	    version="Caenorhabditis-elegans_WormBase_WS199_protein-all.fa">
		<genes>
			<gene id="1" geneId="WBGene00000962" protId="CE23997"/>
			<gene id="5" geneId="WBGene00006801" protId="CE43332"/>
		</genes>
...
	  
	  

@prefix : <http://miuras.inf.um.es/ontologies/swit/OGO.owl#> .
@prefix OGO: <http://miuras.inf.um.es/ontologies/OGO.owl#> .

Individual: gene_1
    Types:
        OGO:Gene
    Facts:
        OGO:Identifier  "WBGene00000962"^^xsd:string,
        OGO:hasResource  resource_1

Individual: gene_5
    Types:
        OGO:Gene
    Facts:
        OGO:Identifier  "WBGene00006801"^^xsd:string,
        OGO:hasResource  resource_1

Individual: resource_1
    Types: 
        OGO:Resource
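
The navigation from each gene up to its database can be sketched in Python as follows. This is an illustration under stated assumptions rather than SWIT's implementation. ElementTree elements have no parent pointer, so the sketch builds a child-to-parent map first. The closing tags missing from the elided input above are completed minimally. Numbering Resource individuals by order of first appearance (resource_1, resource_2, ...) is an assumption made here to match the output shown; the tutorial does not state how SWIT names them. The names ROOT and gene_resource_links are hypothetical.

```python
import xml.etree.ElementTree as ET

# The input from the example, with closing tags completed for this sketch.
ROOT = ET.fromstring("""<orthoXML>
  <species name="Caenorhabditis elegans" NCBITaxId="6239">
    <database name="WormBase"
        version="Caenorhabditis-elegans_WormBase_WS199_protein-all.fa">
      <genes>
        <gene id="1" geneId="WBGene00000962" protId="CE23997"/>
        <gene id="5" geneId="WBGene00006801" protId="CE43332"/>
      </genes>
    </database>
  </species>
</orthoXML>""")


def gene_resource_links(root):
    """Return (gene, property, resource) triples for the relation rule."""
    # Child-to-parent lookup, since ElementTree cannot walk upwards.
    parent = {child: p for p in root.iter() for child in p}
    resource_names = {}  # database element -> assumed individual name
    links = []
    for gene in root.findall("./species/database/genes/gene[@id]"):
        # '../../../.' from the id attribute: gene -> genes -> database.
        database = parent[parent[gene]]
        name = resource_names.setdefault(
            database, f"resource_{len(resource_names) + 1}"
        )
        links.append((f"gene_{gene.get('id')}", "hasResource", name))
    return links


for triple in gene_resource_links(ROOT):
    print(triple)
```

Because both genes sit under the same database element, they are linked to the same Resource individual, as in the output above.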
	  
	  

In progress...