Journal of theoretical and applied electronic commerce research

Online version ISSN 0718-1876

J. theor. appl. electron. commer. res. vol. 3 no. 3, Talca, Dec. 2008


Journal of Theoretical and Applied Electronic Commerce Research
ISSN 0718-1876 Electronic Version VOL 3 / ISSUE 3 / DECEMBER 2008 / 1-16.


A Core Component-based Modelling Approach for Achieving e-Business Semantics Interoperability


Till Janner1, Fenareti Lampathaki2, Volker Hoyer3, Spiros Mouzakitis4, Yannis Charalabidis5 and Christoph Schroth6

1 SAP Research, St. Gallen, Switzerland
2 Decision Support Systems Laboratory, National Technical University of Athens, Greece
3 SAP Research, St. Gallen, Switzerland
4 Decision Support Systems Laboratory, National Technical University of Athens, Greece
5 Decision Support Systems Laboratory, National Technical University of Athens, Greece
6 SAP Research, St. Gallen, Switzerland


The adoption of advanced integration technologies that enable private and public organizations to seamlessly execute their business transactions electronically is still relatively low, especially among governmental bodies and Small and Medium-sized Enterprises (SMEs). Current solutions often lack a common understanding of the underlying business document semantics, and most existing approaches cannot cope with the huge variety of business document formats that stems from the highly diverse requirements of the different stakeholders. This paper presents a comprehensive core component-based business document modelling approach, developed and applied in the course of the EU-funded research project GENESIS, that builds upon existing standards such as the OASIS Universal Business Language (UBL) and the UN/CEFACT Core Component Technical Specification (CCTS). These standards are extended by introducing the concept of generic business document templates out of which specific documents can be derived according to the actual user's needs. The key principle for achieving this flexibility is the integration of business context information, which allows for modelling business documents that are standard-based but at the same time customized. The resulting modelling framework ranges from (tool-supported) graphical data models to the technical representation of the business documents as XML schema documents designed in compliance with the UN/CEFACT XML schema Naming and Design Rules (NDR).

Key words: Interoperability, CCTS, UBL, UN/CEFACT, e-Business, e-Government, Data Modelling Methodology.


1 Introduction

Traditionally, enterprises and public organizations have designed and deployed applications and databases to cover their specific, highly diverse requirements without taking into account integration within or across their boundaries. Achieving Business-to-Business (B2B) integration and executing electronic transactions seamlessly does not only mean realizing technical connectivity between systems - this is readily addressed through the use of existing technical standards and supporting middleware such as Web services [17]. Yankee Group advises IT departments to focus on interoperability technologies and skills as a core competency imperative, envisaging savings of more than one-third of the cost if they succeed in achieving business and technical interoperability [31]; nevertheless, the interoperability challenge has not yet been met. Especially regarding its semantic dimension, this can be attributed to the lack of common understanding at the collaborative business process and data level, which is caused by the use of different representations, different purposes, different contexts, and different syntax-dependent approaches [30], [20], [19], [3] and [15].

According to the European Interoperability Framework (EIF) [13], semantic interoperability "enables systems to combine received information with other information resources and to process it in a meaningful manner". The need for semantic interoperability is thus driven by the decentralized design of information resources, by differing perspectives inherent to various domains, and by the widespread need to correlate data from increasingly diverse domains concerning especially governmental organizations and small and medium-sized enterprises (SMEs).

However, any measures undertaken in the context of semantic interoperability have to consider key aspects of technical interoperability and have to build upon provided standards, guidelines and solutions. At the same time, the meaning or semantics of data is inherently linked to the purpose (business goals) or context (business processes) in which it is actually used. Consequently, measures in the context of semantic interoperability are closely linked to and even require and imply measures in the context of organizational interoperability [23].

In this context, the authors present a core component-based business document modelling approach that builds upon existing standards such as the OASIS Universal Business Language (UBL) [22], the UN/CEFACT Core Component Technical Specification (CCTS) [33] and the W3C XML schema ([41], [42], [43]) and that creates communication bridges between business data and process models. These standards are extended by introducing the concept of generic business document templates out of which country-specific documents can be derived according to the actual user's needs. The key principle for achieving this flexibility is the integration of business context information, which allows for modelling business documents that are standard-based but at the same time customized. The resulting modelling framework ranges from (tool-supported) graphical data models to the technical representation of the business documents as XML schema documents designed in compliance with the UN/CEFACT XML schema Naming and Design Rules (NDR) [36].

The work presented in this paper has been conducted in the context of the EU-funded research project GENESIS: "Enterprise Application Interoperability via Internet-Integration for SMEs, Governmental Organisations and Intermediaries in the New European Union" [10]. The project goal is the research, development and pilot application of the methodologies, infrastructure and software components needed to allow the typical, usually small or medium-sized, European enterprise to conduct its business transactions over the Internet, by interconnecting its main transactional software applications and systems with those of collaborating enterprises, governmental bodies, and banking and insurance institutions, with respect to the current EC legal and regulatory status and the existing one in the new EU, candidate and associate countries.

The remainder of the paper is structured as follows: In the second chapter, issues and challenges in data modelling are discussed, providing the state-of-the-art background and related work upon which this work is based. The proposed GENESIS Data Modelling approach towards semantic interoperability is presented in chapter 3. Chapter 4 proceeds with the presentation of the GENESIS XML schema library based on the GENESIS data modelling approach. A discussion of our results and an outlook on further research activities required in the field of e-Business semantic interoperability conclude the paper.

2 State of the Art and Related Work

2.1 e-Business Standardization

Today, the prevalent "business standards dilemma" [30] is compounding the semantic interoperability problem: a diversity of standards address particular data requirements, but they are designed on such different bases that choosing a specific standard to adopt becomes a new challenge in itself. In the context of this work, a set of industry-independent XML data standards has been examined, namely cXML, eBIS-XML, OAGIS, UBL, XBRL and xCBL, for the sake of the B2B aspect of the transactions conducted by SMEs.

Commerce eXtensible Markup Language (cXML) (version 1.2.017, April 2007) [5] is the outcome of the collaboration of 52 major companies with the aim of providing formal XML DTD (Document Type Definition) schemas for standard business transactions. cXML is a streamlined protocol intended for consistent communication of business documents between procurement applications, e-Commerce hubs and suppliers. The protocol is published for free on the Internet along with its DTDs, since each cXML document is constructed based on XML DTDs.

eBIS-XML (Version 3.09 - March 2004) [6], developed by the Business Application Software Developers Association (BASDA), provides an XML-based standard with which business and financial performance information will be defined and exchanged. eBIS-XML first demonstrated and deployed an interoperable many-to-many eCommerce interface between standard software packages. The BASDA eBIS-XML schema has been developed as an international open standard supporting European, US and Asian requirements. It uses the W3C XML standard as the basis for its message structure, the W3C XML schema, rather than DTD, as the means of defining the message specification and validation, and e-mail as the common delivery mechanism.

Open Applications Group Integration Specification (OAGIS) (Version 9.1 - May 2007) [21] is an effort, dating back to the fall of 1995, to provide a canonical business language for information integration. It uses XML as the common alphabet for defining business messages and for identifying business processes (scenarios) that allow businesses and business applications to communicate. OAGIS is a complete set of XML business messages, and also accommodates the additional requirements of specific industries by partnering with various vertical industry groups. OAGIS provides the definition of business messages in the form of Business Object Documents (BODs) and a set of example business scenarios that illustrate the usage of the BODs. OAGIS currently includes 434 Business Object Documents fulfilling the need for the definition of business objects in eCommerce, Finance, Manufacturing, Logistics, Customer Relationship Management and Enterprise Resource Planning systems. OAGIS in its latest version has adopted the UN/CEFACT Core Component Technical Specification (CCTS) version 2.01 and has included the approved harmonized Core Components from UN/CEFACT TBG 17 (Core Component Harmonization work group). OAGIS has also made enhancements to provide better Web services support and has issued guidelines for the Web Service Description Language (WSDL) that can be used to develop Web services.

eXtensible Business Reporting Language (XBRL) (Version 2.1 - December 2003) [38] is a language for the electronic communication of business and financial data and is an open standard, free of licence fees. It provides guidelines and methodologies for the preparation, analysis and communication of business reporting information. Business reporting includes, but is not limited to, financial statements, financial information, non-financial information and regulatory filings such as annual and quarterly financial statements. It is being developed by an international non-profit consortium of major companies, organisations and government agencies in order to provide an XML-based standard with which to define and exchange business and financial performance information.

XML Common Business Library (xCBL) (version 4.0, March 2003) [39] by Commerce One provides a collection of XML specifications (both DTD and XML schema) for use in e-business transactions. xCBL provides a smooth migration path from EDI-based commerce because of its origins in EDI semantics. It is able to support all essential documents and transactions for global e-Commerce, including multi-company supply chain automation, direct and indirect procurement, planning, auctions, and invoicing and payment in an international multi-currency environment. It also represents an initial alignment with the OASIS Universal Business Language (UBL) initiative, since some of the UBL recommendations have been adopted in the design of xCBL 4.0.

The Universal Business Language (UBL) [22], supported by the Organisation for the Advancement of Structured Information Standards (OASIS), is a royalty-free library of standard electronic XML business documents, designed to provide a universally understood and recognized commercial syntax for legally binding business documents.

UBL (Version 2.0 - December 2006) provides the following:

• A library of XML schemas for reusable data components, such as Address, Item and Payment, which are the common data elements of everyday business documents.

• A set of XML schemas for 29 common business documents such as Order, Despatch Advice and Invoice that are constructed from the UBL library components and can be used in generic procurement and transportation contexts.

• A set of processes and business rules associated with the business documents identified that define a context for their use.

UBL operates within a standard business framework such as ISO 15000-5 (UN/CEFACT CCTS) in order to provide a complete, standards-based infrastructure that can extend the benefits of existing EDI systems to businesses of all sizes. As the first standard implementation of UN/CEFACT CCTS, the UBL Library is based on a conceptual model of information components known as Business Information Entities. These components are assembled into specific document models such as Order and Invoice, which are then transformed in accordance with UBL Naming and Design Rules into W3C XML schema syntax.

To this end, a comparative analysis among the aforementioned Business-to-Business (B2B) data standards [9], [10] has been conducted on the basis of requirements related to the project's scope, such as countries parameterisation / variance, multilingual aspects, maturity, support of GENESIS Core Technologies (i.e. XML), use of Data Components, standard support (international and community), ease of use and implementation, document scope / type, support for modelling of rules, workflow capabilities of the document, modelling of messages, relation to enterprise model, management of models, management of "standard" XML metadata / schemas, configuration management, dynamic configuration capabilities, capability of incorporating business rules into the schemas, modularity, expandability, composability and licensing.

The evaluation of the standards designated UBL as the standard upon which the GENESIS Data Modelling shall be based. UBL has been chosen in favour of the UN/CEFACT CCL and NDR mainly due to the rich UBL Core Component Library that already exists. UBL can be characterized as an emerging standard that has the credentials to dominate the area of Data Modelling, but to date its scope is rather limited, as it covers only the basic business documents involved in a few common B2B processes. For instance, there is a whole range of business documents before and after a payment scenario that UBL does not cover. Moreover, UBL currently does not support transactions between businesses and the government or banking institutions. Therefore, and taking into account that UBL's customization is relatively rigid, the need emerges for an innovative modelling approach that on the one hand takes advantage of UBL's offerings but on the other extends its scope and usage.

To our knowledge, UN/CEFACT has also released an International e-Invoice (Cross-Industry electronic Invoice - CII) [34] designed for use by the steel, automotive and electronics industries, as well as in the retail sector and by customs and other government authorities. However, since its first schemas were published in draft version in April 2007, it was still in its inception phase and was thus not taken into account in the GENESIS evaluation phase.

2.2 UN/CEFACT e-Business Stack and Core Components Technical Specification (ISO 15000-5)

The UN/CEFACT (United Nations Centre for Trade Facilitation and Electronic Business) Core Component Technical Specification (CCTS, also known as ISO 15000-5), established in November 2003 (Version 2.01) [33], presents a methodology for developing a common set of semantic building blocks that represent the general types of business data in use today. It can be understood and interpreted by humans and machines in the same way, while it provides means for the creation of new business vocabularies and the restructuring of existing ones. In particular, it defines meta-models and rules necessary for describing the structure and contents of conceptual and physical/logical data models, process models, and information exchange models.

The UN/CEFACT provides several specifications that can be assembled into a modular and comprehensive e-Business stack which is depicted in Figure 1. Recent research publications thoroughly describe this framework and its architectural ideas on how to apply the stack in e-Business environments to support semantic and application interoperability ([27], [14]).

In the course of this work, we focus on the major building block of the e-Business stack, the Core Component Technical Specification, which is described briefly in the following. The purpose of using Core Components as part of the ebXML framework is to ensure that two trading partners using different syntaxes (e.g. XML and United Nations/EDI for Administration, Commerce, and Transport (UN/EDIFACT) [37]) are using business semantics in the same way, on condition that both syntaxes have been based on the same Core Components. This enables clean mapping between disparate message definitions across syntaxes, industry and regional boundaries. CCTS extends the Core Component and the Business Information Entity layers by defining syntax-neutral Core Data Types as the smallest and most generic pieces of information in a business data model. Also defined are eight unique and orthogonal context categories - business process, product classification, industry classification, geopolitical, official constraints, business process role, supporting role, system capabilities - which describe the circumstances in which a business collaboration or data use takes place. In the area of CCTS, we refer to the work of Stuhec [30], who describes the Context-Driver-Principle for the efficient use and customization of CCTS-based business information. The UN/CEFACT Naming and Design Rules (NDR) [36] that accompany the CCTS specification define a set of guidelines for transforming CCTS-based artefacts into XML schema and XML-based instances. Such Naming Conventions are necessary to gain consistency in naming and defining Core Components, Data Types and Business Information Entities.
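As an illustration of how the eight context categories might be carried alongside a data component, the following minimal Python sketch models a context as one optional value per category. The class name `BusinessContext`, the method `applies_to` and the example values are assumptions made for this sketch; they are not defined by CCTS or by the GENESIS project.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class BusinessContext:
    """One optional value per CCTS context category; None means 'not restricted'."""
    business_process: Optional[str] = None
    product_classification: Optional[str] = None
    industry_classification: Optional[str] = None
    geopolitical: Optional[str] = None
    official_constraints: Optional[str] = None
    business_process_role: Optional[str] = None
    supporting_role: Optional[str] = None
    system_capabilities: Optional[str] = None

    def applies_to(self, usage: "BusinessContext") -> bool:
        """True if every category restricted here has the same value in `usage`."""
        for f in fields(self):
            required = getattr(self, f.name)
            if required is not None and required != getattr(usage, f.name):
                return False
        return True


# Hypothetical example: a component that is only valid for Greek invoicing.
greek_invoicing = BusinessContext(business_process="Invoicing", geopolitical="GR")
usage = BusinessContext(
    business_process="Invoicing",
    geopolitical="GR",
    industry_classification="Retail",
)
print(greek_invoicing.applies_to(usage))
```

Treating an unset category as "no restriction" is a design choice of this sketch: it lets a component declare only the categories that actually constrain its use.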

The UN/CEFACT Core Component Library (UN/CCL) [32] represents the repository for generic business data components, the so called Core Components. Based on the experiences gained in previous data standardization efforts, the CCL does not provide pre-determined, static or industry-specific data definitions, but comprises a huge set of context-agnostic, generally valid data templates (e.g. postal address, personal information) that are syntax-independent and represent the general business data entities which are commonly used in today's business processes. Major benefits of leveraging such a Core Component Repository include an increased reuse of data elements during modelling and improved enterprise interoperability due to a common basis for business information description. UN/CEFACT envisions this library to grow and also change over time as users can either modify existing components or design and submit new Core Components in case the existing ones are not sufficient to fulfil the actual business requirements.

2.3 Requirements of Business Document Modelling

The definition of the characteristics of the business documents that are to be exchanged among businesses and public organizations during the execution of business processes is a key issue for achieving semantic interoperability. In the context of the GENESIS project, business document modelling requirements were initially gathered by conducting an online survey among the project partners. The requirements were refined through discussions and interviews with the user partners and their technical staff. Overall, these requirements addressed common data modelling requirements found in the literature, such as low complexity of the document elements, ease of use, flexibility and optimum document size in order to achieve high performance during the exchange of these business documents.

Further requirements were addressed during the definition and creation of the business documents. More specifically, user modelling experiences taken into account in the GENESIS project have emphasized three major requirements:

First, reusability of knowledge compensates for the missing modelling know-how of employees at SMEs, who are generally characterized as all-rounders or multi-task workers. In contrast to large enterprises, which employ specialists for modelling business processes, SMEs do not have these resources. Therefore, by means of templates and best practices, the user is relieved from routine modelling activities. Additionally, a new concept that transfers the software paradigm of Design Patterns [8] to collaborative processes assists in modelling process transactions among business partners. Besides this so-called business transaction design pattern [35], a central repository for process and data building blocks enables high reusability when adapting, adding or creating collaborative business processes.

Search criteria combined with a specified context, for instance the country, business partner role or industry, lead to the second requirement of the user perspective, integrated semantics. A common understanding of the description of processes and data across enterprise and even department borders is a prerequisite to solve this business dilemma [30]. A concept handling the accustomed terminology of the different user groups (e.g. a business expert speaks another language than an IT specialist) is required to reduce misunderstandings and potential sources of error. The growing international orientation of SMEs in particular aggravates this phenomenon due to regional, cultural or language differences, as modelling experiences in GENESIS have shown. As a result of this additional information overload, the modelling approach has to provide only the information relevant to the users' specific context. Unnecessary information has to be hidden.

Combined with the above requirement for integrated semantics, reusability leads to the demand for reduced complexity, the third and last requirement of the user perspective [7]. Different levels of abstraction and granularity as well as different views on business processes provide multiple modelling environments corresponding to the respective user groups. In this sense, business people view a company as a set of processes that generate and consume different kinds of flows and are carried out by resources. From another perspective, IT people interpret a company as a set of information systems or services [2].

2.4 GENESIS Project Overview

The EU-funded GENESIS project [10] aims at increasing the adoption of e-Business specifically among SMEs. The major goal of the project is the design and implementation of an interoperable platform that enables seamless and cross-organizational collaboration among business partners. Based on the Balanced Scorecard approach, earlier research results analyze the economic benefits for SMEs of participating in such platforms: SMEs may benefit from reduced operating costs and improved gains (financial perspective), increased customer satisfaction and retention (customer perspective), faster and more efficient internal processes (internal working process perspective), improved supply chain integration (supply chain perspective), and technological advancements (system benefits) [12].

In the course of the project, we examine and model the processes and information entities exchanged by the SMEs with other enterprises (B2B), governmental institutions (Business-to-Government: B2G) or Intermediaries (Business-to-Intermediary: B2I, e.g. banks), following an integrated modelling approach for collaborative business processes and business documents (detailed presentation in Chapter 3).

From an architectural perspective, the GENESIS system can be described as follows: a central server and decentrally installed adapters and/or Web clients provide the necessary functionality to connect the users to the system, to manage the communication between them and to provide an environment for business process and document modelling and execution. Apart from that, the server follows a store-and-forward approach that takes into account that SME users are not necessarily continuously connected to the Internet. The server is able to temporarily store business messages and forward them to other users as soon as they are connected. The server also contains a registry and repository part (containing the GENESIS Common Library (CL)), which is devoted to storing templates for the supported processes and business documents (both graphical and XML schema representations) of all registered SMEs and to supporting the negotiation of business conditions among SMEs. When connecting to the system, clients are supposed to send their respective user context (formalized according to the existing guidelines of the CCTS) and the names of supported processes to the server. The server then passes back to the client the process definitions and XML schemas (representing the business documents) that are appropriate for the individual context. Further, clients can connect to the system in two different ways. A fat client, called adapter, can be installed on the side of the users to connect legacy IT systems to the GENESIS system. This adapter realizes a machine-to-machine interface for efficient automated transaction processing. In case users do not wish to leverage existing IT systems, a Web client is provided as a machine-to-human interface to access the system.
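The store-and-forward behaviour described above can be sketched as follows. This is a minimal in-memory illustration, not the GENESIS server implementation; the class `StoreAndForwardServer`, its methods and the user and message names are hypothetical.

```python
from collections import defaultdict, deque


class StoreAndForwardServer:
    """Holds messages for offline users and forwards them once they connect."""

    def __init__(self):
        self._online = set()
        self._pending = defaultdict(deque)  # recipient -> queued messages
        self.delivered = []                 # (recipient, message) in delivery order

    def send(self, recipient, message):
        """Deliver immediately if the recipient is online, otherwise store."""
        if recipient in self._online:
            self.delivered.append((recipient, message))
        else:
            self._pending[recipient].append(message)

    def connect(self, user):
        """Mark the user as online and forward everything stored for them."""
        self._online.add(user)
        queue = self._pending.pop(user, deque())
        while queue:
            self.delivered.append((user, queue.popleft()))

    def disconnect(self, user):
        self._online.discard(user)


server = StoreAndForwardServer()
server.send("sme_b", "order-1")      # sme_b is offline: message is stored
server.send("sme_b", "order-2")
server.connect("sme_b")              # stored messages are forwarded in order
server.send("sme_b", "invoice-1")    # sme_b is online: delivered directly
print(server.delivered)
```

A real deployment would additionally need persistent storage and delivery acknowledgements, which are left out here.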

Benefits of such a "hybrid" approach of central and decentral components include the ability to shield technical peculiarities of local IT installations via Web service-based adapters on the client side and to ensure seamless collaboration of the diverse stakeholders. At the same time, it provides the reliability of a central server that enables users to register themselves, to configure supported processes and data, and to negotiate the exact conditions of an electronic business relationship. We refer to recent research results of the authors, which provide more details with respect to architectural considerations (e.g. [24], [4], [28], [11], and [25]).

3 GENESIS Data Modelling Framework

3.1 Overview of the Data Modelling Approach

The main purpose of our data modelling approach is to achieve semantic interoperability among businesses and public organizations. The essential concept of the data modelling approach developed in the course of the GENESIS project is the utilization of reusable data components in order to allow for reusability of data elements, to improve interoperability between the various stakeholders, and to avoid transaction errors due to ambiguous notation. This concept is closely aligned to and based on the CCTS concepts described in section 2.2. The overall approach comprises four phases and is depicted in Figure 2.

The four-step approach has been applied for the modelling of business documents exchanged within different kinds of collaborative business processes (B2B, B2G and B2I). In a first step, the approach starts with capturing the 'as-is' situation of the document exchange between the enterprises, governments and/or financial institutions. With the help of online surveys and end-user workshops, business requirements are gathered and so-called unstructured business documents are modelled.

Based on this collection of raw information, Specific Structured Business Documents (SBDs) are created in a graphical modelling environment, which allows for an integrated modelling of processes and data [26]. These SBDs still represent business documents that are modelled according to the specific and individual needs of the users, but the document structure and the reusable components out of which the business documents are assembled are modelled following the CCTS specification and using the context principle to include contextual information (e.g. geopolitical, industry, or system capabilities). Further, the structures of the UBL business document library are used as templates during the modelling phase to ensure high interoperability and reusability of data components. Figure 3 depicts the integrated modelling methodology, which was implemented on the basis of ADONIS®. It can be seen as an integrated modelling environment [16] that has been customized according to the GENESIS modelling method. The figure shows an excerpt of a collaborative business process scenario together with an example of the directly integrated business document modelling environment.

The third step of the modelling approach comprises the generalization of SBDs into Generic Structured Business Documents (GBDs). The GBDs serve as generic template documents that can be regarded as supersets of the numerous specific documents. They still include the context information from the specific documents, which enables users of generic business documents to clearly identify the relevant business information (data components) that applies to a specific usage scenario.
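The relationship between SBDs and a GBD can be illustrated with a small Python sketch: the generic template is built as the union of the components of all specific documents, each component keeps the contexts in which it occurred, and a specific view is recovered by filtering on a context. The function names, the component names and the use of a plain country code as context are assumptions for illustration only.

```python
def generalize(sbds):
    """Merge specific documents into one generic template.

    `sbds` maps a context label to the component names of that SBD.
    Returns a dict: component name -> set of contexts in which it is used.
    """
    gbd = {}
    for context, components in sbds.items():
        for name in components:
            gbd.setdefault(name, set()).add(context)
    return gbd


def restrict(gbd, context):
    """Derive the components that apply to one context from the template."""
    return [name for name, contexts in gbd.items() if context in contexts]


# Hypothetical invoice variants for two countries.
sbds = {
    "GR": ["InvoiceNumber", "IssueDate", "TaxOfficeCode"],
    "DE": ["InvoiceNumber", "IssueDate", "PaymentTerms"],
}
gbd = generalize(sbds)
print(sorted(gbd))          # superset of all components
print(restrict(gbd, "DE"))  # only the components relevant for "DE"
```

Because the context labels travel with each component, restricting the template to one context returns exactly the components of the original specific document.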

As a last step, a methodology has been defined in order to serialize the generic document models into XML schemas. On the XML syntax level, UN/CEFACT specifies a set of so-called Naming and Design Rules (UN/CEFACT NDR), which we make use of to create the generic XML schema documents. In line with the above-mentioned context principle, we integrate our context categories into the XML schema documents in order to make the context information available on the level of the technical representation of the business documents as well. The standard-compliant extension and restriction mechanism and the context annotation of the XML schema documents enable mapping engines to leverage the contextual information to allow for a semi-automatic mapping of business documents between business partners from different contexts.
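One way such a context annotation could look is sketched below with Python's standard `xml.etree.ElementTree` module: each element declaration receives an `xsd:annotation` / `xsd:appinfo` child that lists the contexts in which it applies. This is only an illustration of the idea. The namespace `urn:example:context`, the `Geopolitical` element, the function `to_schema` and the component names are hypothetical, and the generated schema does not attempt to follow the UN/CEFACT NDR.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
CTX = "urn:example:context"  # hypothetical namespace for context annotations

ET.register_namespace("xsd", XSD)
ET.register_namespace("ctx", CTX)


def to_schema(document_name, components):
    """Build an XSD tree for one document.

    `components` is a list of (element name, XSD type, geopolitical contexts).
    """
    schema = ET.Element(f"{{{XSD}}}schema")
    root = ET.SubElement(schema, f"{{{XSD}}}element", name=document_name)
    complex_type = ET.SubElement(root, f"{{{XSD}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XSD}}}sequence")
    for name, xsd_type, contexts in components:
        element = ET.SubElement(
            sequence, f"{{{XSD}}}element", name=name, type=xsd_type, minOccurs="0"
        )
        annotation = ET.SubElement(element, f"{{{XSD}}}annotation")
        appinfo = ET.SubElement(annotation, f"{{{XSD}}}appinfo")
        for value in contexts:
            # One context entry per geopolitical value the element applies to.
            ET.SubElement(appinfo, f"{{{CTX}}}Geopolitical").text = value
    return schema


schema = to_schema(
    "Invoice",
    [
        ("IssueDate", "xsd:date", ["GR", "DE"]),
        ("TaxOfficeCode", "xsd:string", ["GR"]),
    ],
)
xml_text = ET.tostring(schema, encoding="unicode")
print(xml_text)
```

A mapping engine could read the `appinfo` entries to decide which elements of the generic schema are relevant for a given partner.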

The next sections in this paper provide further detail on the GENESIS data modelling approach, focusing on the visual assembly of business documents and their serialization to XML schema documents.

3.2 Visual Assembling of Contextualized Business Documents

In this section, we provide details about our approach to incorporate the business document modelling approach into the ADONIS® modelling environment. The establishment of a common repository of information modelling building blocks and an intuitive modelling approach that guides users along an unambiguous methodological path are the crucial factors for facilitating business information interoperability. The following paragraphs highlight the cornerstones of our newly developed method, which offers an integrated modelling of the relevant aspects in the area of business document modelling, comprising the elements and their relations to each other: the business document is the key element between processes, rules and information entities; the document is used by actors within activities of a business process; and business/legal rules may impact the use or the content of a document, which is composed of business information entities.

To support the graphical modelling of CCTS-based information entities, a new model type called "Business Information Model", which offers an intuitive yet comprehensive description of business documents, has been implemented in the modelling environment. The following three main artefacts were developed and used during business document modelling:

The Specific Business Document (SBD) summarizes information about a complete and user-specific document and is referenced in user-specific process models. SBDs include context information that specifies the business environment in which they can be used.

The Generic Business Document (GBD) can be considered a consolidated version of several user-specific documents and features all data items that occur in any of the affiliated SBDs. The idea behind GBDs is to create data templates that can be used by all clients and only need to be restricted according to the user context to match the respective user's business requirements exactly. GBDs are referenced from the harmonized collaboration process models. They also include contextual information to allow a (potentially) automated derivation of specific business documents.

Finally, Business Information Entities (BIEs) are the components used to assemble both SBDs and GBDs. As defined in the CCTS meta-model, Basic Business Information Entities (BBIEs), Aggregated Business Information Entities (ABIEs), and Association Business Information Entities (ASBIEs) are the reusable building blocks (we refer to [33] for detailed information on the CCTS meta-model). We extend these modelling elements by implementing the context principle, so that business document elements can also be contextualized at lower, more fine-grained levels.

Figure 4 visualizes the graphical representation of an ABIE: the Business Information Entities, their occurrence and information about their usage in different contexts can be specified in an intuitive manner. The documentation of BIEs is supported by using an alternative, spreadsheet-like feature to enter all CCTS-compliant information.

Modelling complex business documents can involve a large number of different information entities and can be a complex and time-consuming task, even for modelling experts. To facilitate the modelling of business documents, our environment provides a library of reusable business document building blocks that are based on the UBL common library. In this way, modellers can leverage the professional business expertise inherent in the UBL framework with regard to the structure and content of business documents. They can choose from existing data components (e.g. an address) and use them by referencing them in their actual model of a business document. If required, the data component templates can be further adapted to specific end-user requirements by removing superfluous data elements or by adding new ones.

During the creation of GBDs, the models of SBDs of the same document type (e.g. Order) are consolidated into one generic document by building the superset of the SBDs. During this step, contextual information is added to the GBDs in order to allow the derivation of the original user-specific SBDs and to record why specific data elements are part of the GBDs. This information is very useful for the later use of the business documents, especially during their electronic exchange between different end-users' systems: the effort for implementing interfaces and mappings for the connected systems can be reduced in terms of time and cost [30]. Figure 5 shows the use of contextual information and the assembly of a generic business document for a (very) simplified order document and one context category (country context). The Greek order document consists of a purchase order number, based on an "Identifier" CCTS Core Data Type, and a customer reference, modelled as a "Text" Core Data Type. For the Austrian order document, we assume that the order includes one additional data element, the issue date (on the basis of a "Date" Core Data Type), and that the customer reference is modelled using a "Code" Core Data Type. The generic business document includes all of these data elements together with context values that specify in which country each data element is used.
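The consolidation step described above can be illustrated with a minimal Python sketch. It is not the ADONIS implementation: the class `Element`, the functions `consolidate` and `derive`, and the element names are hypothetical, and the choice to keep two differently typed variants of the customer reference as separate entries is an assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One data element of a business document (a simplified BBIE)."""
    name: str
    data_type: str                               # CCTS Core Data Type
    countries: set = field(default_factory=set)  # country context values


def consolidate(sbds):
    """Build a generic document as the superset of several specific ones.

    `sbds` maps a country code to a {element name: core data type} dict.
    Elements sharing name and data type are merged and accumulate the
    country contexts in which they occur (assumption: variants with a
    different data type stay separate entries).
    """
    merged = {}
    for country, elements in sbds.items():
        for name, data_type in elements.items():
            key = (name, data_type)
            if key not in merged:
                merged[key] = Element(name, data_type)
            merged[key].countries.add(country)
    return list(merged.values())


def derive(gbd, country):
    """Recover the specific document for one country context."""
    return {e.name: e.data_type for e in gbd if country in e.countries}


# The simplified order documents from the text (element names are illustrative).
sbds = {
    "GR": {"PurchaseOrderNumber": "Identifier", "CustomerReference": "Text"},
    "AT": {"PurchaseOrderNumber": "Identifier", "CustomerReference": "Code",
           "IssueDate": "Date"},
}
gbd = consolidate(sbds)
for element in gbd:
    print(element.name, element.data_type, sorted(element.countries))
```

Because every element keeps the set of countries in which it occurs, `derive(gbd, "GR")` restricts the superset back to the original Greek document, which is the property the contextual information is meant to guarantee.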

With regard to the usage of UBL templates in the graphical modelling environment, our approach is subject to certain limitations. For business documents for which UBL does not provide any templates (mainly B2G and B2I business documents), the structured SBDs have to be created from scratch. As a consequence, the resulting SBDs might differ significantly in their structure, which makes the consolidation into GBDs more complex. However, once a first generation of GBDs exists for these kinds of business documents, users can use the GBDs as templates to model their SBDs and proceed exactly as in the case of UBL templates.

During the modelling activities of the GÉNESIS project, a comprehensive repository of specific and generic business documents was built that provides contextualized business documents for supporting different collaboration process types. In addition to the UBL library, which focuses mainly on B2B-related business documents, the GÉNESIS business document library also supports collaboration process types in the areas of B2G and B2I. For the e-Government related business documents, we decided, on the basis of the documents' prestige and frequency, to initially apply the GÉNESIS Data Modelling Methodology to the following documents: Periodic VAT Statement, Annual VAT Statement, INTRASTAT Statement - Arrivals, INTRASTAT Statement - Dispatches, Social Security Statement and Declaration of a new employee. The Aggregate Business Information Entities (ABIEs) identified in these documents were assembled into generic components, leading to a repository of e-Government components (see also [18]). A generic component may have been extracted from more than one ABIE where conflicts in the names or the context were detected (e.g. the component person represents the common denominator of the Periodic and Annual VAT Statement, the INTRASTAT Statement and the Social Security Statement).

A further demonstration of the modelling capabilities of our approach and a comprehensive example of a business document from the GÉNESIS e-Government repository can be found in [18], where a "real world" VAT Statement in the Greek context was modelled. Even though the overall document structure was created from scratch following the described modelling approach, certain elements of the business document could be reused from the existing template library, e.g. the party or the address data components. This again demonstrates the benefits of a core component-based modelling approach.

3.3 Context-dependent XML Schema Representation

Having elaborated on the visual assembly of business documents following the GÉNESIS data modelling methodology, we now present our approach to representing contextualized business documents (both specific and generic) at the XML schema level. The technical representation and application of the concepts described above is key to achieving technical interoperability between the end-users' heterogeneous systems. The developed XML schema business documents, which are based on standards such as the UN/CEFACT NDR and the UBL Common Library, provide a common syntax and common semantics that allow for an efficient and interoperable exchange of business documents.

On XML schema level, we also distinguish between generic and specific XML schema documents: Figure 6 depicts the overall relations between generic and specific XML schema documents.

The generic XML schema documents are an XML representation of the generic data models that were modelled and graphically represented in the ADONIS environment. They include a representation of the context information modelled in ADONIS, which is realized via XML annotations on all data components and business documents. The basic components for the message assembly are the UN/CEFACT Core Data Types (CDT) [40], which are referenced by the GÉNESIS reusable Aggregated Business Information Entities. These reusable components are in turn referenced by the main documents to assemble the overall structure of the generic business documents.

The specific XML schema documents can then be derived from the generic XML schema. Prototypical transformation scripts (based on XSL Transformations (XSLT)) were developed and successfully tested during the project work. A derived specific XML schema document represents a business document according to a user's specific context, e.g. an order document valid for the country context "Greece" and for the industry "Manufacturing". All XML documents have been created according to the UN/CEFACT NDR [36], including additional context information (via annotations).
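To make the derivation step more concrete, the following sketch filters a tiny generic schema by country context. It uses Python's standard `xml.etree.ElementTree` module rather than the XSLT scripts mentioned above, and the annotation layout (a `ctx:Country` element inside `xsd:appinfo`), the namespace `urn:example:context` and the function names are assumptions for illustration only, not the GÉNESIS annotation format.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
CTX = "urn:example:context"  # hypothetical namespace for context annotations
NS = {"xsd": XSD, "ctx": CTX}

# A tiny generic schema; each element is annotated with the countries it applies to.
GENERIC_SCHEMA = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ctx="urn:example:context">
  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <xsd:element name="PurchaseOrderNumber" type="xsd:string">
        <xsd:annotation><xsd:appinfo>
          <ctx:Country>GR</ctx:Country><ctx:Country>AT</ctx:Country>
        </xsd:appinfo></xsd:annotation>
      </xsd:element>
      <xsd:element name="IssueDate" type="xsd:date">
        <xsd:annotation><xsd:appinfo>
          <ctx:Country>AT</ctx:Country>
        </xsd:appinfo></xsd:annotation>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
"""


def derive_specific(schema_text, country):
    """Return a schema tree without the elements that do not apply to `country`."""
    root = ET.fromstring(schema_text)
    # Snapshot the nodes first so that removals do not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != f"{{{XSD}}}element":
                continue
            countries = {
                c.text
                for c in child.findall("xsd:annotation/xsd:appinfo/ctx:Country", NS)
            }
            # Elements without any country annotation are kept in every context.
            if countries and country not in countries:
                parent.remove(child)
    return root


def element_names(root):
    """List the names of all element declarations in document order."""
    return [e.get("name") for e in root.iter(f"{{{XSD}}}element")]


print(element_names(derive_specific(GENERIC_SCHEMA, "GR")))
print(element_names(derive_specific(GENERIC_SCHEMA, "AT")))
```

The sketch only restricts the generic schema; a real derivation would also have to adjust cardinalities and drop imports that are no longer needed.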

Further, the GÉNESIS XML schema business documents are characterized as follows:

GÉNESIS Namespace. XML Namespaces are a W3C standard for providing uniquely named elements and attributes in an XML instance. The namespace concept avoids naming conflicts that may occur because XML instances can contain elements or attributes with identical names from more than one XML vocabulary. It resolves the possible ambiguities between identically named elements or attributes by assigning a namespace to each vocabulary. The namespace concept is essential for realizing the component-based data modelling approach of CCTS and also the GÉNESIS meta-model for business information. XML namespaces enable the creation of so-called schema modules and the definition of reusable data building blocks. The GÉNESIS namespace convention for the creation of the GÉNESIS XML schema documents is closely aligned with the namespace concepts defined in the UN/CEFACT NDR.

In the following example for the GÉNESIS "Order" document, the different namespaces are defined for the later import of the referenced schema modules (see Table 1). We used the abbreviations for the namespaces as defined by the UN/CEFACT NDR. The only extension of the UN/CEFACT NDR standard is the namespace of the reusable ABIE (xmlns:ram), the XML schema library of the reusable data components created specifically for our project. This extension improves the readability and logical conciseness of the implementation of the GÉNESIS meta-model in XML schema.

Context Approach in GÉNESIS XML-schema. CCTS proposes the use of context categories and context values to tailor data models to specific user requirements. The graphical representations of contextual information of the generic and specific documents in the ADONIS environment were modelled according to this principle. At the XML schema level, we utilize the notation proposed in the UN/CEFACT NDR. We therefore defined the context information as a separate XML schema document, in which the context categories are defined as complex types. The possible values for the different context categories are again defined separately and leverage existing, standards-based value and code lists where possible (e.g. the UN/CEFACT code list "Country Code" from ISO, version 2004-09-14). Figure 7 shows the structure of the GÉNESIS context categories and the referenced code lists at the XML schema level:

The GÉNESIS context categories used during the project to categorize the data entities of the generic documents according to the users' requirements were the "geopolitical" and the "system capabilities" context categories. The geopolitical context category enables the modelling of data entities with respect to their country-specific characteristics. This context category is exactly the same as the one defined by UN/CEFACT CCTS. Possible values are defined by the UN/CEFACT code list "Country Code" of ISO, version 2004-09-14. The system capabilities context category was created to take into account system-specific characteristics of data entities and their representations in the enterprise transaction systems of different vendors. We established a first version of this context category with values that describe the vendors of the IT systems used by the GÉNESIS partners.
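The interplay between context categories and their code lists can be sketched in a few lines of Python. This is an illustrative model only: the class `BusinessContext`, the code list excerpts and the vendor values `VendorA` and `VendorB` are hypothetical, and the rule that an empty category means "no restriction" is an assumption rather than something the GÉNESIS schemas prescribe.

```python
from dataclasses import dataclass

# Illustrative excerpts only; real code lists are far longer.
CODE_LISTS = {
    "geopolitical": {"GR", "AT", "CY"},             # ISO country codes
    "system_capabilities": {"VendorA", "VendorB"},  # hypothetical vendor values
}


@dataclass(frozen=True)
class BusinessContext:
    """Context values attached to a data entity, one set per context category."""
    geopolitical: frozenset
    system_capabilities: frozenset = frozenset()

    def __post_init__(self):
        # Reject values that are not part of the referenced code list.
        for category, allowed in CODE_LISTS.items():
            unknown = getattr(self, category) - allowed
            if unknown:
                raise ValueError(f"{category}: unknown values {sorted(unknown)}")

    def applies_to(self, **request):
        """Check whether the entity is relevant for the requested context.

        Assumption: an empty category means the entity is not restricted
        with respect to that category.
        """
        for category, value in request.items():
            allowed = getattr(self, category)
            if allowed and value not in allowed:
                return False
        return True


greek_only = BusinessContext(geopolitical=frozenset({"GR"}))
print(greek_only.applies_to(geopolitical="GR"))
print(greek_only.applies_to(geopolitical="AT"))
```

Keeping the permitted values in separate code lists, as the text describes, means that a new country or vendor only requires a change to the list and not to the context category definition itself.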

4 GÉNESIS XML Schema Library

Based on the GÉNESIS Data Modelling Framework, we have modelled and created more than 35 business document types in different contexts (countries as well as backend systems) and for different business processes (B2B: Create Catalogue, Sourcing Buyer Initiated, Ordering, Billing; B2G: Periodic VAT Statement, Annual VAT Statement, Intrastat Statement, Social Security, Employee Contracting; B2I: Bank Payment). In contrast to UBL, which is limited to B2B business processes, and to the CCTS approach, which focuses solely on the actual core component principle but does not provide complete business documents, the GÉNESIS XML schema library represents a first implementation of a context-dependent core component-based approach covering B2B, B2G, as well as B2I business documents.

4.1 Overview

The GÉNESIS business documents use an XML representation to specify the syntactic and semantic structure of the business documents. However, to manage the envisioned evolving GÉNESIS platform, a dynamic structure of the XML schema library is required. As depicted in Figure 8, the GÉNESIS library is separated into three layers: main documents, reusable data building blocks and core data types pre-defined by UN/CEFACT:

Main Documents: The main documents of the first layer represent the different business document types (e.g. Quotation, Order, Invoice, Annual VAT Statement, Intrastat Form) and are assembled by referencing the necessary reusable Aggregated Business Information Entities (ABIE) and Core Data Types (CDT). Each business document is thus represented by an XML schema file which references reusable data building blocks by using the XML import functionality.

Reusable Data Building Blocks: More than 70 reusable building blocks have been identified during the extensive data modelling activities within the GÉNESIS project (e.g. Address, Address Line, Company, Contact, Document Reference, Employee, Financial Account, Financial Institution, Legal Total, Payment Term, Period, Person, Invoice Line, Signature, Social Security Authority, Taxation Period, VAT Code). These reusable data building blocks are stored in the second layer, and the ABIE data components in turn reference the required Core Data Types (CDT). As depicted in Figure 8, reusable data building blocks are reused by the main documents as well as by each other. For example, the Address is required in an Order document as well as in an Annual VAT Statement or in a Bank Transfer Form.

Core Data Types (CDT): Pre-defined by UN/CEFACT, the third and last layer forms the foundation of the GÉNESIS core component data modelling approach. The UN/CEFACT CDTs define the smallest, generic (without any semantic meaning) pieces of information in a GÉNESIS business document together with their relevant characteristics. In this way, UN/CEFACT has created an unambiguous basis that leads from atomic business information parts up to a complete business document according to the rules of CCTS. The GÉNESIS XML schema library is thus based on 21 standardized and well-established data types (Amount, Binary Object, Code, Date, Date Time, Duration, Graphic, Identifier, Indicator, Measure, Name, Numeric, Percent, Picture, Quantity, Rate, Ratio, Sound, Text, Time, Value, and Video) [32].

As indicated in Figure 8, the context-dependent representation of the generic business documents is applied by the main documents and the reusable data building blocks. The GÉNESIS XML schema library contains only generic XML schemas enriched with context dependent information which allows the derivation of specific business documents according to the relevant context.
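The three-layer structure can be pictured as an import graph between schema modules. The sketch below is a simplified illustration: the module names are taken from the text, but the concrete import relations in `IMPORTS` and the helper functions `required_modules` and `layer` are assumptions, not the actual content of the GÉNESIS library.

```python
# Each schema module lists the modules it imports. The names come from the
# text; the exact import relations are illustrative assumptions.
IMPORTS = {
    "AnnualVATStatement": ["TaxationPeriod", "Company", "Identifier"],
    "Order": ["Address", "Company", "Date"],
    "Company": ["Address", "Text"],
    "Address": ["AddressLine", "Text"],
    "AddressLine": ["Text"],
    "TaxationPeriod": ["Period"],
    "Period": ["Date"],
    "Identifier": [],
    "Date": [],
    "Text": [],
}


def required_modules(document):
    """Return every module reachable from `document` via imports."""
    seen, stack = set(), [document]
    while stack:
        for dependency in IMPORTS[stack.pop()]:
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return seen


def layer(module):
    """Classify a module by its position in the import graph."""
    if not IMPORTS[module]:
        return "core data type"            # imports nothing
    if any(module in deps for deps in IMPORTS.values()):
        return "reusable building block"   # imported by some other module
    return "main document"                 # imported by nobody


for name in ("Order", "Address", "Text"):
    print(name, "->", layer(name))
print(sorted(required_modules("Order") & required_modules("AnnualVATStatement")))
```

The intersection printed in the last line lists the modules shared by a B2B and a B2G document in this toy graph, which is the kind of reuse the layered library is designed to enable.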

4.2 Example

By means of an example B2G business document (Annual VAT Statement), this section illustrates the technical implementation of the GÉNESIS XML schema library. We chose a B2G document to demonstrate the synergy effects of an integrated library approach covering B2B, B2G, as well as B2I business processes. Reusable data building blocks of the B2B context (e.g. address or period) can be reused in the B2G context as well.

The XML code depicted in Figure 9 represents the Annual VAT Statement main document following the UN/CEFACT NDR [36]. The NDR define a set of rules for transforming CCTS-based artefacts into XML schemas and XML-based instances. To reference the reusable data building blocks and the core data types, the corresponding files are imported and the relevant namespaces are defined. We used the abbreviations for the namespaces as described above. The only extensions of the UN/CEFACT NDR standard are the namespace of the GÉNESIS reusable ABIE (xmlns:ram) and that of the GÉNESIS context categories (xmlns:cda), as indicated by ellipse number 1 in Figure 9. As described earlier, this improves the readability and logical conciseness of the implementation of the GÉNESIS meta-model in XML schema. In the case of the Association Business Information Entity ram:TaxationPeriod, for example, we refer to the GÉNESIS namespace of the reusable ABIE (ellipse number 2, Figure 9), which specifies the taxation period.

According to the GÉNESIS Data Modelling Framework, each BBIE (Year) and ASBIE (ram:TaxationPeriod) is extended with context information (cda:BusinessContext). In the case of the Annual VAT Statement, the GÉNESIS context categories allow differentiating between Greece and Cyprus without modelling the business document twice. In Cyprus, the Year element is not relevant, whereas it is mandatory in the Greek context (ellipse 3). The associated TaxationPeriod, on the other hand, is relevant in both countries (ellipse 4, Figure 9). By adding country contexts to a BBIE or ASBIE, the GÉNESIS library can be easily extended and is prepared for evolution over time. In addition, new elements can be added with their relevant context category without changing the structure of the existing business document.
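The effect of these context annotations on a concrete document instance can be sketched as follows. The table `USAGE`, the usage labels "mandatory" and "relevant", and the function `check` are hypothetical simplifications of the cda:BusinessContext annotations; in particular, the text does not state whether TaxationPeriod is mandatory, so it is only marked as relevant here.

```python
# element name -> {country code: usage}; a missing country means "not relevant".
# The labels are a simplification of the context annotations in the schema.
USAGE = {
    "Year": {"GR": "mandatory"},
    "TaxationPeriod": {"GR": "relevant", "CY": "relevant"},
}


def check(instance, country, usage=USAGE):
    """Return a list of problems of `instance` in the given country context."""
    problems = []
    for name, per_country in usage.items():
        rule = per_country.get(country)
        if rule == "mandatory" and name not in instance:
            problems.append(f"missing {name}")
        if rule is None and name in instance:
            problems.append(f"unexpected {name}")
    return problems


statement = {"Year": "2007", "TaxationPeriod": {"Month": "12"}}
print(check(statement, "GR"))
print(check(statement, "CY"))

# Supporting a further country only adds context values; the document
# structure stays untouched (AT is used here purely as an example).
extended = {name: dict(rules) for name, rules in USAGE.items()}
extended["Year"]["AT"] = "mandatory"
print(check({"TaxationPeriod": {"Month": "12"}}, "AT", extended))
```

The last step mirrors the extensibility argument made above: a new country context is expressed as additional entries in the usage table, while the set of elements and their nesting remain unchanged.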

The referenced TaxationPeriod of the Annual VAT Statement is based on the PeriodType complex type specified in the GÉNESIS Reusable Data Building Blocks (ABIE). As depicted in Figure 10, the GÉNESIS reusable ABIEs are structured in the same way as the main documents. In this sense, the Period consists of different BBIEs such as EndDate and Month. To specify the data type of the EndDate element, for example, we refer to the CCTS Core Data Type udt:DateType (ellipse number 5 of Figure 10).

5 Conclusions

Motivated by the increasing need to achieve semantic interoperability among SMEs, governmental bodies and banking institutions, this paper has presented an integrated data modelling approach that links processes and documents and at the same time takes into account the diverse stakeholders' requirements. The proposed framework builds upon existing standards, such as the Universal Business Language and the UN/CEFACT CCTS, which were selected on the basis of an evaluation of the state of the art. The concept of contextualized generic business data models, which are modelled as a superset of the respective specific business documents, has also been introduced; it allows for the derivation of specific data models according to the business context in which the document is used. The framework comprises not only graphical representations of the business document models, but also their technical representation as XML schema documents. As a result, a comprehensive business document repository has been developed that comprises 35 generic business document types, covering B2B, B2G, and B2I documents, which were assembled out of more than 100 specific business documents. Furthermore, a comprehensive library of reusable business information entities has been developed that makes it easy to model new business documents by leveraging existing business knowledge, ensuring an efficient and sustainable growth of the GÉNESIS business document library.

Future steps of our work include the integration of additional business processes with their corresponding business documents as well as the coverage of additional countries and other system vendors. Currently, a prototypical instance of the GÉNESIS platform is under development on which the modelled collaboration processes and business documents will be tested during pilot operation.


This paper has been created in close connection with research activities during the EU-funded project GÉNESIS (Contract Number FP6-027867).


[1] C. C. Albrecht, D. L. Dean, and J. V. Hansen, Marketplace and technology standards for B2B e-commerce: progress, challenges, and the state of the art, Information & Management, vol. 42, pp. 865-875, 2005.

[2] V. Anaya, A. Ortiz, How enterprises can support integration, in Proceedings of the First International Workshop on Interoperability of Heterogeneous Information Systems, 2005.

[3] M. Chen, Factors affecting the adoption and diffusion of XML and Web services standards for E-business systems, Int. J. Human-Computer Studies, vol. 58, pp. 259-279, 2003.

[4] O. Christ, C. Schroth, and T. Janner, A Hybrid Framework for Automated and Adapative e-Business Platforms, in Proceedings of the 15th European Conference on Information Systems, 2007.

[5] cXML Version 2.0.17, [Online], Available:

[6] eBIS-XML Suite Version 3.09, [Online], Available: http://www.basda.org/VD65/default.asp?PSID=51.

[7] M. Fricke, K. Goetz, T. Renner, A. Polz, Studie e-business barometer 2006/2007 [Online], Available:

[8] E. Gamma, R. Helm, R. Johnson, J. Vlissides, Design Patterns. 2nd ed. Addison-Wesley, Reading et al., 1994.

[9] GÉNESIS Deliverable D3.1: Analysis of the Data Modelling State of the Art, June 2006.

[10] GÉNESIS FP6 Project (2008). [Online], Available:

[11] G. Gionis, Y. Charalabidis, T. Janner, C. Schroth, S. Koussouris, D. Askounis, Enabling Cross-Organizational Interoperability: A Hybrid e-Business Architecture, in Proceedings of the 3rd International Conference on Interoperability for Enterprise Software and Applications (I-ESA 2007) and: Enterprise Interoperability II. New Challenges and Approaches (R. J. Goncalves, J. Müller, K. Mertins, M. Zelm, Editors), Springer, 2007.

[12] V. Hoyer, T. Janner, P. Mayer, M. Raus, and C. Schroth, Small and Medium Enterprise's Benefits of Next Generation e-Business Platforms, The Business Review, Cambridge, 2006.

[13] IDABC, European Interoperability Framework for pan-European e-Government Services, Version 1.0, [Online], Available:

[14] T. Janner, A. Schmidt, C. Schroth, and G. Stuhec, From EDI to UN/CEFACT: An evolutionary path towards a next generation e-business framework, in Proceedings of The 5th International Conference on e-Business 2006 (NCEB2006), Bangkok, 2006.

[15] S. Jauhiainen, O. Lehtonen, P.-P. Ranta-aho, and N. Rogemond, B2B Integration - past, present, and future, [Online], Available:

[16] H. Kühn, F. Bayer, S. Junginger, D. Karagiannis, Enterprise Model Integration, in Proceedings of the 4th International Conference EC-Web 2003, Dexa 2003, Prague, Czech Republic, September 2003.

[17] B. Medjahed, B. Benatallah, A. Bouguettaya, A. H. H. Ngu, and A. K. Elmagarmid, Business-to-business interactions: issues and enabling technologies, The VLDB Journal, vol. 12, pp. 59-85, 2003.

[18] S. Mouzakitis, F. Lampathaki, C. Schroth, U. Scheper, T. Janner, Towards a common repository for governmental data: A modelling framework and real world application, in Proceedings of the 3rd International Conference on Interoperability for Enterprise Software and Applications (I-ESA 2007) and: Enterprise Interoperability II. New Challenges and Approaches (R. J. Goncalves, J. Müller, K. Mertins, M. Zelm, Editors), Springer, 2007.

[19] J.A. Mykkanen, M.P. Tuomainen, An evaluation and selection framework for interoperability standards, Information and Software Technology, doi:10.1016/j.infsof.2006.12.001, 2007.

[20] J.-M. Nurmilaakso, P. Kotinurmi, and H. Laesvuori, XML-based e-business frameworks and standardization, Computer Standards & Interfaces, vol. 28, pp. 585-599, 2006.

[21] OAGIS Version 9.1, [Online], Available: http://openapplications.org/oagis/9.1/index.html.

[22] OASIS, Universal Business Language (UBL) Version 2.0, Standard December 2006, [Online], Available: http://docs.oasis-open.org/ubl/

[23] Qualipso, Deliverable 3.2.1b Semantic Interoperability, [Online], Available:

[24] B.F. Schmid, C. Schroth, and T. Janner, A Hybrid Architecture for Highly Adaptive and Automated e-Business Platforms, in Proceedings of the 2007 IEEE International Conference on Services Computing (SCC 2007), IEEE Computer Society, 2007.

[25] C. Schroth, G. Gionis, Y. Charalabidis, T. Janner, S. Koussouris, and D. Askounis, A Hybrid Architecture for Enabling Electronic Transactions Among Enterprises and Governmental Bodies, in Proceedings of the 6th International Conference on Practical Aspects of Knowledge Management PAKM, 2006.

[26] C. Schroth, G. Pemptroad, T. Janner, CCTS-based Business Information Modelling for Increasing Cross-Organizational Interoperability, in Proceedings of the 3rd International Conference on Interoperability for Enterprise Software and Applications (I-ESA 2007) and: Enterprise Interoperability II. New Challenges and Approaches (R. J. Goncalves, J. Müller, K. Mertins, M. Zelm, Editors), Springer, 2007.

[27] C. Schroth, T. Janner, and G. Stuhec, UN/CEFACT Service-Oriented Architecture: Enabling Both Semantic And Application Interoperability, in Proceedings of the symposium "Communication in Distributed Systems" (KiVS), Workshop: Service-Oriented Architectures und Service-Oriented Computing, VDE Verlag, 2007.

[28] C. Schroth, T. Janner, A. Stage, and P. Mayer, A Holistic Architecture for Collaborative and Highly Automatized e-Business Platforms, in Proceedings of the Second International Workshop On Services Engineering, pp. 355-362, IEEE Computer Society, 2007.

[29] G. C. Simsion, G. C. Witt, Data Modelling Essentials, Third Edition, Morgan Kaufmann Publications, Elsevier, 2005.

[30] G. Stuhec, How to solve the Business Standards Dilemma - the Context Driven Business Exchange, SAP Developer Network, 2005.

[31] The Yankee Group Report, Interoperability Emerges as New Core Competency for Enterprise Architects, [Online], Available:

[32] UN/CEFACT Core Component Library (UN/CCL), version 1.0, [Online], Available: index.htm.

[33] UN/CEFACT Core Components Technical Specification, Part 8 of the ebXML Framework, Version 2.01 (November 2003), [Online], Available:

[34] UN/CEFACT:Cross Industry electronic Invoice, [Online], Available:

[35] UN/CEFACT:Modeling methodology (UMM). [Online] Available:

[36] UN/CEFACT XML Naming and Design Rules, Version 2.0 (February 2006), http://www.unece.org/cefact/xml/XML-Naming-and-Design-Rules-V2.0.pdf.

[37] United Nations Directories for Electronic Data Interchange for Administration. (UN/EDIFACT), [Online], Available:

[38] XBRL Version 2.1, [Online], Available:

[39] xCBL Version 4.0, [Online], Available:

[40] XML schema for UN/CEFACT Core Data Types, [Online], Available: downloads/www.unece.orq schemas

[41] XML schema Part 0: Primer (Second Edition), W3C Recommendation 2004, [Online], Available: http://www.w3.org/TR/xmlschema-0/.

[42] XML schema Part 1: Structures (Second Edition), W3C Recommendation 2004, [Online], Available: http://www.w3.org/TR/xmlschema-1/.

[43] XML schema Part 2: Datatypes (Second Edition), W3C Recommendation 2004, [Online], Available: http://www.w3.org/TR/xmlschema-2/.

Received 30 April 2008; received in revised form 6 September 2008; accepted 6 October 2008.

