SOA with JCAPS at EGL: Architecture Principles, Guidelines and Standards, Version 1.1, 9/17/2007



SOA with JCAPS at EGL
Architecture Principles, Guidelines and Standards
Version 1.1
9/17/2007

Version Description

Author | Date Created | Last Update | Version
Xi Song | March 30, 2007 | | 1.0
Oziel Hinojosa | | August 29, 2007 | 1.1

Description

1.0 Initially created by Xi Song per Oziel's request.

1.1 Revisions

* Modified the document for use by HCL developers, per OTM lead request.

* Added event-driven principles.

* Added additional SOA guidelines.

* Added the following standards: Development, Coding Conventions, Exception Management, Code Management, Code Commenting, Code Review, Unit Testing, System Integration Testing, XML, and JMS Settings.

Sign-off Review

Resource | Reviewer | Date | Version | Signature

Functional Review

Technical Review

QA Manager

Developer

Project Manager(s)

Distribution

Organization | Name | Date | Version

TABLE OF CONTENTS

Version Description
Sign-off Review
1 Introduction
1.1 Purpose of the Document
1.2 Intended Audience
1.3 Related Documentation
2 Architecture Principles
2.1 Service Oriented Architecture
2.1.1 Service Modularity and Loose Coupling
2.1.2 Service Security
2.1.3 Service Granularity
2.1.4 Exception Handling in Services
2.2 Event Driven Architecture
3 Guidelines
3.1 Layers of Services
3.2 Services are Discoverable and Dynamically Bound
3.3 Services have Network-Addressable Interfaces
3.4 Service Exposure
3.5 Use of SOA Patterns
3.6 Security Approaches
3.7 JCAPS Environments
4 Standards
4.1 JCAPS Project Structure
4.1.1 Non-ESB Project Structure
4.1.2 ESB Project Structure
4.2 Canonical Message Envelope
4.3 Common Services Framework (CSF) Customization
4.4 Development Standards
4.4.1 EAI Team Development Process
4.4.2 eGate Development
4.4.3 eInsight Development
4.5 Coding Conventions
4.5.1 JCAPS
4.5.2 Java Code
4.6 Exception Management Standards
4.6.1 Exception Levels
4.6.2 Exception Notification Channels
4.6.2.1 Alerts Agent
4.6.2.2 CSF
4.6.3 Logging Guidelines
4.7 Code Management Standards
4.7.1 Version Control
4.7.2 Migration Process
4.7.3 Backup Process
4.8 Code Commenting Standards
4.9 Code Review Standards
4.10 Unit Testing Standards
4.11 System Integration Testing Standards
4.12 XML Standards
4.13 JMS Settings
4.13.1 Message Properties
4.13.2 Client Properties
4.13.3 IQ Manager Runtime Configurations
4.13.3.1 Stable Storage
4.13.3.2 Journaling and Expiration
4.13.3.3 Throttling Properties
4.13.3.4 Special FIFO Mode Properties
4.13.3.5 Time Dependencies
4.13.3.6 Security
4.13.3.7 Diagnostics Page
4.13.3.8 Miscellaneous Page
4.13.3.9 Backing Up Topics and Queues

1 Introduction

This document outlines the architecture principles, guidelines and standards to be followed in future service oriented architecture (SOA) designs at Eagle Global Logistics (EGL), using the Sun Microsystems Java Composite Application Suite (JCAPS) as the development and integration framework. This set of principles, guidelines and standards is based on the SOA best practices provided by Sun and is customized for EGL to accommodate its system and network infrastructure, organizational structure and business characteristics.

1.1 Purpose of the Document

The purpose of this document is to serve as a starting place for SOA architects and developers at EGL. By becoming familiar with the various design principles covered in this document and following the guidelines and standards listed, an architect can design services and systems that help shape and drive the SOA strategy at the enterprise level. Developers can use this document to make sure that the services they develop comply with the enterprise guidelines.

1.2 Intended Audience

The document is designed to guide EGL architects and developers as they work on future SOA projects using JCAPS. The reader is advised to become familiar with the principles, guidelines and standards described in this document and apply them to the design of JCAPS applications following SOA.

1.3 Related Documentation

Throughout this document, the reader is referred to the Implementation Guidelines documents found in RQ 3.1 and other documents for a more comprehensive discussion of the principles, guidelines and standards covered here. For details on the Java CAPS suite of products, refer to the User Guide of each product.

List 1.0 Document Location

Document Category | Location | Annotation
RQ - Sun Microsystems Repeatable Quality Integration Process | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\RQ\SOA RQ3.1\home.htm | Open the home.htm page to view the RQ via a web-based tool.
 | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\RQ\SOA RQ3.1\BestPractices | Best Practices documents
 | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\RQ\SOA RQ3.1\flash\implementation_guidelines\IG SOA Strategy.pdf | Base SOA architecture document.
EAI Team Development Process | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\EAITeamArchitecture.vsd | Open the EAIDevelopmentProcess.vsd file to view the development process flow.
TBC - Environment Security Document | | Mark Collins to develop
TBC - Production Deployment Manual | |
Naming Conventions | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\EAITeamArchitecture\EGL Docs - Integration with JCAPS - Naming Conventions.doc |
TBC - Alert Configurations | | Mark Collins to develop
Sun Fastrack Best Practices | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\EAITeamArchitecture\Mentoring\Java CAPS Core Components Security.ppt |

2 Architecture Principles

For business and IT decision makers, technical solutions must support event-based messaging between applications and application tiers using services-based integration and industry-standards-based deliverables. By establishing these architectures, EGL will provide its customers, business users, and organization the most flexible, scalable, and integrated architecture possible.

2.1 Service Oriented Architecture

Having a service oriented architecture (SOA) implies pursuing business and technical strategies that promote the exposure of business functionality within and between enterprises in a consistent, published, secure and contractual way. The related document, IG SOA Strategy, describes and explains 17 SOA principles:

I. Modularity

II. Interoperability

III. Loose coupling

IV. Support of orchestration and composability

V. Discoverability and dynamic binding

VI. Location transparency

VII. Security

VIII. Self Healing

IX. Versioning

X. Lease

XI. Network-addressability

XII. Coarse-grained interface

XIII. Metering

XIV. Monitoring and Control

XV. Exception handling

XVI. Separation of interface from implementation

XVII. Published quality of service

Many of these characteristics are automatically provided when using JCAPS as the development platform. Others are design decisions of which the architect must be aware. This section of the document reviews the SOA characteristics that are most relevant to EGL's business environment. Designing services that possess these characteristics therefore becomes the guiding principle for EGL's SOA architects.

2.1.1 Service Modularity and Loose Coupling

Two of the most important aspects of SOA are modularity and loose coupling. With SOA, an application can be decomposed into a set of smaller modules, each responsible for a single, distinct function within the application. Conversely, a modular service can be used in multiple applications to fulfill the same functionality. When a service is modularized it maps directly to a distinct problem-domain function, is easy to understand, and has a limited and predictable impact on other services. Service modularity should be the first and foremost design principle in EGL's SOA effort.

In addition to modularity, services should be as loosely coupled as possible. In a loosely coupled system, a service consumer has only a few well known dependencies on the services it consumes, removing it from the need to know unnecessary details of the service providers.

In the JCAPS environment, modularity and loose coupling mean a proper project structure and well-considered component placement. For example, common components should be placed in common project folders, and reference by name should be used to avoid artificial project dependencies.

2.1.2 Service Security

EGL has made it one of its design principles to secure both the JCAPS framework and the applications developed with it. This includes centralizing user authentication using the enterprise's LDAP server, limiting access to JCAPS components using access control lists (ACLs), and securing all JCAPS-developed applications, including web services and web applications. It is also the EAI team's intention to enable single sign-on for all web applications developed using JCAPS and to secure all web services using Sun Java System Access Manager. Java CAPS Core Components Security.ppt provides a high-level overview of the security features in JCAPS development.

2.1.3 Service Granularity

The granularity of a service is a crucial design decision. If service granularity is not properly planned, service consumers could have access to more functionality than they actually need. This can result in many problems, such as security concerns, increased network traffic, etc. The EAI team has decided to take a layered approach: services will be created at different levels with different granularities. When necessary, a coarse-grained service may be created using multiple fine-grained services. This approach adds some complexity to the system of services, but it allows better control over service security, performance, and functionality.

2.1.4 Exception Handling in Services

Exception handling in services should be a high priority. It promotes self healing of services and helps minimize the impact of a failed service. All services are expected to take advantage of the Alerting, Logging and Error Handling common services. This set of common services is provided by the Sun SeeBeyond Common Services Framework (CSF) and will be customized to suit EGL's exception handling strategies.
In addition, exception handling should be a required section in any service design document, and developers should make it a standard practice to review the error handling logic of the services they develop.

2.2 Event Driven Architecture

An event driven architecture (EDA) is an approach to system integration that calls for software applications and hardware devices to produce, consume, detect, and react to events, producing greater responsiveness in a system and allowing applications to react intelligently to changes in conditions. An event is a significant change in state. For example, when an item is picked up by EGL from the shipper's facility, the item's state changes from "ready for pickup" to "picked up". An EGL system may treat this state change as an event to be produced, consumed, detected, or reacted to.

This architectural approach is to be used when designing applications that need to perform activities in real time, activate long-running services, participate in long-running asynchronous business processes, spawn multiple processes or multiple services, or build a messaging system.

Three processing styles of EDA exist: Simple, Event Stream, and Complex.

Simple Event Processing

Simple event processing concerns events that are directly related to specific, measurable changes of condition. In simple event processing, a notable event happens which initiates downstream action(s). Simple event processing is commonly used to drive the real-time flow of work, thereby reducing lag time and cost. For example, simple events can be created by a sensor detecting changes in tire pressure or ambient temperature.
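As a sketch of simple event processing, the class below screens sensor readings against a threshold and publishes only notable changes of condition as events. All names are hypothetical, and the in-memory handler list stands in for a real notification channel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: each reading is a potential event; only notable
// ones (below the threshold) initiate the downstream action(s).
class TirePressureMonitor {
    private final double minPsi;
    private final List<Consumer<String>> handlers = new ArrayList<>();

    TirePressureMonitor(double minPsi) { this.minPsi = minPsi; }

    // Downstream actions subscribe here (stand-in for a real channel).
    void onEvent(Consumer<String> handler) { handlers.add(handler); }

    void reading(double psi) {
        if (psi < minPsi) {
            for (Consumer<String> h : handlers) {
                h.accept("LOW_PRESSURE:" + psi);
            }
        }
    }
}
```

Ordinary readings produce no event at all; only the state change crossing the threshold is published, which is the defining trait of the simple style.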

Event Stream Processing

In Event Stream Processing (ESP), both ordinary and notable events happen. Ordinary events (orders, RFID transmissions) are screened for notability and streamed to information subscribers. Event stream processing is commonly used to drive the real-time flow of information in and around the enterprise, which enables in-time decision making.

Complex Event Processing

Complex event processing (CEP) allows patterns of simple and ordinary events to be considered to infer that a complex event has occurred. Complex event processing evaluates a confluence of events and then takes action. The events (notable or ordinary) may cross event types and occur over a long period of time. The event correlation may be causal, temporal, or spatial. CEP requires the employment of sophisticated event interpreters, event pattern definition and matching, and correlation techniques. CEP is commonly used to detect and respond to business anomalies, threats, and opportunities.

Principles

Responsive

XML Standards Based

Canonical

3 Guidelines

This section lists guidelines for architects on how to follow some of the principles, including those described in section two.

3.1 Layers of Services

At EGL, a layered approach is taken so that services are created with the proper granularity. At a high level, there are two categories of services: business services and technical services. Technical services are self-contained modules that provide low-level functionality, whereas business services are relatively coarse-grained modules fulfilling a business function. In each category, multiple layers of services may exist, such that some higher-level services group multiple fine-grained services to provide a service with coarse-grained interfaces for the purpose of finer control of functionality and security. In general, the following guidelines can be used when creating services in each category:

1. Use JCDs for low-level services, especially those in the technical service group. Conversely, use business processes only for high-level services.

2. When (and only when) it is not possible to predict the service consumer requirements, take a layered approach: create multiple fine-grained services and group them in layers to provide coarse-grained service. Keep in mind that such service composition does have an impact on service performance.

3. When creating a higher-level service using a group of fine-grained services, design the composite service in such a way that it requires as little knowledge of the component services as possible. Leave the binding of any service until runtime so that any component service can be replaced without affecting the composite service.

3.2 Services are Discoverable and Dynamically Bound

Services should be published to a registry so that service consumers can dynamically locate and bind themselves to the service. This will promote a more loosely coupled environment and allow service consumers to select the most appropriate service at runtime.

1. Business processes shall publish their WSDL files to the UDDI registry.

2. Composed services shall publish their WSDL files to the UDDI registry.

a. Finer-grained services that act on technical requests shall be statically created, e.g. OTDs.

3. Service consumers shall locate service providers through the registry.

4. Service consumers shall not be aware of the service provider location.

5. Service consumers shall dynamically create the requests.

6. Service consumers shall dynamically bind to responses.
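The discoverability guidelines above can be sketched in plain Java. The Map-backed registry below is a hypothetical stand-in for a real UDDI lookup, and all names are illustrative; the point is that the consumer binds by logical name at runtime and never holds the provider's location.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical service contract (stands in for a WSDL-described service).
interface ShipmentService {
    String fileInstructions(String shipmentId);
}

// Stand-in for a UDDI registry: providers publish, consumers locate.
class ServiceRegistry {
    private final Map<String, ShipmentService> services = new HashMap<>();

    void publish(String name, ShipmentService impl) {
        services.put(name, impl);
    }

    // The consumer binds at runtime; the provider can be replaced
    // without touching consumer code.
    ShipmentService locate(String name) {
        ShipmentService s = services.get(name);
        if (s == null) throw new IllegalStateException("no provider for " + name);
        return s;
    }
}
```

Because the consumer only knows the logical name, swapping the provider behind that name (guideline 3 in section 3.1) requires no consumer change.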

3.3 Services have Network-Addressable Interfaces

A service must have a network-addressable interface: a consumer must be able to invoke a service across the network. The service shall be location independent.

1. JCAPS supports multiple communication protocols.

2. The exception to location independence is communication between the ESB JMS components.

3.4 Service Exposure

In order to promote service reusability, the same service can be exposed via multiple protocols. The additional exposure often takes the form of an interface adaptor. However, some services are wrapped into a JCD providing an easy way for the service to be invoked in a business process. Different options exist for exposing the same service via multiple protocols:

1. Develop the service and expose it as a JMS service. Create another JCD or business process with a web service interface that invokes the first service.

2. Develop the service and expose it as a web service. Add another service (adaptor) that is exposed via a JMS interface but invokes the first service.

When both options are available, the first option has the added advantage of higher performance when the service is invoked by a JMS service consumer.
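The shape of option 1 can be sketched with plain Java classes standing in for the JCAPS bindings. All names here are hypothetical: the core service represents the JMS-exposed JCD, and the adaptor represents the second JCD or business process that exposes the same logic as a web service.

```java
// Hypothetical core service, written once (the "JMS" exposure).
class RateQuoteService {
    String quote(String lane) {
        return "QUOTE:" + lane + ":100.00";
    }
}

// Hypothetical adaptor for the second protocol (the "web service"
// exposure). It adds nothing but delegation, so both exposures share
// one implementation.
class RateQuoteWebServiceAdaptor {
    private final RateQuoteService delegate;

    RateQuoteWebServiceAdaptor(RateQuoteService delegate) {
        this.delegate = delegate;
    }

    String handleSoapRequest(String lane) {
        return delegate.quote(lane);
    }
}
```

A JMS consumer calls the core service directly and pays no adaptor cost, which is the performance advantage noted above; only web-service consumers go through the extra hop.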

3.5 Use of SOA Patterns

When properly used, SOA patterns help promote reuse of code and design ideas, facilitate communication, and simplify code maintenance. The EAI team has decided to take advantage of SOA patterns and has started compiling a library of SOA patterns as references. It is also expected that the team will make it a standard practice to identify new patterns and add them to the library together with optimal solutions. The following guidelines can be used by architects and developers when working with SOA patterns:

1. Familiarize yourself with existing patterns to understand the patterns that are already built into JCAPS.

2. When wondering how to do something while developing a service, ask whether someone else has asked a similar question in a different project. In other words, ask whether the problem at hand falls into a known pattern and therefore already has a known solution.

3. Use the Sun CSF projects as examples.

4. Communicate to the entire team when a pattern is identified and an optimal solution adopted.

3.6 Security Approaches

The following approaches will be taken to secure the JCAPS systems and the applications developed with it:

1. Core components: refer to the environment security document for approaches to core component security.

2. Web services: web services will be secured with the following features supported by JCAPS:

a. User authentication

b. WS security

c. Transport layer security: SSL

3. Use of Sun Java System Access Manager: the AM will be used for single sign-on for web applications and for additional security features for web services. It should be noted that additional custom development is required for these features.

3.7 JCAPS Environments

Every developer should maintain their own JCAPS environment in eDesigner. When the following guidelines are followed, developers will be able to unit test their development with minimum interference:

1. Each developer maintains a JCAPS environment in eDesigner. Within this environment, a logicalhost should be mapped to a runtime logicalhost domain which is also maintained by the developer. The advantage of each developer maintaining his/her own logicalhost is that only the services of his/her interest need to be deployed to the logicalhost domain so that it can be run much more efficiently than if all developers share the same logicalhost for unit test.

2. If hardware resources allow, each developer should install and control their own Enterprise Manager and use it to manage their own logicalhost. The advantage, again, is ease of control and minimum interference.

3. The development test environment is coordinated by the team lead or the JCAPS administrator. A developer deploys his/her service to the test environment when and only when the service has passed unit testing. The end result is that the development test environment only hosts components that are in a working state, so that continuous system integration testing can be conducted.

4. The QA test environment and the production environment should be strictly controlled by the JCAPS administrator, who is solely responsible for any code promotion and service deployment into these environments. However, developers should provide the administrator detailed environment configuration information as well as service implementation notes, including JCAPS project dependencies, to help the administrator carry out these tasks. Refer to the production deployment procedure manual for additional information.

4 Standards

This section covers some of the standards to be followed in the design, implementation, and testing of future JCAPS projects at EGL.

4.1 JCAPS Project Structure

The EAI team has established a JCAPS project folder structure to facilitate service modularity and decoupling. Projects for non-ESB services are grouped by application, such that each application contains projects of business services and technical services. This grouping helps keep the project tree from growing too wide. It does not necessarily create a project dependency.

Projects for ESB services are grouped individually, such that there is no demarcation between business services and technical services. This helps limit deployment complexities. Naming conventions will help distinguish between the types of services.

Common components used across the enterprise will have project folders named Common. For example, the OTDs representing the canonical message structure will be placed under a project named prjEnterpriseCommon. Another example of common objects is the CSF libraries.

Within this common project, all OTDs composing the canonical message structures are created and stored under another project folder named OTD. Projects that depend on these OTDs will reference the files rather than copying the OTD files to the local project structure. This approach creates a tightly coupled dependency between the enterprise objects and the more specialized services. This tight coupling will enforce that one of two things occurs: either the specialized services upgrade when there is a change to the enterprise objects and maintain backward compatibility, or the specialized services employ versioning so as to use older versions of the enterprise objects. Special care must be taken with the latter option.

4.1.1 Non-ESB Project Structure

Non-ESB services can be defined as components that provide application services that are either specific to the application or are enterprise business services embedded within the application.

Below is a sample project structure, prjClassMX, for non-ESB services.

4.1.2 ESB Project Structure

An ESB service can be defined as a component that provides an enterprise business service external to any legacy application. The service can be business or technical in nature.

Below is a sample project structure, prjTranslateShipmentToEDI304, for an ESB service.

The above project forms a technical service. This technical service will translate the enterprise shipment event structure into an EDI document. This technical service would be aggregated into a higher-level service known as File Shipping Instructions. The objects in the project would be independent of any other projects except for the common enterprise projects.

4.2 Canonical Message Envelope

The EAI team has designed an enterprise-wide canonical message structure for SOA services. As shown below, this message envelope contains many useful fields that help facilitate message tracking, performance logging, and status reporting. The message envelope also alleviates the need to parse the payload message on each service consumer, allowing for better exception reporting when an un-marshalling exception is encountered with the data in the DataArea element. This technique also allows for improved performance in components that perform simple pass-through operations, such as routing and filtering components.

The standard way for a service to use this structure is as follows:

1. Use the DataArea node for message payload and the ApplicationArea node for message enveloping.

2. An initiating service (the first service in a transaction to use this message envelope) sets the source information in the message envelope. It also sets information in the message header.

3. Each service, after receiving a message with the canonical message envelope, should not alter the envelope in any way other than adding an instance to the Service node and setting the overall status (the Status node outside the Service node) on return. This is detailed below.

4. When processing a message, a service should add one and only one instance to the Service node, setting the service name and host name (where the service is running). The start time and the stop time should be set at the appropriate times to record the time spent processing the message.

5. One or more statuses can be reported for each service. If statuses for multiple steps are to be reported, they should be added to the Service/Status node in the order the steps are taken.

6. On return, the overall status should be set. This status is checked by the service consumer to determine the processing result of the service.

7. Service request and response data are passed via the DataArea node as message payload.
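A minimal sketch of the envelope rules above, using hypothetical Java classes rather than the actual OTDs generated from the canonical schema (field and class names are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Service-node instance (step 4): one per processing service.
class ServiceEntry {
    String serviceName, hostName, status;
    long startTime, stopTime;
}

// Hypothetical envelope: source set once by the initiator (step 2),
// payload carried in the DataArea (steps 1 and 7).
class CanonicalEnvelope {
    String source;
    String overallStatus;          // set on return, checked by the consumer
    String dataArea;               // message payload
    final List<ServiceEntry> services = new ArrayList<>();

    // A service adds exactly one entry and must not otherwise alter
    // the envelope (steps 3 and 4).
    ServiceEntry beginService(String name, String host) {
        ServiceEntry e = new ServiceEntry();
        e.serviceName = name;
        e.hostName = host;
        e.startTime = System.currentTimeMillis();
        services.add(e);
        return e;
    }

    void endService(ServiceEntry e, String status) {
        e.stopTime = System.currentTimeMillis();
        e.status = status;
        overallStatus = status;    // step 6: overall status set on return
    }
}
```

Because every hop appends its own timed entry, the envelope accumulates an end-to-end trace usable for message tracking and performance logging without parsing the payload.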

4.3 Common Services Framework (CSF) Customization

To facilitate consistent use of CSF, the EAI team has developed a CSF client package that customizes the CSF client API provided by Sun.

The customizations include:

1. Standardization on a set of ALE codes and publication of a guideline on the use of these codes.

2. Selection of the appropriate CSF Client APIs or convenience APIs to use.

3. Implementation of a set of customized ALE APIs for use within JCAPS and by stand-alone Java clients. This layer customizes the ALE reporting logic in a centralized module so that all services use the ALE services in a consistent way.

Each developer will use only the customized APIs described in step 3 above. The end result for developers is a set of much simpler APIs and consistent Alert, Logging, and Error (ALE) reporting across the enterprise. The values used to report exceptions are described in the Exception Management Standards section of this document.
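What such a customized ALE client layer might look like can be sketched as follows. CsfException, AleLogger, the JCD class, and the error code are all hypothetical stand-ins for the real CSF client classes in eaiUtil.jar; the behavior shown (a failed CSF interaction is rethrown rather than swallowed, so the caller's transaction rolls back) reflects the standard described in this document, not the actual API.

```java
// Hypothetical stand-in for the real CSFException.
class CsfException extends Exception {
    CsfException(String msg) { super(msg); }
}

// Hypothetical customized ALE wrapper: one centralized place that all
// services call so reporting stays consistent.
class AleLogger {
    private final boolean csfAvailable;

    AleLogger(boolean csfAvailable) { this.csfAvailable = csfAvailable; }

    void logError(String code, String detail) throws CsfException {
        // A failed CSF interaction must not be swallowed: the exception
        // escapes so the transaction rolls back instead of losing the report.
        if (!csfAvailable) throw new CsfException("CSF unreachable: " + code);
        // a real implementation would publish the ALE event here
    }
}

// Hypothetical JCD-style consumer of the wrapper. Returns true when
// processing committed; an escaping CsfException is interpreted by the
// container as a rollback of the current message.
class ExampleJcd {
    boolean receive(String message, AleLogger logger) throws CsfException {
        if (message == null || message.isEmpty()) {
            logger.logError("EGL-1001", "empty payload");
            return false;
        }
        return true;
    }
}
```

The wrapper keeps ALE object creation and error-code choices out of individual services, which is the centralization goal of step 3.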

The CSFLogger class has been created to simplify the interface to the CSF API. It handles CSF-related object creation and wraps a subset of common ALE methods. Calls to the public constructor and methods may throw a CSFException, which should be thrown up the stack to the core JCAPS classes, allowing the request to be automatically rolled back. As a guideline to our exception handling practices, when an exception bubbles up to the JCD trigger method (e.g., receive()), it is interpreted as an action to roll back the current JMS message. To start using the class, add the jar from the Repository (located at prjEGLCommon\lib\eaiUtil.jar) to the classpath of your JCD. For more detailed information about this class, please follow the hyperlink at the beginning of this paragraph.

4.4 Development Standards

4.4.1 EAI Team Development Process

The EAI development process is outlined in the following Visio document, EAI Development Process. The process outlined in the EAI Developer Responsibility band shows the activities that an EAI developer must execute in order to develop integration components. The development process starts with high-level design using UML and ends with unit testing. Each activity in the Responsibility band also lists the deliverables an EAI developer must produce before exiting an activity. Below the name of the activity is a reference to the deliverable. The deliverables themselves are referenced in the Reference Documentation band. The document icons in the Reference Documentation band have links to the actual deliverables. In most cases a deliverable is a document template a developer must fill in. The other bands show dependencies on activities that should be performed by a different team, best-practices workshops an EAI developer must be familiar with before developing integration components, and prerequisite items an EAI developer must have in order to gain access to the development environment. Below is a list of the additional bands and their purpose.

1. Pre-Requisite

a. Communicates that an individual wishing to use JCAPS for development must have had formal training, be familiar with SOA, be familiar with RQ, have read this document, understand the environments onto which JCAPS is installed, and be familiar with integration patterns and business process modeling.

2. Other team member responsibilities

a. Lists the activities and artifacts that other team members must produce for an EAI developer.

i. An EAI developer can be that other team member, but the activities listed in this band are not the focus of EAI Development.

3. Mentoring

a. Best-practices workshops a team member must have attended, or whose guidelines the team member must be familiar with.

For large enterprise-wide projects, follow the RQ Methodology at one's discretion. Documents of importance to EAI team members are the Component Specification, Composite Application Component Architecture, Composite Application Deployment Architecture, and SOA System Architecture.

4.4.2 eGate Development

Follow the link provided below to familiarize yourself with Sun's best practices for eGate development: BP eGate Development.

4.4.3 eInsight Development

At the time of this writing, eInsight development was in its infancy at EGL; because of this, no development standards are available. Sun has provided best practices for developing eInsight solutions. Please refer to the following documents for additional information: eInsight_UG.pdf and CAPS_Deployment_Guide.pdf.

Note: using correlation in eInsight components removes the ability to scale and provide failover. Careful consideration should be taken before using correlation.

4.5 Coding Conventions

4.5.1 JCAPS

The EAI team has published a naming standard guide based on Sun's best practices. This guide, EGL Docs - Integration with JCAPS - Naming Conventions, is to be followed by developers when adding components to the JCAPS repository.

4.5.2 Java Code

The EAI team will use Sun Microsystems' coding convention guidelines. This guide, Code Conventions for the Java Programming Language, is to be followed by developers when constructing integration components.

4.6 Exception Management Standards

The following sections outline the standards for exception management to be used by JCAPS developers. As a general rule, a developer must become familiar with Sun's best practices on exception management by clicking on the following link, BP Error Handling Guidelines, as that document forms the basis for the exception management standards listed herein. Also refer to Sun's Java standards on throwing and catching exceptions within Java code. Business processes, and business rules in particular, should also follow the standards listed here. This document does not cover compensation activities, since compensation deals with rolling back to a previous state rather than handling and reporting errors within components.

This section provides information on the following topics: exception levels and exception notification channels. Their aim is to provide a comprehensive set of standards for developers to follow when developing JCAPS components and integrating with the CSF.

A few rules before proceeding:

Rule: all exceptions will be reported to the CSF.

Rule: no interface is to be turned off without operations manual involvement. No code shall be developed to shut down a component without requiring user validation first.

Rule: the CSF is an integral part of a component. All interactions with the CSF must complete successfully; if not, the component shall roll back the transaction.

Rule: do not create infinite loops. Always remove messages that do not comply with the message structure, i.e. un-marshal exceptions.

Rule: always catch finer-grained exceptions first and deal with them. Do not have a single catch-all exception clause; having a single catch will lead to termination.

Rule: if a system exception is thrown by the CSF, operations must be alerted and the issue resolved immediately.

Rule: Proactively take measures to prevent system failures. For example, if you expect rows from a query, check that rows exist before executing the next method, or raise a business rule exception.

Rule: Do not handle unchecked exceptions, as they are bugs in the code. Report them to the CSF and re-throw the exception.

4.6.1 Exception Levels

Below are the exception levels to be used by JCAPS developers. These levels are used to categorize the exceptions that can be thrown within JCAPS components. Each exception will have an error code associated with it, and each error code will be assigned an exception level.

For a list of error codes and descriptions, follow this link: #########.

Exception Levels

FATAL

Exceptions with this level represent unrecoverable system failures. The majority of these exceptions will be thrown through system-level mechanisms such as alerts. Types of system failures include out of memory, shutdown requests on components, components that cannot start, messages too large for a queue, etc. This level requires immediate attention and resolution.

CRITICAL

Exceptions with this level represent either program bugs or requests that cannot be processed because of an invalid message structure or invalid types. Service-not-available exceptions also fall into this level, such as when the client receives timeouts, incorrect security credentials or login information, I/O failures, etc. Business rule validations and failed transformations do not fall into this level.

This level requires immediate attention and resolution. The unit of work that the component was working on at the time the exception occurred must be stopped, removed, and made available for re-processing. A new unit of work must commence.

ERROR

Exceptions with this level represent business rule violations and transformation failures. The interface design will determine whether the unit of work should terminate entirely or simply report the error and continue processing the remaining lines in the unit. If multiple business rule violations are encountered within a unit of work, report all violations as a single transaction. Messages can be made available for re-processing, but typically they are not.

WARNING

Exceptions with this level are used to warn that an exceptional condition could occur because system resources are starting to deteriorate or a component has identified that a business policy could be violated. Examples include queues reaching their thresholds, memory reaching a predetermined limit, requests for services not occurring in a timely manner, etc.

4.6.1.1.1 Error Code Groups

Error codes will be classified into groups to provide better error reporting and recognition.

Database
Programming Language
Software System
Hardware System
Business Rule Violation
Message Format
Request Format
Unknown - new Java programming errors that were not accounted for
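The association of error codes with exception levels and error code groups described in this section can be sketched as below. The code values and the mapping are illustrative only, not the official EGL error code list referenced above.

```java
// Illustrative sketch: each error code belongs to a group and is
// assigned one of the four exception levels defined in this section.
public class ErrorCodeDemo {

    public enum Level { FATAL, CRITICAL, ERROR, WARNING }

    public enum Group {
        DATABASE, PROGRAMMING_LANGUAGE, SOFTWARE_SYSTEM, HARDWARE_SYSTEM,
        BUSINESS_RULE_VIOLATION, MESSAGE_FORMAT, REQUEST_FORMAT, UNKNOWN
    }

    // Hypothetical level assignment; a real implementation would load the
    // configured code/level table referenced in this section.
    public static Level levelOf(String errorCode) {
        switch (errorCode) {
            case "SYS-OOM-001":    return Level.FATAL;    // out of memory
            case "MSG-FORMAT-001": return Level.CRITICAL; // invalid message structure
            case "BUS-RULE-001":   return Level.ERROR;    // business rule violation
            default:               return Level.WARNING;
        }
    }

    // Hypothetical group classification by error code prefix.
    public static Group groupOf(String errorCode) {
        if (errorCode.startsWith("BUS-")) return Group.BUSINESS_RULE_VIOLATION;
        if (errorCode.startsWith("MSG-")) return Group.MESSAGE_FORMAT;
        if (errorCode.startsWith("SYS-")) return Group.SOFTWARE_SYSTEM;
        return Group.UNKNOWN;
    }
}
```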

Unknown errors should be researched and classified in a timely manner so that the system can be configured to properly report the error.

4.6.2 Exception Notification Channels

This section outlines the means that will be used to communicate exception information, along with the information required for exception reporting.

As stated earlier in this document, the Common Services Framework (CSF) is to be employed for alerting, logging, and error reporting (ALE) functions. This is where all exceptions will be recorded, providing a centralized area for error mining, escalation procedures, and resolution details. Along with the CSF, JCAPS developers will employ the alert agent to capture and communicate severe system conditions to the CSF and system monitoring tools via SNMP.

4.6.2.1 Alert Agent

The alert agent will be enabled. All JCAPS projects will be monitored via the alert agent for system-level notifications. Notifications that require resolution or indicate a severe exceptional condition will be made available to the CSF and system monitoring tools.

Alerts will be sent to the CSF via JMS and to system monitoring tools via SNMP.

Predefined alert levels to be monitored and reported are listed below. For specific alerts and the actions to be taken, see the system administrator.

Fatal

All alerts with a level of FATAL will be transmitted to the CSF and system monitoring tools.

Critical

All alerts with a level of CRITICAL will be transmitted to the CSF and system monitoring tools.

Major

Minor

Warning

4.6.2.2 CSF

This section outlines the information that is to be captured for CSF reporting.

Error Codes (Configured)

All exceptions that are thrown will have an associated custom error code. All alerts that are raised will also have a custom error code.

The Message Identification should be populated with the transaction identifier.

Application information should be populated from the application area of a transaction.

The payload of a transaction needs to be captured.

4.6.3 Logging Guidelines

All consumers will log to the CSF the time at which they receive a transaction. This logging is independent of the business logic; i.e., it forms its own unit of work.

4.7 Code Management Standards

4.7.1 Version Control

The Sun Java Composite Applications Platform Suite provides developers with tools to track different versions of their application's components. A release engineer may then take a snapshot of the workspace that includes a version of each component, and can then use that snapshot to create a release for deployment.

The JCAPS versioning system is based on CVS, and as such offers similar functionality. Components within a project or an environment may be checked out, ensuring that only the developer who checked the component out can make changes. These changes may then be committed by checking the component in, or discarded by undoing the checkout. When components are checked in, the developer may provide a description of the change. A history of these change descriptions can be retrieved for any versioned component. A much more detailed discussion of the versioning features can be found in the eGate Integrator User's Guide.

The versioning system supports tagging of components. This is simply a way of identifying a version of each component with a common name, which is useful when identifying which versions to include in a release. In typical environments, releases will be tagged using the following format:

M-m-p-stage

Where:

M - The major version number. This should be incremented when a release contains significant new features or functionality.

m - The minor version number. This should be incremented when a release contains bug fixes and less significant new functionality.

p - The patch number. The patch number is incremented when changes are made to a release that has been deployed to the production environment.

stage - The release stage. This is one of alpha, beta, or fcs.
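The M-m-p-stage convention above can be expressed as a small helper. This is an illustrative sketch, not part of the JCAPS tooling; the stage names follow the alpha/beta/fcs stages used in this section.

```java
import java.util.regex.Pattern;

// Composes and validates release tags in the M-m-p-stage format,
// e.g. "1-0-0-alpha" for the first alpha build of release 1.0.
public class ReleaseTagDemo {

    private static final Pattern TAG_FORMAT =
            Pattern.compile("\\d+-\\d+-\\d+-(alpha|beta|fcs)");

    // Builds a tag such as "1-0-0-alpha" from its parts.
    public static String tag(int major, int minor, int patch, String stage) {
        return major + "-" + minor + "-" + patch + "-" + stage;
    }

    // Checks that a tag conforms to the M-m-p-stage convention.
    public static boolean isValidTag(String tag) {
        return TAG_FORMAT.matcher(tag).matches();
    }
}
```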

All development is done on the trunk, which is labeled HEAD in the JCAPS repository. Developers make changes to the project components until the application is ready for deployment to the alpha environment. At this time, all components are checked in and the release engineer tags the current versions of all components with the M-m-0-alpha tag, where M and m correspond to the major and minor release numbers of the release being built. The release engineer then builds the application and deploys it to the alpha environment for testing. When all changes needed to fix bugs found in alpha are checked in, the release engineer tags the current version of each component with the M-m-0-beta tag. Again, the application is built and deployed into the beta environment. As with alpha, changes are made to fix any bugs found during beta testing and are checked in. The release engineer then tags the components with the M-m-0-fcs tag, builds the application, and deploys it into production.

Once an application has been deployed into production, development begins on the next release. If a change is needed for a release that is already in production, a branch must be created. The branch is named after the release it corresponds to (e.g., 1-0). Changes for the patch are checked into the branch and are tagged with alpha and beta tags as is done for a regular release. If developers would like to make changes for a future release, a prototyping branch will be created by the release engineer. Changes for the future release are checked into the prototype branch until that release becomes the current release. At that point, the versions on the prototype branch will be merged into the main trunk and regular release versioning will be used.

NOTE: An exported project does not retain its versioning information, so do not export a project if you require versioning information in the repository to which it is restored.

4.7.2 Migration Process

Build Procedures

During development, developers will build the project as needed with the Enterprise Designer tool.

For Alpha, Beta and Production releases, a release engineer will build the project using the versions of the components labeled with the appropriate release tag. The release engineer may build the project via the CommandLineCodegen tool or the Enterprise Designer build functionality.

A package is then created for the application that includes the ear file, an export of the project, and enough information about the development environment that it could be recreated, if necessary.

Export and Import

Projects may be migrated from one repository to another by exporting from one and importing into the other. Since exported projects do not retain their versioning information, however, it is recommended that projects only be migrated when past versioning information is not needed. For example, migrating a project from the Sandbox [used for experimentation] to the Development repository in order to build upon a prototype would be a suitable migration path. Migrating a CSF project from the CSF repository to the Development repository would also be suitable if the CSF project will only be used by a development project, and not further developed.

4.7.3 Backup Process

Repository Back-up and Restoration

The Development environment's repository may be backed up daily via a cron-scheduled backup script executed between and . A backup of the repository may be restored manually by executing the restore script bundled with the JCAPS repository.

Project Back-up

Individual projects in the Development environment's repository may be backed up via a cron-scheduled backup script executed between and . A configuration file will be maintained that lists the projects that are to be exported from the repository for backup.

A backup of a project may be manually restored by executing the project import script that is bundled with the JCAPS repository.

Provided by HCL.

4.8 Code Commenting Standards

Software documentation exists in two forms, external and internal. External documentation is maintained outside of the source code, such as specifications, help files and design documents. Internal documentation is composed of comments that developers write within the source code at development time.

Following are the recommended commenting techniques:

When modifying code, always keep the commenting up to date.

At the beginning of every routine, it is helpful to provide standard, boilerplate comments indicating the routine's purpose, assumptions, and limitations. A boilerplate comment should be a brief introduction to why the routine exists and what it can do.

Avoid adding comments at the end of a line of code; end-line comments make code more difficult to read. However, end-line comments are appropriate when annotating variable declarations. In this case, align all end-line comments at a common tab stop.

Avoid clutter comments, such as an entire line of asterisks. Instead, use white space to separate comments from code.

Avoid surrounding a block comment with a typographical frame. It may look attractive, but it is difficult to maintain.

Prior to deployment, remove all temporary or extraneous comments to avoid confusion during future maintenance work.

If you need comments to explain a complex section of code, examine the code to determine if you should rewrite it.

Use complete sentences when writing comments.

Comment as you code, because most likely there won't be time to do it later.

Avoid the use of superfluous or inappropriate comments, such as humorous sidebar remarks.

Use comments to explain the intent of the code.

To prevent recurring problems, always use comments on bug fixes and work-around code, especially in a team environment.

Use comments on code that consists of loop and logic branches.

Throughout the application, construct comments using a uniform style, with consistent punctuation and structure.
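The boilerplate and end-line comment techniques above can be illustrated with a short example. The routine itself is hypothetical; only the commenting style is the point.

```java
// Illustrates a boilerplate routine comment (purpose, assumptions,
// limitations) and end-line comments aligned at a common tab stop
// on variable declarations.
public class CommentStyleDemo {

    /*
     * Purpose:     Computes the total price for an order line.
     * Assumptions: Quantity is non-negative; prices are in cents.
     * Limitations: No currency conversion or discounting is performed.
     */
    public static long lineTotalCents(int quantity, long unitPriceCents) {
        int qty = quantity;              // units ordered
        long price = unitPriceCents;     // unit price in cents
        return qty * price;
    }
}
```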

Provided by HCL.

4.9 Code Review Standards

Please refer to the Code Review template, PSPF018-PR Code Review template.

Provided by HCL.

4.10 Unit Testing Standards

The test plan covers unit testing, integration testing, load testing and production testing. It is the developer's job to fill out the unit testing section; the remaining sections should be filled out by a test engineer. The test plan should be constantly updated as a project moves into the validation phase.

Besides test cases, a test plan should also document the entire procedure for setting up the test environment and conducting the test. Each test case should include the following information: a reference number, a use case reference number, input, and output.

Tip: Test data should cover all possible scenarios, not just the easy path.

Automated test cases are developed during the implementation phase of the project. They include unit tests, end-to-end integration tests, and load/performance tests. Furthermore, for each bug that is fixed in the system, a test case must be created to expose the bug and then test for the fix.

As the name implies, all tests must be automated, and test data retrieval should be automated as well. A person should be able to execute all test cases with a single mouse click.

For each SeeBeyond business process, create a corresponding test business process. For each SeeBeyond deployment, there must be a corresponding test deployment. Use a JMS receiver at the top level to trigger all tests, and keep the test cases updated. It may seem time consuming to use automated test cases, but they are extremely useful: once the testing infrastructure is in place, it is easy to add more test cases and execute them. We use automated test cases to validate and certify a system.
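The single-trigger idea above can be sketched as a tiny harness: one entry point runs every registered test case and collects pass/fail results. In JCAPS the trigger would be the top-level JMS receiver; here a plain method stands in for it so the sketch stays self-contained, and the TestCase interface is a hypothetical convenience.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal one-click test harness sketch: register cases, run them all,
// and collect results. A thrown exception counts as a failure.
public class TestHarnessDemo {

    // A test case simply reports whether it passed.
    public interface TestCase {
        boolean run() throws Exception;
    }

    // Runs all registered cases in order and returns name -> passed.
    public static Map<String, Boolean> runAll(Map<String, TestCase> cases) {
        Map<String, Boolean> results = new LinkedHashMap<>();
        for (Map.Entry<String, TestCase> entry : cases.entrySet()) {
            boolean passed;
            try {
                passed = entry.getValue().run();
            } catch (Exception e) {
                passed = false; // a failing case must not stop the run
            }
            results.put(entry.getKey(), passed);
        }
        return results;
    }
}
```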

Please refer to the template ENGT559-Unit_Test_Case_Identification.

Provided by HCL.

4.11 System Integration Testing Standards

Please refer to the SIT plan template ENGT560-System_Integration_Test_Plan.

Provided by HCL.

4.12 XML Standards

Please refer to the XML standards and guidelines, ENGG010-XMLGuidelines.

Provided by HCL.

4.13 JMS Settings

This section outlines the relevant configuration settings available to JMS message servers and JMS clients, and the recommended standard settings. The standard settings provided herein are based on Sun recommendations. JCAPS architects and developers must size the components' capacity requirements and adjust the settings to meet the processing needs of the solutions they will be providing. Most of the settings are design-time decisions; some are runtime decisions.

For additional information on how to best configure a JMS provider, see the following links: Java CAPS Best Practices Workshop_Performance.ppt and Java CAPS Best Practices Workshop_Scalability and HA.ppt.

As a standard, the redirect and redelivery options must always be employed. The recommended values are listed below in the Client Properties section.

JMS clients provide the local messaging protocols, such as message persistence and delivery semantics, for messages being propagated between Project components. Together with the JMS message server, they constitute a JMS provider.

JMS clients are of two basic types: producers and consumers (or a combination of both). If associated with a queue, these become queue senders and receivers, respectively. If associated with a topic, they become topic publishers and subscribers, respectively.

4.13.1 Message Properties

The default message properties should be used, except when using the request/reply features of JMS. Settings for request/reply are design-time decisions. At this time it is not recommended to use the request/reply model for JMS; if synchronous behavior is the desired approach, a web service call should be employed. All properties at the message level are design-time decisions.

Category | Property | Setting | Annotations

Header | CorrelationID
Header | CorrelationIDAsBytes
Header | DeliveryMode
Header | Destination
Header | Expiration
Header | MessageID
Header | Priority
Header | Redelivered
Header | ReplyTo
Header | Timestamp
Header | Type

Additional Outbound | deliveryMode
Additional Outbound | priority
Additional Outbound | timeToLive
Additional Outbound | destination
Additional Outbound | MessageServerURL

4.13.2 Client Properties

Client properties are applicable either to a consumer or to a producer, but not both. With the exception of the redirect and redelivery properties, all other client properties are design-time decisions.

Category | Property | Setting | Annotations

Root | Durable Subscriber Name | Naming convention | Applies only to Topic subscribers

Basic | Concurrency | Service Oriented Integration: Connection Consumer (asynchronous processing only); Information Oriented Integration: design-time decision; Business Process Oriented Integration: design-time decision | The Connection Consumer setting is a recommendation. For queues, it is also possible to use a connection consumer for concurrent processing on multiple CPUs (and application servers) on a system. Used with the IQ Manager's delivery mode settings. The eGate Integrator JMS implementation enables you to configure topic subscribers as connection consumers to improve message throughput through concurrent processing.
Basic | Delivery mode
Basic | Idle timeout
Basic | Maximum pool size
Basic | Maximum wait time
Basic | Message selector
Basic | Priority
Basic | Steady pool size
Basic | Transaction mode

Redelivery | Delay | 2:1000 ; 3:3000 ; 5:move($_DeadLetter) | The progressive delay setting should be set at the JMS manager level. You can specify the actions to be taken after a message has been rolled back by appending a redelivery-handling string to the message server URL when you configure the JMS IQ Manager. These actions then override the default actions for all JMS clients interacting with the JMS IQ Manager.
Redelivery | Move/Delete After N Times | 5
Redelivery | Action | move
Redelivery | Move to Queue/Topic | Auto
Redelivery | Move to Destination Name | $_DeadLetter

Advanced | Durability | | Applies only to Topic subscribers
Advanced | Server session batch size | 1
Advanced | Server session pool size

4.13.3 IQ Manager Runtime Configurations

The IQ Manager properties listed below are runtime configurations for the JMS manager residing within an integration server. Properties include message delivery order, tuning configurations, journaling options, and diagnostic options. Some properties work in conjunction with client properties and affect the behavior of consumers and producers. For detailed information about the properties listed below, see the eGate JMS Reference Guide.

Please note that modifying some of the properties will require that the integration server be stopped and re-started in order for the new values to take effect. Some changes may require that dependent objects or underlying objects be deleted and recreated.

4.13.3.1 Stable Storage

The stable storage configurations are used to specify the JMS database location, database size and record size.

It is preferable that the database be located on a high-performance storage device, separate from the software components.

Category | Property | Setting | Annotations

Stable Storage | Data Directory | For now, use the default
Stable Storage | Block Size
Stable Storage | Segment Size | Design-time decision | Be aware that if a message is larger than the segment size, the message will be discarded, and no error will be returned to the JCD. The Alert option must be enabled so that any alert message reporting this issue is trapped and a system administrator notified immediately.
Stable Storage | Minimum Number of Segments | Design-time decision
Stable Storage | Maximum Number of Segments | Resource capacity
Stable Storage | Sync To Disk

4.13.3.2 Journaling and Expiration

The journaling and expiration configurations are used to specify message expiration and journaling options.

Category | Property | Setting | Annotations

Journaling & Expiration | Enable Message Expiration | Enabled
Journaling & Expiration | Maximum Lifetime | | The default is 30 days. The hardware resources must be capable of achieving this.
Journaling & Expiration | Enable Journal
Journaling & Expiration | Journaling Maximum Lifetime
Journaling & Expiration | Journal Directory

4.13.3.3 Throttling Properties

These properties are used to manage the memory and disk resources needed by the IQ Manager. They should be set based on system capacity and readjusted if system capacity increases or decreases.

Category | Property | Setting | Annotations

Throttling | Per-Destination Throttling Threshold | 100,000
Throttling | Server Throttling Threshold | 100,000 * number of producers | Ensure that disk space will be sufficient to support this requirement.
Throttling | Throttling Lag | 25,000

4.13.3.4 Special FIFO Mode Properties

The default values for these properties should be used. They should only be modified if message sequence is a key requirement.

Category | Property | Setting | Annotations

FIFO | Fully Serialized Queues
FIFO | Fully Concurrent Queues
FIFO | FIFO Expiration Time

4.13.3.5 Time Dependencies

These properties are used to group queues and topics together.

Category | Property | Setting | Annotations

Time Dependencies | Time Dependency Topics
Time Dependencies | Time Dependency Queues

4.13.3.6 Security

Security must be enabled at all times.

Category | Property | Setting | Annotations

Security | Require Authentication | Enabled
Security | Default Realm
Security | Enable File Realm
Security | Enable Sun Java System Directory Server
Security | Enable Microsoft System Directory Server | Enabled
Security | Enable Generic LDAP Server

4.13.3.7 Diagnostics Page

Only log errors and above.

Category | Property | Setting | Annotations

Diagnostics | Logging Level | ERROR
Diagnostics | Logging Level of Journaler
Diagnostics | Maximum Log File Size
Diagnostics | Number of Backup Log Files

4.13.3.8 Miscellaneous Page

Alerting is always enabled.

Category | Property | Setting | Annotations

Miscellaneous | Enable Alert | Enabled

4.13.3.9 Backing Up Topics and Queues

Backups must be performed daily on JMS Managers.




ENGT559_20

EAI - Unit Test Case Identification

EAI - Unit Test Case Identification

Version No.:

Date:

Project Name:

Project Code:

Copyright Notice

This document contains proprietary information of HCL Technologies Ltd. No part of this document may be reproduced, stored, copied, or transmitted in any form or by means of electronic, mechanical, photocopying or otherwise, without the express consent of HCL Technologies. This document is intended for internal circulation only and not meant for external distribution.

Revision History

Version No

Date

Prepared by / Modified by

Significant Changes

Glossary

Abbreviation

Description

Table Of Contents

1 Project Information ..... 4

1 Project Information

Client:

Project:

Version:

Project Code:

Test Scenario Name:

Test Scenario ID:

Unit / Integration Test Plan ID:

Assumptions / Dependencies: Existing Customer Details

Interface Name

Connection Model Name

Testing Date

Tested by

Type of Test

F - First Time, R - Retest

**Flow/Activity Name

Input Event

Output

Event

Result

Reason If failed

Resolution

Prepared By

Date

Reviewed By

Date

Approved By

Date

** Flow/Activity applies not only to each Flow and Activity of a connection model, but also to a uni-directional connection model / adapter as a whole.

HCLT Confidential

Page 3 of 5

Review summary

Project Code | Name | Effort | Work Product | Revn. No. | Work Product Type | Size | Size Unit | Work product type | Unit

Project Name | Document | Pages

Project Type | Rose model - Class Diagram | Class diagrams

PM Name | Rose model - Seq. Diagram | Seq. diagrams

Phase of Review | Rose model - Act. diagrams | Activity diagrams

Review No. | Code | Non commented SLOC

Author | Code - Comments | Lines of comments

Checklist | Test plan | Test cases

Review Date

Objective | Completeness

Correctness

Compliance to standards | Plan Review Effort

(Other) | Plan Rework Effort

Issue Severity | Sub Total | Open | Closed | Deferred

Major | 0 | 0 | 0 | 0

Minor | 0 | 0 | 0 | 0

Trivial | 0 | 0 | 0 | 0

Total | 0 | 0 | 0 | 0

Major | 0 | 0 | 0 | 0

Minor | 0 | 0 | 0 | 0

Trivial | 0 | 0 | 0 | 0

Review Effort | 0.00 PH | Resolution

Defect Closure Effort | Reference document | Revn. No.

Verification Effort

Total Effort | 0.00 PH

Fix verified

Verified By

Verification Date

Sign-off By

Sign-off Date


Defect log

Status | Phase introduced | Cause of defect | Validity | Severity | Type | Class

Open | Planning | Communication | Valid | Major | Coding | Defect

Closed | Requirements | Environment setup | Invalid | Minor | Data | Clarify

Deferred | High level design | Inadequate standards | Trivial | Documentation | Investigate

Low level design | Inadequate training / education | Environmental | Suggestion

Code | Process | Functional

Test | Requirements changed / not clear | Interface

Performance

Standards

S. No. | File / Work product | Issue Location | Defect Description | Validity | Severity | Phase Introduced | Type of issue | Class | Cause of Defect | Status | Comments



XML Guidelines

Version No. : 2.0

Date: 30-Dec-2005

Corporate Quality Team HCL Technologies Ltd

PM Towers 37, Greams Road

Chennai 600 006.

Copyright Notice

This document contains proprietary information of HCL Technologies Ltd. No part of this document may be reproduced, stored, copied, or transmitted in any form or by means of electronic, mechanical, photocopying or otherwise, without the express consent of HCL Technologies. This document is intended for internal circulation only and not meant for external distribution.

ENGG010_20 XML Guidelines

HCLT Confidential Page 2 of 40

Revision History

Version No. Date Prepared by / Modified by Significant Changes

2.0 30-Dec-2005 CTWG HCL Logo modified.

1.1 07-Jul-2003 Corporate Quality Modified the address of Corporate Quality Team in the first page.

1.0 15-May-2003 Engg. TWG OMS Phase 2 release.

Glossary

Abbreviation Description

HCLT HCL Technologies Ltd.

VER Version

DTD Document Type Definition

XML Extensible Markup Language

XSL Extensible Style Sheet Language

XSLT Extensible Style Sheet Language Template

SAX Simple API for XML

DOM Document Object Model


Table Of Contents

1 Introduction.......................................................................................................................................5

1.1 Purpose ....................................................................................................................................5

1.2 Scope .......................................................................................................................................5

1.3 Intended Audience ...................................................................................................................5

2 Data Description Technology..........................................................................................................5

2.1 XML DTD..................................................................................................................................5

2.1.1 Attributes vs Elements................................................................................................5

2.1.2 Miscellaneous.............................................................................................................6

2.2 XML Schema............................................................................................................................9

2.2.1 Naming Convention....................................................................................................9

2.2.2 Declaration .................................................................................................................9

2.2.3 Namespaces ............................................................................................................10

2.2.4 Versioning Schemas ................................................................................................10

2.2.5 Miscellaneous...........................................................................................................10

3 Processor Technologies - Parser Types ...................................................11

3.1 DOM .......................................................................................................................................11

3.1.1 Guidelines.................................................................................................................12

3.2 SAX ........................................................................................................................................12

3.2.1 Guidelines.................................................................................................................12

4 Processor Technology - Framework ............................................................................................13

4.1 .NET .......................................................................................................................................13

4.1.1 GuideLines ...............................................................................................................14

4.2 Java........................................................................................................................................18

4.2.1 Parsers .....................................................................................................................18

4.2.2 API............................................................................................................................18


4.2.3 Serialization..............................................................................................................20

4.2.4 Access Methods .......................................................................................................20

5 Transformation Technologies .......................................................................................................21

5.1 GuideLines .............................................................................................................................21

5.2 XPath......................................................................................................................................30

5.2.1 Guide Lines ..............................................................................................................30

6 Linking technologies......................................................................................................................33

6.1 XPointer..................................................................................................................................33

6.1.1 Guidelines.................................................................................................................33

7 XML Database .................................................................................................................................35

7.1 Data versus Documents.........................................................................................................35

7.1.1 GuideLines ...............................................................................................................35

7.1.2 Example....................................................................................................................35

7.2 Mapping Document Schemas to Database Schemas ...........................................................37

7.2.1 Guidelines ...............................................................................................................37

7.3 Query Languages...................................................................................................................37

7.3.1 Guidelines ...............................................................................................................37

7.4 Native XML Database ............................................................................................................37

7.4.1 Guidelines ...............................................................................................................37

7.5 Storing data from XML documents in traditional databases ..................................................38

7.5.1 Guidelines ...............................................................................................................38


1 Introduction

1.1 Purpose

The purpose of programming standards is to support the development of applications that are consistent and well written. Standards and guidelines help developers create a code base with a uniform presentation, which leads to code that is easy to understand, easy for other developers to use, and easy to maintain. They also help developers avoid the common pitfalls of XML, which leads to code that is robust, consistent, reliable, and portable.

1.2 Scope

Implementation standards for XML are essential and should be adopted to achieve the following goals:

Facilitate joint development

Avoid common pitfalls

Maintainability

Reliability

Understandability

Seamless coexistence of modules

1.3 Intended Audience

Developers and reviewers working with XML.

2 Data Description Technology

2.1 XML DTD

2.1.1 Attributes vs Elements

Guidelines

Put metadata into attributes; put content into elements. One way to distinguish between metadata and content is to ask the question: "If I removed this data, would my understanding of the content change?" If the answer is "no", the data is metadata (descriptive information) and belongs in an attribute. For example, in <price currency="USD">42.50</price>, the amount is content while the currency code is metadata.

Attributes are more suitable for enumerated data. Attributes can be used for computer-manipulated values.

Entities (nodes) are expressed as elements

Properties (edges) and relations are expressed as attributes

Attributes are atomic characteristics of an element/object that have no identity of their own; their meaning may depend on the element they describe

2.1.2 Miscellaneous

Guidelines

TagNames

Use whole English words for tag names rather than cryptic abbreviations. This keeps documents self-describing and allows for automated translation (XML is Unicode compliant).

Using Container Elements

If two or more different types of elements can appear at the same level in a tree, create 'container' elements.

Without this rule, it is common to see documents where, for example, repeated <PHONE> and <ADDRESS> elements are interleaved directly under a <PERSON> element (element names here are illustrative).

By following the rule, the document instead groups them under <PHONES> and <ADDRESSES> container elements, so each level of the tree holds only one kind of child.


Avoid 'mixed' content.

If you want mixed content (elements and text), the DTD must be written as a repeating choice group with #PCDATA listed first, for example (illustrative element names):

<!ELEMENT P (#PCDATA | B)*>

This DTD is ambiguous. It says that <P> can contain any number of <B> elements, and also that <P> can have multiple blocks of text content interleaved with them:

<P>hi <B>bye</B> goodnight</P>

'Mixed' content is useful for marking up content and works well with free-form text. However, unless this is exactly what you want, you should not define your document so liberally. Instead, create a new element that is defined as #PCDATA:

<!ELEMENT P (TEXT | B)*>
<!ELEMENT TEXT (#PCDATA)>

Plan for DTD maintenance


DTDs can be changed once they have been published, as long as certain guidelines are followed:

1. Elements cannot be removed

2. Attributes cannot be removed

3. Attributes cannot be changed from "implied" to "required"

4. Default values should not be modified (generally)

5. A "value" cannot be removed from an attribute "value list"

6. The required structure of a document cannot be changed. For example, ? cannot become +, and a new element cannot be required to appear inside an existing element. Only ? and * can be used when changing the document structure.

7. #PCDATA can't be removed from an element

If these guidelines can't be followed, a new type of document must be created.

Another way to manage change is to plan for it. For example, a top-level element could have a "version" attribute. A document that conforms to the first version of the DTD would have version="1", and a document that conforms to the second version would have version="2". The version number would only change if the DTD were modified in such a way that it violated the above guidelines. However, without diligent coding, this method will fail: any code that "forgets" to check the version will break when a new version is introduced to the system.
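The version-attribute convention can be sketched in Java. This is a hedged illustration, not part of the original guideline: the element name, the supported version value, and the use of the standard JAXP DOM parser are all assumptions.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class VersionCheck {
    // Hypothetical: the single DTD version this application understands.
    static final String SUPPORTED_VERSION = "1";

    // Parse the document and return the root element's version attribute
    // ("" if the attribute is absent, per the DOM getAttribute contract).
    public static String documentVersion(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getAttribute("version");
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ORDER version=\"1\"><ITEM>widget</ITEM></ORDER>";
        String v = documentVersion(xml);
        // Diligent coding: reject documents whose version we do not know.
        if (!SUPPORTED_VERSION.equals(v)) {
            throw new IllegalStateException("Unsupported document version: " + v);
        }
        System.out.println("version=" + v);
    }
}
```

The check is deliberately strict: a missing attribute yields "" and is rejected, which is exactly the failure mode the paragraph above warns about.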

Use entities to encapsulate repetition

As an example, a traveler uses a vehicle to get to his destination. The repeated list of vehicle elements can be captured once in a parameter entity; an illustrative DTD fragment:

<!ENTITY % VEHICLE "CAR | TRUCK | PLANE">
<!ELEMENT TRAVELER ((%VEHICLE;), DESTINATION)>

The VEHICLE entity is handy because it can be reused. It also makes the DTD easier to maintain. For example, if you add another type of vehicle, only the entity needs to be changed; the rest of the DTD is unaffected.

By the way, XML Schemas address this need via two mechanisms: substitution groups and inheritance.


2.2 XML Schema

2.2.1 Naming Convention

Guidelines

Use upper camel case (UpperCamelCase), with no spaces or hyphens between words, for all XML element names.

Use lower camel case (lowerCamelCase), with no spaces or hyphens between words, for all XML attribute names.

Enumeration values should use names only (not numbers), and the names used for enumeration values must conform to the guidelines for element or attribute names.

Example

Examples of upper camel case (UCC) names: PublisherName, TransactionSequenceNumber.

Examples of lower camel case (LCC) names: attributeTypeId, processId.

2.2.2 Declaration

Guidelines

Type Definition

If a type definition is likely to be reused, either a simpleType or a complexType should be defined globally in the namespace instead of defining the type anonymously in the element declaration.

Global And Local Element Declarations

Global element declarations should be used for elements that will be reused from the target schema as well as from other schema documents. Local elements are to be favored when element declarations only make sense in the context of the declaring type and are unlikely to be reused.

Global And Local Attribute Declarations

Global attribute declarations should be used for types that will be reused from the target schema as well as from other schema documents. Local attributes should be used when attribute declarations only make sense in the context of the declaring type and are unlikely to be reused. Since attributes are usually tightly coupled to their parent elements, local attribute declarations are typically favored by schema authors.

Nested Elements


Schemas should use nested elements that use the type attribute or an inline type definition (simpleType or complexType), instead of the ref attribute that references a global element.

2.2.3 Namespaces

Guidelines

Default Namespace - targetNamespace or XMLSchema

Make the targetNamespace the default namespace, and explicitly qualify all components from the XMLSchema namespace.

Namespace Design

When managing multiple schemas, three design patterns apply:

Heterogeneous Namespace Design: Give each schema a different targetNamespace

Homogeneous Namespace Design: Give all schemas the same targetNamespace

Chameleon Namespace Design: Give the main schema a targetNamespace and give no targetNamespace to the supporting schemas (the no-namespace supporting schemas will take-on the targetNamespace of the main schema, just like a Chameleon)

When reusing schemas that someone else created, you should import (xs:import) those schemas, i.e., use the Heterogeneous Namespace Design.

When including (xs:include) schemas that contain components that have semantics only in the context of the including schema, use the Chameleon Namespace Design. As a rule of thumb, if your schema contains only type definitions (no element declarations), then it is probably a good candidate for being a Chameleon schema. In general, though, avoid Chameleon schemas.

When all of your schemas are conceptually related, use the Homogeneous Namespace Design.

Use URLs in preference to URNs as namespace names.

2.2.4 Versioning Schemas

Guidelines

Change the (internal) version attribute on the xs:schema element, or

Create a schema version attribute on the root element.

2.2.5 Miscellaneous

Guidelines


All schemas must have a consistent value for elementFormDefault when including or importing multiple schemas.

Use elementFormDefault="qualified" to expose namespaces in instance documents. When there are multiple elements with the same name but different semantics, you may want to namespace-qualify them so that they can be differentiated.

Use elementFormDefault="unqualified" to hide namespaces from instance documents when simplicity, readability, and understandability of instance documents are of utmost importance.

Note that when elementFormDefault="qualified", every element in an XPath expression over the instance must also be qualified.
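The qualified-XPath point can be sketched in Java with the standard javax.xml.xpath API. The namespace URI, prefix, and element names below are hypothetical; the essential part is that, once instance elements are namespace-qualified, every step of the expression must carry a prefix bound through a NamespaceContext.

```java
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class QualifiedXPath {
    public static String findTitle(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);  // required for namespace-qualified documents
        Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        XPath xp = XPathFactory.newInstance().newXPath();
        // Bind the prefix "b" to the (hypothetical) target namespace; with
        // elementFormDefault="qualified", every step below must use it.
        xp.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "b".equals(prefix) ? "http://example.com/books"
                                          : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) { return null; }
        });
        return xp.evaluate("/b:Catalog/b:Book/b:Title", doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Catalog xmlns=\"http://example.com/books\">"
                   + "<Book><Title>XML Guide</Title></Book></Catalog>";
        System.out.println(findTitle(xml));
    }
}
```

Writing the expression as /Catalog/Book/Title (unprefixed) would select nothing here, because unprefixed XPath steps match only elements in no namespace.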

Use model groups when the requirement is just to reuse a named group of elements. Use complex types when attributes need to be included in the content model or when type derivation is important.

Avoid using substitution groups. Use a choice group (xs:choice) instead.

Avoid using Notation Declarations. Notations in W3C XML Schema are not compatible with notations in DTDs, because a Schema notation is a QName.

Use extension of Complex Type only if type-aware query languages like XQuery or XSLT 2.0 are able to process the elements and attributes polymorphically.

Avoid using restriction of Complex types.

Use abstract types with care. Ensure that a concrete derived type (by extension or restriction) has been provided for each abstract type.

Avoid using type redefinition. A major problem with type redefinition is that, unlike type derivation, it cannot be prevented with the block or final attributes. Thus any schema can have its types redefined in a pervasive manner, altering their semantics completely.
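To make the schema guidelines concrete, here is a minimal Java sketch that validates an instance document against a W3C XML Schema using the standard javax.xml.validation API. The element name and the inline schema are invented for illustration; they follow the naming conventions above (UpperCamelCase element names).

```java
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import java.io.StringReader;

public class SchemaCheck {
    // Hypothetical one-element schema, declared inline for a self-contained example.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"PublisherName\" type=\"xs:string\"/>"
      + "</xs:schema>";

    public static boolean isValid(String xml) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Validator v = sf.newSchema(new StreamSource(new StringReader(XSD))).newValidator();
        try {
            v.validate(new StreamSource(new StringReader(xml)));
            return true;                       // document conforms to the schema
        } catch (org.xml.sax.SAXException e) {
            return false;                      // validation error reported by the validator
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<PublisherName>ACME</PublisherName>"));
        System.out.println(isValid("<WrongElement/>"));
    }
}
```

In practice the schema would live in its own .xsd file; the inline string is only to keep the sketch runnable on its own.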

3 Processor Technologies - Parser Types

3.1 DOM

The Document Object Model (DOM) is a platform- and language-neutral interface that permits script to access and update the content, structure, and style of a document. The DOM includes a model for how a standard set of objects representing HTML and Extensible Markup Language (XML) documents are combined, and an interface for accessing and manipulating them. The key advantages of the DOM are the abilities to access everything in the document, to make numerous content updates, and to work with content in separate document fragments. Working together with the Dynamic HTML (DHTML) Object Model available as of Internet Explorer 4.0, the DOM enhances a Web author's ability to build and manage complex documents and data. Many tasks, such as moving an object from one part of the document to another, are highly efficient and easy to perform using DOM members.


3.1.1 Guidelines

Usage of DOM for parsing XML documents is recommended in the following situations:

Performing XSLT transformations

The DOM works better for XSL Transformations (XSLT), where the source XML document is transformed based on the XSLT template applied. To create multiple views of the same data, one transforms it using different style sheets. For this transformation to take place, two instances of the DOM need to be created: one stores the XML source; the other stores the transformed content.

Complex XPath filtering is required

DOM should be used if one must perform complex XML Path Language (XPath) filtering and retain complex data structures that hold context information. The tree structure of the DOM retains context information automatically. With SAX, one must specifically take care of retaining context information.

Modify and save XML files

The DOM allows creation or modification of a document in memory, as well as reading a document from an XML source file. SAX is designed for reading, not writing, XML documents. The DOM is the better choice for modifying an XML document and saving the changed document to memory.

Random access to data

If random access to information is crucial, it is better to use the DOM to create a tree structure for the data in memory.
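A minimal Java sketch of the modify-and-save case (element names are illustrative): parse into a DOM, mutate the in-memory tree, then serialize it back out. Because the whole tree is in memory, any node can be reached and changed at any time — the random-access property described above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

public class DomRoundTrip {
    public static String addPart(String xml) throws Exception {
        // Build the full in-memory tree.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Modify the document in place: append a new child to the root.
        Element root = doc.getDocumentElement();
        Element part = doc.createElement("PART");
        part.setTextContent("bolt");
        root.appendChild(part);

        // Serialize the modified tree back to a string.
        StringWriter out = new StringWriter();
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(addPart("<INVENTORY><PART>nut</PART></INVENTORY>"));
    }
}
```

The same round trip is impossible with SAX alone, which only reports events as it reads.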

3.2 SAX

SAX offers a lightweight alternative to the DOM. It facilitates searching through a huge XML document to extract small pieces of information, and it allows parsing to be aborted early once the desired piece of information is found. SAX was designed for any task where the overhead of the DOM is too expensive.

However, the performance benefits of SAX come at a price. In many situations, such as advanced queries, SAX becomes quite burdensome because of the complexity involved in managing context while processing. When this is the case, most developers either turn back to the DOM or use some combination of SAX and DOM together.

3.2.1 Guidelines

Usage of SAX for parsing XML documents is recommended in the following situations:

Large XML documents


The biggest advantage of SAX is that it requires significantly less memory to process an XML document than the DOM. With SAX, memory consumption does not increase with the size of the file. For example, a 100-kilobyte (KB) document can occupy up to 1 megabyte (MB) of memory using the DOM; the same document requires significantly less memory using SAX. If one needs to process large documents, SAX is the better alternative, particularly if changing the contents of the document is not required.

Abort parsing

Because SAX allows aborting of processing at any time, it can be used to create applications that fetch particular data. For example, one can create an application that searches for a part in inventory. When the application finds the part, it returns the part number and availability, and then stops processing.

Retrieving small amounts of information

For many XML-based solutions, it is not necessary to read the entire document to achieve the desired results. For example, if one wants to scan data for relevant news about a particular stock, it's inefficient to read the unnecessary data into memory. With SAX, the application can scan the data for news related only to the given stock symbols, and then create a slimmed-down document structure to pass along to a news service. Scanning only a small percentage of the document results in a significant savings in system resources.

Creating a new document structure

In some cases, one might want to use SAX to create a data structure using only high-level objects, such as stock symbols and news, and then combine the data from this XML file with other news sources. Rather than build a DOM structure with low-level elements, attributes, and processing instructions, one can build the document structure more efficiently and quickly using SAX.

DOM overhead is not affordable

For large documents and for large numbers of documents, SAX provides a more efficient method for parsing XML data. For example, consider a remote procedure call (RPC) that returns 10 MB of data to a middle-tier server to be passed to a client. Using SAX, the data can be processed using a small input buffer, a small work buffer, and a small output buffer. Using the DOM, the data structure is constructed in memory, requiring a 10 MB work buffer and at least a 10 MB output buffer for the formatted XML data.
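The abort-parsing and small-retrieval cases above can be sketched in Java with SAX. The element and attribute names (PART, name, number) are invented for the example; the key idea is that a private exception type carries the result out of the callback, so the rest of the document is never read.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class PartLookup {
    /** Thrown to abort parsing as soon as the wanted part is found. */
    static class Found extends SAXException {
        final String number;
        Found(String number) { super("found"); this.number = number; }
    }

    public static String findPartNumber(String xml, String wanted) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) throws SAXException {
                if ("PART".equals(qName) && wanted.equals(atts.getValue("name"))) {
                    // Abort immediately; parsing stops here.
                    throw new Found(atts.getValue("number"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Found f) {
            return f.number;
        }
        return null;  // part not found anywhere in the document
    }

    public static void main(String[] args) throws Exception {
        String xml = "<INVENTORY>"
                   + "<PART name=\"bolt\" number=\"B-12\"/>"
                   + "<PART name=\"nut\" number=\"N-7\"/>"
                   + "</INVENTORY>";
        System.out.println(findPartNumber(xml, "nut"));
    }
}
```

Memory use here is bounded by the parser's input buffer regardless of document size, which is exactly the trade-off the guidelines describe.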

4 Processor Technology - Framework

4.1 .NET

In .NET, XmlNode-based trees are built via underlying XmlReader streams. These trees, however, remain in memory until the client is finished with them. DOM trees are completely dynamic and can be traversed in a variety of ways. Most developers working with XML are familiar with the DOM API, since it has been around the longest. The DOM is considered the "easiest to use" by many developers, but that simplicity comes with a cost.

4.1.1 Guidelines

XmlReader Guidelines

If the data is in XML 1.0 format, there is no choice but to use XmlTextReader to process the byte stream. XmlReader is faster than the DOM. The methods used are ReadXmlSchema(), Load(), and LoadXml().


Writing XML Documents

For generating XML documents there are two API choices: XmlTextWriter and the DOM.

Unlike XmlTextReader, XmlTextWriter is actually more intuitive for many developers than the DOM, especially for those used to working with SAX. Using XmlTextWriter to generate a document feels a lot like the SAX ContentHandler interface, since a sequence of method calls represents a logical XML document. As with SAX, the key benefit of XmlTextWriter is that the resulting document doesn't need to be buffered in memory. Instead, it can be written directly into the output stream as the document is generated. This makes XmlTextWriter much more efficient than the DOM, and it is quite easy to use.

Choices | Pros | Cons

XmlTextWriter | Fastest; most efficient (not buffered); familiar to SAX developers | Cannot manipulate the document (forward-only stream)

DOM | More flexibility (in-memory); familiar to DOM developers | Slower and less efficient, since buffered in memory

Load Data into Memory

There are only two in-memory structures available in .NET, XmlDocument (the standard DOM implementation) and XPathDocument (a tree optimized for XPath/XSLT). Deciding between these mostly depends on performance, programming ease, and future extensibility as illustrated by this decision tree.


In terms of performance, XPathDocument offers XPath and XSLT optimizations. XPathDocument implements the XPathNavigator interface over its more efficient tree structure. XmlDocument, however, is typically easier to use since it implements both the standard DOM interfaces as well as XPathNavigator.

Also, writing code against XPathNavigator is generally a good way to take advantage of future extensions. Since XPathNavigator is much easier to implement than the DOM API, new and improved custom implementations are more likely down the road.

With XPathDocument, load performance does not scale well.

Use XML readers/writers, not an XmlDocument, for simple manipulations of very large documents.

Unnecessary use of CDATA sections will increase memory footprint.

Use XmlNameTables when doing extensive comparisons.

Use XmlDataDocument for XML/DataSet integration.

If the source of the XML is known, don't use the validating reader unless normal parsing fails.

Choices | Pros | Cons

XmlTextReader | Fastest; most efficient (memory); extensible | Forward-only; read-only; requires manual validation

XmlValidatingReader | Automatic validation; run-time type info; relatively fast and efficient (compared to DOM) | 2 to 3x slower than XmlTextReader; forward-only; read-only

XmlDocument (DOM) | Full traversal; read/write; XPath expressions | 2 to 3x slower than XmlTextReader/XmlValidatingReader; more overhead than XmlTextReader/XmlValidatingReader

XPathNavigator | Full traversal; XPath expressions; XSLT integration; extensible | Read-only; not as familiar as the DOM

XPathDocument | Faster than XmlDocument; optimized for XPath/XSLT | Slower than XmlTextReader


4.2 Java

4.2.1 Parsers

Guidelines

The following are some of the parsers available for parsing of XML files:

1. Crimson

2. Xerces

Of the two, Crimson performs better. Crimson is a straightforward implementation of an XML parser and has a small footprint: around 200 KB (jar file size). Xerces is more sophisticated and includes many additional features, such as XML Schema support. Xerces also comes with support for WML and HTML DOMs, which increases the size of the jar file significantly, to 1.5 MB. The Apache organization is in the process of refactoring Xerces and addressing performance in Xerces2.
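Neither parser needs to be named in application code. Under JAXP, the concrete implementation behind DocumentBuilderFactory is selected at runtime (for example via the javax.xml.parsers.DocumentBuilderFactory system property), so swapping Crimson for Xerces requires no code change. A minimal sketch:

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class PluggableParser {
    public static String rootName(String xml) throws Exception {
        // JAXP picks the concrete parser (Crimson, Xerces, or the JDK default)
        // at runtime; this code never references an implementation class.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootName("<CATALOG/>"));
    }
}
```

Because only the javax.xml.parsers and org.w3c.dom interfaces appear here, the same class runs unchanged against any compliant parser.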

4.2.2 API

Guidelines

Among the several existing Java APIs for processing XML files, JDOM is often the best choice. The API supports both DOM and SAX processing models. JDOM has been optimized for Java and, through its use of the Java Collections API, has been made straightforward for the Java developer. JDOM documents can be built directly from, and converted to, SAX events and DOM trees, allowing JDOM to be seamlessly integrated into XML processing pipelines, in particular as the source or result of XSLT transformations.

Use the interface, not the implementation

In interface-based models like the W3C DOM, each interface has an implementation in the vendor's release. For example, the DOM interface org.w3c.dom.Document is implemented by the Xerces class org.apache.xerces.dom.DocumentImpl. However, all too often, we see code like:

DocumentImpl doc = new org.apache.xerces.dom.DocumentImpl();