SeeBeyond ProServ Generic Document Template
SOA with JCAPS at EGL
Architecture Principles, Guidelines and Standards
Version 1.1
9/17/2007

Version Description
Author          Date Created/Last Update    Version
Xi Song         March 30, 2007              1.0
Oziel Hinojosa  August 29, 2007             1.1
Description
1.0 Initially created by Xi Song per Oziel's request.
1.1 Revisions
* Modified document for use by HCL developers, per OTM lead request.
* Added event driven principles.
* Added additional SOA guidelines.
* Added the following standards: Development, Coding Conventions, Exception Management, Code Management, Code Commenting, Code Review, Unit Test, System Integration Testing, XML, and JMS Settings.
Sign-Off Review
Resource | Reviewer | Date | Version | Signature
Functional Review
Technical Review
QA Manager
Developer
Project Manager(s)
Distribution
Organization | Name | Date | Version
TABLE OF CONTENTS
Version Description
Sign-Off Review
1 Introduction
1.1 Purpose of the Document
1.2 Intended Audience
1.3 Related Documentation
2 Architecture Principles
2.1 Service Oriented Architecture
2.1.1 Service Modularity and Loose Coupling
2.1.2 Service Security
2.1.3 Service Granularity
2.1.4 Exception Handling in Services
2.2 Event Driven Architecture
3 Guidelines
3.1 Layers of Services
3.2 Services are Discoverable and Dynamically Bound
3.3 Services have Network-Addressable Interfaces
3.4 Service Exposure
3.5 Use of SOA Patterns
3.6 Security Approaches
3.7 JCAPS Environments
4 Standards
4.1 JCAPS Project Structure
4.1.1 Non-ESB Project Structure
4.1.2 ESB Project Structure
4.2 Canonical Message Envelope
4.3 Common Services Framework (CSF) Customization
4.4 Development Standards
4.4.1 EAI Team Development Process
4.4.2 eGate Development
4.4.3 eInsight Development
4.5 Coding Conventions
4.5.1 JCAPS
4.5.2 Java Code
4.6 Exception Management Standards
4.6.1 Exception Levels
4.6.2 Exception Notification Channels
4.6.2.1 Alerts Agent
4.6.2.2 CSF
4.6.3 Logging Guidelines
4.7 Code Management Standards
4.7.1 Version Control
4.7.2 Migration Process
4.7.3 Backup Process
4.8 Code Commenting Standards
4.9 Code Review Standards
4.10 Unit Testing Standards
4.11 System Integration Testing Standards
4.12 XML Standards
4.13 JMS Settings
4.13.1 Message Properties
4.13.2 Client Properties
4.13.3 IQ Manager Runtime Configurations
4.13.3.1 Stable Storage
4.13.3.2 Journaling and Expiration
4.13.3.3 Throttling Properties
4.13.3.4 Special FIFO Mode Properties
4.13.3.5 Time Dependencies
4.13.3.6 Security
4.13.3.7 Diagnostics Page
4.13.3.8 Miscellaneous Page
4.13.3.9 Backing Up Topics and Queues
1 Introduction
This document outlines the architecture principles, guidelines and standards to be followed in future service oriented architecture (SOA) designs at Eagle Global Logistics (EGL) using Sun Microsystems' Java Composite Application Suite (JCAPS) as the development and integration framework. This set of principles, guidelines and standards is based on the SOA best practices provided by Sun and has been customized for EGL to accommodate its system and network infrastructure, organizational structure and business characteristics.
1.1 Purpose of the Document
The purpose of this document is to serve as a starting place for SOA architects and developers at EGL. By becoming familiar with the various design principles covered in this document and following the guidelines and standards listed, an architect can design services and systems that help shape and drive the SOA strategy at the enterprise level. Developers can use this document to make sure that the services they develop comply with the enterprise guidelines.
1.2 Intended Audience
The document is designed to guide EGL architects and developers as they work on future SOA projects using JCAPS. The reader is advised to become familiar with the principles, guidelines and standards described in this document and to apply them when designing JCAPS applications following SOA.

1.3 Related Documentation
Throughout this document, the reader is referred to the Implementation Guidelines documents found in RQ 3.1 and other documents for a more comprehensive discussion of the principles, guidelines and standards covered here. For details on the Java CAPS suite of products, refer to the User Guide of each product.

List 1.0 Document Location
Document Category | Location | Annotation
RQ - Sun Microsystems Repeatable Quality Integration Process | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\RQ\SOA RQ3.1\home.htm | Open the home.htm page to view the RQ via a web-based tool.
 | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\RQ\SOA RQ3.1\BestPractices | Best Practices documents
 | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\RQ\SOA RQ3.1\flash\implementation_guidelines\IG SOA Strategy.pdf | Base SOA architecture document.
EAI Team Development Process | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\EAITeamArchitecture.vsd | Open the EAIDevelopmentProcess.vsd file to view the development process flow.
TBC - Environment Security Document | | Mark Collins to develop
TBC - Production Deployment Manual | |
 | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\EAITeamArchitecture \EGL Docs - Integration with JCAPS - Naming Conventions.doc |
TBC - Alert Configurations | | Mark Collins to develop
Sun Fastrack Best Practices | \\eglsfps01\PM Office\ITProjects\100798 - See Beyond\SUN CAPS\EAITeamArchitecture\Mentoring\Java CAPS Core Components Security.ppt |
2 Architecture Principles

For business and IT decision makers, technical solutions must support event-based messaging between applications and application tiers using services-based integration and industry-standards-based deliverables. By establishing these architectures, EGL will provide its customers, business users, and organization the most flexible, scalable, integrated architecture possible.
2.1 Service Oriented Architecture

Having a service oriented architecture (SOA) implies pursuing business and technical strategies that promote the exposure of business functionality within and between enterprises in a consistent, published, secure and contractual way. The related document, IG SOA Strategy, describes and explains 17 SOA principles as follows:

I. Modularity
II. Interoperability
III. Loose coupling
IV. Support of orchestration and composability
V. Discoverability and dynamic binding
VI. Location transparency
VII. Security
VIII. Self Healing
IX. Versioning
X. Lease
XI. Network-addressability
XII. Coarse-grained interface
XIII. Metering
XIV. Monitoring and Control
XV. Exception handling
XVI. Separation of interface from implementation
XVII. Published quality of service

Many of these characteristics are automatically provided when using JCAPS as the development platform. Others are design decisions of which the architect must be aware. This section of the document reviews some of the SOA characteristics that are most relevant to EGL's business environment. Designing services that possess these characteristics therefore becomes the guiding principle for EGL's SOA architects.

2.1.1 Service Modularity and Loose Coupling

Two of the most important aspects of SOA are modularity and loose coupling. With SOA, an application can be decomposed into a set of smaller modules, each responsible for a single, distinct function within the application. Conversely, a modular service can be used in multiple applications to fulfill the same functionality. When a service is modularized, it maps directly to a distinct problem-domain function, is easy to understand, and has a limited and predictable impact on other services. Service modularity should be the first and foremost design principle in EGL's SOA effort.
In addition to modularity, services should be as loosely coupled as possible. In a loosely coupled system, a service consumer has only a few well known dependencies on the services it consumes, freeing it from the need to know unnecessary details of the service providers.
In the JCAPS environment, modularity and loose coupling mean a proper project structure and well-thought-out component placement. For example, common components should be placed in common project folders, and reference by name should be used to avoid artificial project dependencies.
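The modularity and loose coupling principles above can be sketched in plain Java. Everything here is illustrative: RateQuoteService, FlatRateQuoteService, and BookingConsumer are hypothetical names, not EGL or JCAPS components; the point is only that the consumer depends on an interface, never on the provider's details.

```java
// A modular service maps to one distinct problem-domain function.
interface RateQuoteService {
    double quote(String origin, String destination, double weightKg);
}

// One self-contained implementation; consumers never see these details.
class FlatRateQuoteService implements RateQuoteService {
    private static final double RATE_PER_KG = 1.25; // hypothetical tariff

    public double quote(String origin, String destination, double weightKg) {
        return weightKg * RATE_PER_KG;
    }
}

// The consumer is coupled only to the interface, so any implementation
// can be substituted without changing this class.
class BookingConsumer {
    private final RateQuoteService quotes;

    BookingConsumer(RateQuoteService quotes) {
        this.quotes = quotes;
    }

    double costOf(double weightKg) {
        return quotes.quote("HOU", "FRA", weightKg);
    }
}

public class LooseCouplingDemo {
    public static void main(String[] args) {
        BookingConsumer consumer = new BookingConsumer(new FlatRateQuoteService());
        System.out.println(consumer.costOf(100.0)); // 125.0
    }
}
```

Replacing FlatRateQuoteService with another implementation changes nothing in BookingConsumer, which is the "limited and predictable impact" the principle asks for.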
2.1.2 Service Security
EGL has made it one of its design principles to secure both the JCAPS framework and the applications developed with the framework. This includes centralizing user authentication using the enterprise's LDAP server, limiting access to JCAPS components using access control lists (ACLs), and securing all JCAPS-developed applications, including web services and web applications. It is also the EAI team's intention to enable single sign-on for all web applications developed using JCAPS and to secure all web services using Sun Java System Access Manager. Java CAPS Core Components Security.ppt provides a high-level overview of the security features in JCAPS development.

2.1.3 Service Granularity

The granularity of a service is a crucial design decision. If service granularity is not properly predicted, service consumers could have access to more functionality than they actually need. This can result in many problems, such as security concerns, increased network traffic, etc. The EAI team has decided to take a layered approach: services will be created at different levels with different granularities. When necessary, a coarse-grained service may be created using multiple fine-grained services. This approach adds some complexity to the system of services, but it allows better control over service security, performance, and functionality.

2.1.4 Exception Handling in Services

Considering exception handling in services should be a high priority. Exception handling in services promotes self-healing and helps minimize the impact of a failed service. It is expected that all services take advantage of the Alerting, Logging and Error Handling common services. This set of common services is provided by the Sun SeeBeyond Common Service Framework (CSF) and will be customized to suit EGL's exception handling strategies.
In addition, exception handling should be a required section in any service design document, and developers should make it a standard practice to review the error handling logic of the services they develop.

2.2 Event Driven Architecture

An event driven architecture (EDA) is an approach to system integration that calls for software applications and hardware devices to produce, consume, detect, and react to events, thus producing greater responsiveness in a system and allowing applications to react intelligently to changes in conditions. An event is a significant change in state. For example, when an item is picked up by EGL from the shipper's facility, the item state changes from "ready for pickup" to "picked up". An EGL system may treat this state change as an event to be produced, consumed, detected, or reacted to. This architectural approach is to be used when designing applications that need to perform activities in real time, activate long running services, participate in long running asynchronous business processes, spawn multiple processes or multiple services, and when building a messaging system.

Three processing styles of EDA exist: Simple, Event Stream, and Complex.

Simple Event Processing

Simple event processing concerns events that are directly related to specific, measurable changes of condition. In simple event processing, a notable event happens which initiates downstream action(s). Simple event processing is commonly used to drive the real-time flow of work, thereby reducing lag time and cost. For example, simple events can be created by a sensor detecting changes in tire pressure or ambient temperature.
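The simple-event style can be sketched as a producer notifying listeners of a notable state change, which then drives a downstream action immediately. All class names here are illustrative, not part of JCAPS or any EGL project.

```java
import java.util.ArrayList;
import java.util.List;

// A notable state change, e.g. an item moving to "picked up".
class ShipmentEvent {
    final String itemId;
    final String newState;
    ShipmentEvent(String itemId, String newState) {
        this.itemId = itemId;
        this.newState = newState;
    }
}

interface ShipmentEventListener {
    void onEvent(ShipmentEvent e);
}

// The producer detects the state change and notifies subscribers
// immediately, driving the downstream action in real time.
class ShipmentEventChannel {
    private final List<ShipmentEventListener> listeners = new ArrayList<>();
    void subscribe(ShipmentEventListener l) { listeners.add(l); }
    void publish(ShipmentEvent e) {
        for (ShipmentEventListener l : listeners) l.onEvent(e);
    }
}

public class SimpleEventDemo {
    public static void main(String[] args) {
        ShipmentEventChannel channel = new ShipmentEventChannel();
        channel.subscribe(e ->
            System.out.println("notify billing: " + e.itemId + " is " + e.newState));
        channel.publish(new ShipmentEvent("ITEM-42", "PICKED_UP"));
    }
}
```

In a JCAPS deployment the channel role would be played by a JMS topic or queue rather than an in-memory list; the sketch only shows the produce/consume shape of the style.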
Event Stream Processing

In Event Stream Processing (ESP), both ordinary and notable events happen. Ordinary events (orders, RFID transmissions) are screened for notability and streamed to information subscribers. Event stream processing is commonly used to drive the real-time flow of information in and around the enterprise, which enables in-time decision making.

Complex Event Processing

Complex event processing (CEP) allows patterns of simple and ordinary events to be considered to infer that a complex event has occurred. Complex event processing evaluates a confluence of events and then takes action. The events (notable or ordinary) may cross event types and occur over a long period of time. The event correlation may be causal, temporal, or spatial. CEP requires the employment of sophisticated event interpreters, event pattern definition and matching, and correlation techniques. CEP is commonly used to detect and respond to business anomalies, threats, and opportunities.

Principles
Responsive
XML Standards Based
Canonical

3 Guidelines

This section lists guidelines for architects on how to follow some of the principles described in section 2.

3.1 Layers of Services
At EGL, a layered approach is taken so that services are created with the proper granularity. From a high level, there are two categories of services: business services and technical services. Technical services are self-contained modules that provide low level functionality, whereas business services are relatively coarse-grained modules fulfilling a business function. In each category, multiple layers of services may exist, such that some higher-level services group multiple fine-grained services to provide a service with coarse-grained interfaces, for the purpose of finer control of functionality and security. In general, the following guidelines can be used when creating services in each category:

1. Use JCDs for low level services, especially those in the technical service group. Conversely, use business processes only for high level services.
2. When (and only when) it is not possible to predict the service consumer requirements, take a layered approach: create multiple fine-grained services and group them in layers to provide coarse-grained service. Keep in mind that such service composition does have an impact on service performance.
3. When creating a higher-level service using a group of fine-grained services, design the composite service in such a way that it requires as little knowledge of the component services as possible. Leave the binding of any service until runtime so that any component service can be replaced without affecting the composite service.

3.2 Services are Discoverable and Dynamically Bound

Services should be published to a registry so that service consumers can dynamically locate and bind themselves to the service. This promotes a more loosely coupled environment and allows service consumers to select the most appropriate service at run time.

1. Business processes shall publish their WSDL files to the UDDI registry.
2. Composed services shall publish their WSDL files to the UDDI registry.
   a. Finer grained services that act on technical requests shall be statically created, e.g. OTDs.
3. Service consumers shall locate service providers through the registry.
4. Service consumers shall not be aware of the service provider location.
5. Service consumers shall dynamically create the requests.
6. Service consumers shall dynamically bind to responses.
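The guidelines above can be illustrated with a minimal sketch. A real deployment would publish and locate services through the UDDI registry; the in-memory Registry below is only a stand-in, and TrackingService is a hypothetical name. What matters is that the consumer binds by name at run time and never learns the provider's location.

```java
import java.util.HashMap;
import java.util.Map;

interface TrackingService {
    String status(String shipmentId);
}

// Stand-in for the UDDI registry: providers publish, consumers locate.
class Registry {
    private static final Map<String, TrackingService> services = new HashMap<>();
    static void publish(String name, TrackingService s) { services.put(name, s); }
    static TrackingService locate(String name) { return services.get(name); }
}

public class DynamicBindingDemo {
    public static void main(String[] args) {
        // The provider publishes itself; its location stays hidden.
        Registry.publish("TrackingService", id -> "IN_TRANSIT");

        // The consumer binds at run time, so the provider can be
        // replaced without any change on the consumer side.
        TrackingService svc = Registry.locate("TrackingService");
        System.out.println(svc.status("SHP-1001")); // IN_TRANSIT
    }
}
```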
3.3 Services have Network-Addressable Interfaces

A service must have a network-addressable interface: a consumer must be able to invoke the service across the network, and the service shall be location independent.

1. JCAPS supports multiple communication protocols.
2. The exception to location independent services is between the ESB JMS components.

3.4 Service Exposure
In order to promote service reusability, the same service can be exposed via multiple protocols. The additional exposure often takes the form of an interface adaptor. However, some services are wrapped into a JCD providing an easy way for the service to be invoked in a business process. Different options exist for exposing the same service via multiple protocols:
1. Develop the service and expose it as a JMS service. Create another JCD or business process with a web service interface that invokes the first service.
2. Develop the service and expose it as a web service. Add another service (adaptor) that is exposed via a JMS interface but invokes the first service.
When both options are available, the first option has the added advantage of higher performance when the service is invoked by a JMS service consumer.
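The two exposure options can be sketched as a single core service with two protocol-facing entry points. CoreShipmentService, JmsShipmentEndpoint, and SoapShipmentAdaptor are hypothetical stand-ins for the JMS- and web-service-facing JCDs; a real implementation would use the JMS and web service OTDs rather than plain interfaces.

```java
// The service logic lives in one place.
interface CoreShipmentService {
    String translate(String shipmentXml);
}

// JMS-facing entry point (stand-in for a JCD's receive() method).
class JmsShipmentEndpoint {
    private final CoreShipmentService core;
    JmsShipmentEndpoint(CoreShipmentService core) { this.core = core; }
    String onMessage(String body) { return core.translate(body); }
}

// Web-service adaptor that invokes the same core service, so both
// protocols expose identical behavior without duplicating logic.
class SoapShipmentAdaptor {
    private final CoreShipmentService core;
    SoapShipmentAdaptor(CoreShipmentService core) { this.core = core; }
    String handleRequest(String body) { return core.translate(body); }
}

public class ExposureDemo {
    public static void main(String[] args) {
        CoreShipmentService core = xml -> "<EDI304>" + xml + "</EDI304>";
        JmsShipmentEndpoint jms = new JmsShipmentEndpoint(core);
        SoapShipmentAdaptor soap = new SoapShipmentAdaptor(core);
        // Both entry points produce the same result.
        System.out.println(jms.onMessage("<Shipment/>")
                .equals(soap.handleRequest("<Shipment/>"))); // true
    }
}
```

Because both entry points delegate to the same core, a JMS consumer takes the direct path (option 1's performance advantage) while web service consumers go through the adaptor.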
3.5 Use of SOA Patterns
When properly used, SOA patterns help promote reuse of code and design ideas, facilitate communication, and simplify code maintenance. The EAI team has decided to take advantage of SOA patterns and has started compiling a library of SOA patterns as references. It is also expected that the team will make it a standard practice to identify new patterns and add them to the library together with optimal solutions. The following guidelines can be used by architects and developers when working with SOA patterns:

1. Familiarize yourself with existing patterns to understand the patterns that are already built into JCAPS.
2. When wondering how to do something while developing a service, ask whether someone else has asked a similar question in a different project. In other words, ask whether the problem at hand falls into a known pattern and therefore already has a known solution.
3. Use the Sun CSF projects as examples.
4. Communicate to the entire team when a pattern is identified and an optimal solution adopted.

3.6 Security Approaches
The following approaches will be taken to secure the JCAPS systems and the applications developed with it:
1. Core components: refer to the environment security document on approaches to core component security.
2. Web services: web services will be secured with the following features supported by JCAPS:
a. User authentication
b. WS security
c. Transport layer security: SSL
3. Use of Sun Java System Access Manager: the AM will be used for single sign-on for web applications and for additional security features for web services. It should be noted that additional custom development is required for these features.

3.7 JCAPS Environments
Every developer should maintain their own JCAPS environment in eDesigner. When the following guidelines are followed, developers will be able to unit test their development with minimum interference:
1. Each developer maintains a JCAPS environment in eDesigner. Within this environment, a logicalhost should be mapped to a runtime logicalhost domain which is also maintained by the developer. The advantage of each developer maintaining his/her own logicalhost is that only the services of his/her interest need to be deployed to the logicalhost domain so that it can be run much more efficiently than if all developers share the same logicalhost for unit test.
2. If hardware resource availability allows, each developer should install and control their own Enterprise Manager and use it to manage their own logicalhost. The advantage, again, is ease of control and minimum interference.

3. The development test environment is coordinated by the team lead or the JCAPS administrator. A developer deploys a service to the test environment when, and only when, the service has passed unit testing. The end result is that the development test environment only hosts components that are in a working state, so that continuous system integration testing can be conducted.
4. The QA test environment and the production environment should be strictly controlled by the JCAPS administrator, who is solely responsible for any code promotion and service deployment into these environments. However, developers should provide the administrator detailed environment configuration information as well as service implementation notes, including JCAPS project dependencies, to help the administrator carry out these tasks. Refer to the production deployment procedure manual for additional information.

4 Standards

This section covers some of the standards to be followed in the design, implementation, and testing of future JCAPS projects at EGL.

4.1 JCAPS Project Structure
The EAI team has established a JCAPS project folder structure to facilitate service modularity and decoupling. Projects for non-ESB services are grouped by application, such that each application contains projects of business services and technical services. This grouping helps keep the project tree from growing too wide; it does not necessarily create a project dependency. Projects for ESB services are grouped individually, such that there is no demarcation between business services and technical services. This helps limit deployment complexities; naming conventions will help distinguish between the types of services. Common components used across the enterprise will have project folders named Common. For example, the OTDs representing the canonical message structure will be placed under a project named prjEnterpriseCommon. Another example of common objects is the CSF libraries.
Within this common project, all OTDs composing the canonical message structures are created and stored under another project folder named OTD. Projects that depend on these OTDs will reference the files rather than copying the OTD files to the local project structure. This approach creates a tight dependency between the enterprise objects and the more specialized services. This tight coupling enforces that one of two things occur: either the specialized services upgrade when there is a change to the enterprise objects and maintain backward compatibility, or the specialized services employ versioning so as to use older versions of the enterprise objects. Special care must be taken with the latter option.

4.1.1 Non-ESB Project Structure

Non-ESB services can be defined as components that provide application services that are either specific to the application or are enterprise business services embedded within the application.
Below is a sample project, prjClassMX, structure for non-ESB services.
4.1.2 ESB Project Structure
An ESB service can be defined as a component that provides an enterprise business service which is external to any legacy application. The service can be business or technical in nature. Below is a sample project structure, prjTranslateShipmentToEDI304, for an ESB service.
The above project forms a technical service. This technical service translates the enterprise shipment event structure into an EDI document. This technical service would be aggregated into a higher level service known as File Shipping Instructions. The objects in the project would be independent of any other projects except for the common enterprise projects.

4.2 Canonical Message Envelope

The EAI team has designed an enterprise-wide canonical message structure for SOA services. As shown below, this message envelope contains many useful fields that help facilitate message tracking, performance logging, and status reporting. The message envelope also alleviates the need to parse the payload message in each service consumer, allowing for better exception reporting when an un-marshalling exception is encountered with the data in the DataArea element. This technique also allows for improved performance in components that perform simple pass-through operations, such as routing and filtering components. The standard way for a service to use this structure is as follows:
1. Use the DataArea node for message payload and the ApplicationArea node for message enveloping.
2. An initiating service, the service that is the first in a transaction to use this message envelope, sets the source information in the message envelope. It also sets information in the message header.
3. Each service, after receiving a message with the canonical message envelope, should not alter the envelope in any way other than adding an instance to the Service node and setting the overall status (the Status node outside the Service node) on return. This is detailed below.

4. When processing a message, a service should add one and only one instance to the Service node, setting the service name and host name (where the service is running). The start time and the stop time should be set at appropriate times to record the time spent processing the message.
5. One or more statuses can be reported for each service. If statuses for multiple steps are to be reported, they should be added to the Service/Status node in the order the steps are taken.

6. On return, the overall status should be set. This status is checked by the service consumer to indicate the processing result of the service.
7. Service request and response data are passed via the DataArea node as message payload.
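The steps above can be sketched with the JDK's DOM and transformer APIs. The element names follow the steps in this section, but the authoritative structure is the OTD in prjEnterpriseCommon; the service name, host, and source values below are hypothetical placements, not the official layout.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class EnvelopeDemo {
    public static String buildEnvelope(String payload) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element msg = doc.createElement("Message");
        doc.appendChild(msg);

        // Step 2: the initiating service sets source info in the envelope.
        Element app = doc.createElement("ApplicationArea");
        msg.appendChild(app);
        Element source = doc.createElement("Source");
        source.setTextContent("prjClassMX"); // hypothetical initiating service
        app.appendChild(source);

        // Step 4: each service adds exactly one Service instance,
        // carrying its name and host.
        Element service = doc.createElement("Service");
        service.setAttribute("name", "TranslateShipment");
        service.setAttribute("host", "localhost");
        app.appendChild(service);

        // Step 6: the overall status, outside the Service node,
        // checked by the consumer on return.
        Element status = doc.createElement("Status");
        status.setTextContent("SUCCESS");
        app.appendChild(status);

        // Steps 1 and 7: the payload travels untouched in DataArea.
        Element data = doc.createElement("DataArea");
        data.setTextContent(payload);
        msg.appendChild(data);

        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildEnvelope("<Shipment id='SHP-1001'/>"));
    }
}
```

Note that setTextContent escapes the payload markup; a real service would graft the payload's DOM under DataArea instead of embedding it as text.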
4.3 Common Services Framework (CSF) Customization

To facilitate consistent use of CSF, the EAI team has developed a CSF client package that customizes the CSF client API provided by Sun.
The customizations include:
1. Standardization on a set of ALE codes and publication of a guideline on the use of these codes.
2. Selection of the appropriate CSF Client APIs or convenience APIs to use.
3. Implementation of a set of customized ALE APIs for use within JCAPS and by stand-alone Java clients. This layer customizes the ALE reporting logic in a centralized module so that all services use the ALE services in a consistent way.
Each developer will use only the customized APIs as described in step 3 above. The end result for the developers is a set of much simpler APIs to use and consistent Alert, Logging, and Error (ALE) reporting across the enterprise. The values used to report exceptions are described in the Exception Management Standards section of this document.
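The report-then-rethrow convention used by the CSFLogger class described below can be sketched as follows. The CSFLogger and CSFException classes here are minimal stand-ins for illustration only (the real API ships in prjEGLCommon\lib\eaiUtil.jar); the point is the control flow: report the exception via CSF, then let it bubble out of the trigger method so the JMS message is rolled back.

```java
class CSFException extends Exception {
    CSFException(String m, Throwable c) { super(m, c); }
}

// Stand-in for the real eaiUtil CSFLogger, here only to show call flow.
class CSFLogger {
    void logError(String code, String detail) {
        System.err.println("CSF " + code + ": " + detail);
    }
}

public class JcdSketch {
    private final CSFLogger csf = new CSFLogger();

    // Stand-in for a JCD trigger method.
    public void receive(String message) throws CSFException {
        try {
            process(message);
        } catch (IllegalArgumentException e) {
            csf.logError("EGL-1001", e.getMessage()); // report first...
            // ...then rethrow so the core JCAPS classes roll the
            // current JMS message back.
            throw new CSFException("rollback requested", e);
        }
    }

    private void process(String message) {
        if (message == null || message.isEmpty())
            throw new IllegalArgumentException("empty message");
    }

    public static void main(String[] args) {
        try {
            new JcdSketch().receive("");
        } catch (CSFException e) {
            System.out.println("rolled back: " + e.getMessage());
        }
    }
}
```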
The CSFLogger class has been created to simplify the interface to the CSF API. It handles CSF related object creation and wraps a subset of common ALE methods. Calls to the public constructor and methods may throw a CSFException, which should be thrown up the stack to the core JCAPS classes to allow a request to be automatically rolled back. As a guideline to our exception handling practices, when an exception bubbles up to the JCD trigger method (e.g., receive()), it is interpreted as an action to roll back the current JMS message. To start using the class, add the jar from the Repository (located at prjEGLCommon\lib\eaiUtil.jar) to the classpath of your JCD. For more detailed information about this class, please follow the hyperlink at the beginning of this paragraph.

4.4 Development Standards
4.4.1 EAI Team Development Process

The EAI development process is outlined in the following Visio document, EAI Development Process. The process outlined in the EAI Developer Responsibility band shows the activities that an EAI developer must execute in order to develop integration components. The development process starts with high level design using UML and ends with unit testing. Each activity in the Responsibility band also lists the deliverables an EAI developer must produce before exiting an activity; below the name of the activity is a reference to the deliverable. The deliverables themselves are referenced in the Reference Documentation band, where the document icons have links to the actual deliverables. In most cases a deliverable is a document template a developer must fill in. The other bands show dependencies on activities that should be performed by a different team, best practices workshops an EAI developer must be familiar with before developing integration components, and prerequisite items an EAI developer must have in order to gain access to the development environment. Below is a list of the additional bands and their purpose.

1. Pre-Requisite
a. Communicate that an individual wishing to use JCAPS for development must have had formal training, be familiar with SOA, RQ, integration patterns, and business process modeling, have read this document, and understand the environments onto which JCAPS is installed.
2. Other team member responsibilities
a. List the activities and artifacts that other team members must produce for an EAI developer.
i. An EAI developer can be that other team member, but the activities listed in this band are not the focus of EAI Development.
3. Mentoring
a. Best practices workshops a team member must have attended, or whose guidelines the team member must be familiar with.

For large enterprise-wide projects, follow the RQ Methodology at one's discretion. Documents of importance to EAI team members are the Component Specification, Composite Application Component Architecture, Composite Application Deployment Architecture, and SOA System Architecture.

4.4.2 eGate Development
Follow the link provided below to familiarize yourself with Sun's best practices for e*Gate development: BP eGate Development.

4.4.3 eInsight Development
At the time of this writing, eInsight development was in its infancy at EGL, so no development standards are available. Sun has provided best practices for developing eInsight solutions; please refer to the following documents for additional information: eInsight_UG.pdf and CAPS_Deployment_Guide.pdf.

Note: using correlation in eInsight components removes the ability to scale and provide failover. Careful consideration should be taken before using correlation.

4.5 Coding Conventions
4.5.1 JCAPS

The EAI team has published a naming standard guide based on Sun's best practices. This guide, EGL Docs - Integration with JCAPS - Naming Conventions, is to be followed by developers when adding components to the JCAPS repository.
4.5.2 Java Code

The EAI team will use Sun Microsystems' coding convention guidelines. This guide, Code Conventions for the Java Programming Language, is to be followed by developers when constructing integration components.
4.6 Exception Management Standards
The following sections outline the standards for exception management to be used by JCAPS developers. As a general rule, a developer must become familiar with Sun's best practices on exception management (see BP Error Handling Guidelines), as that document forms the basis for the exception management standards listed herein. Also refer to Sun's Java standards on throwing and catching exceptions within Java code. Business processes, and business rules in particular, should also follow the standards listed here. This document does not cover compensation activities, since compensation deals with rolling back to a previous state, not with handling and reporting errors within components. This section provides information on the following topics: exception levels and exception notification channels. The aim is to provide a comprehensive set of standards for developers to follow when developing JCAPS components and integrating with the CSF.

A few rules before proceeding:

Rule: all exceptions will be reported to the CSF.

Rule: no interface is to be turned off without manual involvement by operations. No code shall be developed to shut down a component without requiring user validation first.

Rule: the CSF is an integral part of a component. All interactions with the CSF must complete successfully; if not, the component shall roll back the transaction.
Rule: do not create infinite loops. Always remove messages that do not comply with the message structure (i.e., un-marshal exceptions).

Rule: always catch finer grained exceptions first and deal with them. Do not have only a single catch-all exception clause; having a single catch will lead to termination.

Rule: if a system exception is thrown by the CSF, operations must be alerted and the issue resolved immediately.
Rule: proactively take measures to prevent system failures. For example, if you expect rows from a query, then check that rows exist before executing the next method, or raise a business rule exception.

Rule: do not handle unchecked exceptions, as they are bugs in the code. Report them to the CSF and re-throw the exception.

4.6.1 Exception Levels

Below are the exception levels to be used by JCAPS developers. These levels are to be used to categorize the exceptions that can be thrown within JCAPS components. Each exception will have an error code associated with it, and each error code will be assigned an exception level.
For a list of error codes and descriptions, follow this link: #########.

Exception Levels
FATAL

Exceptions with this level represent unrecoverable system failures. The majority of these exceptions will be raised through system-level mechanisms such as alerts. Types of system failures include out of memory, shutdown requests on components, failure to start, message too large for the queue, etc. This level requires immediate attention and resolution.

CRITICAL

Exceptions with this level represent either program bugs or requests that cannot be processed because of an invalid message structure or invalid types. Service-not-available exceptions also fall into this level, e.g. when the client encounters timeouts, incorrect security credentials, login failures, I/O failures, etc. Business rule validations and failed transformations do not fall into this level.

This level requires immediate attention and resolution. The unit of work that the component was working on at the time the exception occurred must be stopped, removed, and made available for re-processing. A new unit of work must commence.

ERROR

Exceptions with this level represent business rule violations and transformation failures. The interface design will determine whether the unit of work should terminate entirely or just report the error and continue processing the remaining lines in the unit. If multiple business rule violations are encountered within a unit of work, report all violations as a single transaction (???). Messages can be made available for re-processing, but typically they are not.

WARNING

Exceptions with this level warn that an exceptional condition could occur because system resources are starting to deteriorate, or because components have identified that a business policy could be violated. Examples include queues reaching their thresholds, memory reaching a predetermined limit, requests for services not occurring in a timely manner, etc.

4.6.1.1.1 Error Code Groups

Error codes will be classified into groups to provide better error reporting and recognition.
Database
Programming Language
Software System
Hardware System
Business Rule Violation
Message Format
Request Format
Unknown (new Java programming errors that were not accounted for)
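The groups above lend themselves to a mechanical encoding: a group prefix embedded in each error code lets the CSF or a reporting tool classify codes without a lookup table. The prefixes below are illustrative assumptions only; the authoritative code list is maintained at the link referenced earlier.

```java
// Illustrative error-code grouping. The prefixes are assumptions;
// the authoritative code list is maintained separately.
public enum ErrorGroup {
    DATABASE("DB"),
    PROGRAMMING_LANGUAGE("PL"),
    SOFTWARE_SYSTEM("SW"),
    HARDWARE_SYSTEM("HW"),
    BUSINESS_RULE_VIOLATION("BR"),
    MESSAGE_FORMAT("MF"),
    REQUEST_FORMAT("RF"),
    UNKNOWN("UN"); // new errors not yet accounted for

    private final String prefix;

    ErrorGroup(String prefix) { this.prefix = prefix; }

    /** Classifies an error code such as "DB-1042"; unrecognized
     *  prefixes fall into UNKNOWN for later research. */
    public static ErrorGroup classify(String errorCode) {
        for (ErrorGroup g : values()) {
            if (errorCode != null && errorCode.startsWith(g.prefix + "-")) {
                return g;
            }
        }
        return UNKNOWN;
    }
}
```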
Unknown errors should be researched and classified in a timely manner so that the system can be configured to properly report the error.

4.6.2 Exception Notification Channels
This section outlines the means that will be used to communicate exception information, as well as the information required for exception reporting.

As stated earlier in this document, the Common Services Framework is to be employed for alerting, logging, and error reporting (ALE) functions. This is where all exceptions will be recorded, providing a centralized area for error mining, escalation procedures, and resolution details. Along with the CSF, JCAPS developers will employ the alert agent to capture and communicate severe system conditions to the CSF and to system monitoring tools via SNMP.

4.6.2.1 Alert Agent

The alert agent will be enabled. All JCAPS projects will be monitored via the alert agent for system-level notifications. Notifications that require resolution or indicate a severe exceptional condition will be made available to the CSF and system monitoring tools.
The alerts will be sent to the CSF via JMS and SNMP to system monitoring tools.
The predefined alert levels to be monitored and reported are listed below. For specific alerts and the actions to be taken, see the system administrator.

Fatal - all alerts with a level of FATAL will be transmitted to the CSF and system monitoring tools.
Critical - all alerts with a level of CRITICAL will be transmitted to the CSF and system monitoring tools.
Major
Minor
Warning
4.6.2.2 CSF
This section outlines the information that is to be captured for CSF reporting.

Error Codes (Configured)
All exceptions that are thrown will have an associated custom error code. All alerts that are raised will also have a custom error code.
The Message Identification should be populated by the transaction identifier.
Application information should be populated from the application area of a transaction.
The payload of a transaction needs to be captured.
4.6.3 Logging Guidelines

All consumers will log to the CSF the time at which they receive a transaction. This logging is independent of the business logic; i.e. it forms its own unit of work.

4.7 Code Management Standards
4.7.1 Version Control
The Sun Java Composite Applications Platform Suite provides developers with tools to track different versions of their application's components. A release engineer may then take a snapshot of the workspace that includes a version of each component, and then can use that snapshot to create a release for deployment.
The JCAPS versioning system is based on CVS and, as such, offers similar functionality. Components within a project or an environment may be checked out, ensuring that only the developer who checked the component out can make changes. These changes may then be committed by checking the component in, or discarded by undoing the checkout. When components are checked in, the developer may provide a description of the change. A history of these change descriptions can be retrieved on any versioned component. A much more detailed discussion of the versioning features is provided in the eGate Integrator User's Guide.
The versioning system supports tagging of components. Tagging is simply a way of identifying a version of each component with a common name, which is useful when identifying which versions to include in a release. In typical environments, releases will be tagged using the following format:
M-m-p-stage
Where:
M - The major version number. This should be incremented when a release contains significant new features or functionality.
m - The minor version number. This should be incremented when a release contains bug fixes and less significant new functionality.
p - The patch number. The patch number is incremented when changes are made to a release that has been deployed to the production environment.
stage - The release stage. This is one of alpha, beta, or fcs.
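The tag format lends itself to mechanical validation, e.g. in a pre-tag check run by the release engineer. The class below is a sketch of the convention just described; it assumes stage names are lower-case words (alpha, beta, fcs) and is not part of any JCAPS tooling.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of a validator for the M-m-p-stage release tag convention.
public final class ReleaseTag {
    // Major, minor, and patch are non-negative integers; the stage is a
    // lower-case word (alpha, beta, fcs, ...).
    private static final Pattern TAG =
            Pattern.compile("(\\d+)-(\\d+)-(\\d+)-([a-z]+)");

    public final int major, minor, patch;
    public final String stage;

    private ReleaseTag(int major, int minor, int patch, String stage) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
        this.stage = stage;
    }

    /** Parses a tag such as "1-2-0-alpha"; rejects anything else. */
    public static ReleaseTag parse(String tag) {
        Matcher m = TAG.matcher(tag);
        if (!m.matches()) {
            throw new IllegalArgumentException("bad release tag: " + tag);
        }
        return new ReleaseTag(Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)), m.group(4));
    }
}
```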
All development is done on the trunk, which is labeled HEAD in the JCAPS repository. Developers make changes to the project components until the application is ready for deployment to the alpha environment. At this time, all components are checked in and the release engineer tags the current versions of all components with the M-m-0-alpha tag, where M and m correspond to the major and minor release numbers of the release being built. The release engineer then builds the application and deploys it to the alpha environment for testing. When all changes needed to fix bugs found in alpha are checked in, the release engineer tags the current version of each component with the M-m-0-beta tag. Again, the application is built and deployed into the beta environment. As with alpha, changes are made to fix any bugs found during beta testing, and are checked in. The release engineer then tags the components with the M-m-0-fcs tag, builds the application, and deploys it into production.

Once an application has been deployed into production, development begins on the next release. If a change is needed for a release that is already in production, a branch must be created. The branch will be named after the release that it corresponds to (e.g. 1-0). Changes for the patch are checked into the branch and are tagged with alpha and beta tags, as is done for a regular release. If developers would like to make changes for a future release, a prototyping branch will be created by the release engineer. Changes for the future release are checked into the prototype branch until that release becomes the current release. At that point, the versions on the prototype branch will be merged into the main trunk and regular release versioning will be used.

NOTE: An exported project does not retain its versioning information, so do not export a project if you require versioning information in the repository to which it is restored.
4.7.2 Migration Process
Build Procedures
During development, developers will build the project as needed with the Enterprise Designer tool.
For Alpha, Beta and Production releases, a release engineer will build the project using the versions of the components labeled with the appropriate release tag. The release engineer may build the project via CommandLineCodegen tool or Enterprise Designer build functionality.
A package is then created for the application that includes the ear file, an export of the project, and enough information about the development environment that it could be recreated, if necessary.

Export and Import
Projects may be migrated from one repository to another by exporting from one and importing into the other. Since exported projects do not retain their versioning information, however, it is recommended that projects only be migrated when past versioning information is not needed. For example, migrating a project from the Sandbox [used for experimentation] to the Development repository in order to build upon a prototype would be a suitable migration path. Migrating a CSF project from the CSF repository to the Development repository would also be suitable if the CSF project will only be used by a development project, and not further developed.
4.7.3 Backup Process
Repository Back-up and Restoration
The Development environment's repository may be backed up daily via a cron-scheduled backup script executed between and .

[Or] A backup of the repository may be restored manually by executing the restore script bundled with the JCAPS repository.
Project Back-up
Individual projects in the Development environment's repository may be backed up via a cron-scheduled backup script executed between and . A configuration file will be maintained that lists the projects that are to be exported from the repository for backup.

A backup of a project may be manually restored by executing the project import script that is bundled with the JCAPS repository.
Provided by HCL.
4.8 Code Commenting Standards
Software documentation exists in two forms, external and internal. External documentation is maintained outside of the source code, such as specifications, help files and design documents. Internal documentation is composed of comments that developers write within the source code at development time.
Following are the recommended commenting techniques:
When modifying code, always keep the commenting up to date.
At the beginning of every routine, it is helpful to provide standard, boilerplate comments indicating the routine's purpose, assumptions, and limitations. A boilerplate comment should be a brief introduction explaining why the routine exists and what it can do.
Avoid adding comments at the end of a line of code; end-line comments make code more difficult to read. However, end-line comments are appropriate when annotating variable declarations. In this case, align all end-line comments at a common tab stop.
Avoid clutter comments, such as an entire line of asterisks. Instead, use white space to separate comments from code.
Avoid surrounding a block comment with a typographical frame. It may look attractive, but it is difficult to maintain.
Prior to deployment, remove all temporary or extraneous comments to avoid confusion during future maintenance work.
If you need comments to explain a complex section of code, examine the code to determine if you should rewrite it.
Use complete sentences when writing comments.
Comment as you code, because most likely there won't be time to do it later.
Avoid the use of superfluous or inappropriate comments, such as humorous sidebar remarks.
Use comments to explain the intent of the code.
To prevent recurring problems, always use comments on bug fixes and work-around code, especially in a team environment.
Use comments on code that consists of loop and logic branches.
Throughout the application, construct comments using a uniform style, with consistent punctuation and structure.
Provided by HCL.

4.9 Code Review Standards
Please refer to the code review template PSPF018-PR Code Review template.
Provided by HCL.

4.10 Unit Testing Standards
The test plan covers unit testing, integration testing, load testing, and production testing. It is the developer's job to fill out the unit testing section; the remaining sections should be filled out by a test engineer. The test plan should be updated continually as the project moves into the validation phase.
Besides test cases, a test plan should also document the entire procedure for setting up the test environment and conducting the test. Each test case should include the following information: a reference number, a use case reference number, input, and output.
Tip: Test data should cover all possible scenarios, not just the happy path.
Automated test cases are developed during the implementation phase of the project. They include unit tests, end-to-end integration tests, and load/performance tests. Furthermore, for each bug that is fixed in the system, a test case must be created that exposes the bug and then verifies the fix.
As the name implies, all tests must be automated, and test data retrieval should be automated as well. A person should be able to execute all test cases with a single mouse click.
For each SeeBeyond business process, create a corresponding test business process. For each SeeBeyond deployment, there must be a corresponding test deployment. Use a JMS receiver at the top level to trigger all tests, and keep the test cases updated. It may seem time-consuming to use automated test cases, but they are extremely useful: once the testing infrastructure is in place, it is easy to add more test cases and execute them. Automated test cases are used to validate and certify a system.
Please refer to the template ENGT559-Unit_Test_Case_Identification.
Provided by HCL.

4.11 System Integration Testing Standards
Please refer to the SIT plan template ENGT560-System_Integration_Test_Plan.
Provided by HCL.

4.12 XML Standards
Please refer to the XML standards/guidelines ENGG010-XMLGuidelines.
Provided by HCL.

4.13 JMS Settings

This section outlines the relevant configuration settings available to JMS message servers and JMS clients, and the recommended standard settings. The standard settings provided herein are based on Sun recommendations. JCAPS architects and developers must size a component's capacity requirements and adjust the settings to meet the processing needs of the solutions they will be providing. Most of the settings are design-time decisions; the remainder are runtime decisions.
For additional information on how to best configure a JMS provider, see the following links: Java CAPS Best Practices Workshop_Performance.ppt and Java CAPS Best Practices Workshop_Scalability and HA.ppt.

As a standard, the redirect and redelivery options must always be employed. The recommended values are listed below in the Client Properties section.

JMS clients provide the local messaging protocols, such as message persistence and delivery semantics, for messages being propagated between Project components. Together with the JMS message server, they constitute a JMS provider. JMS clients are of two basic types: producers and consumers (or a combination of both). If associated with a queue, these become queue senders and receivers, respectively. If associated with a topic, they become topic publishers and subscribers, respectively.

4.13.1 Message Properties

The default message properties should be used; the exception is when using the request/reply features of JMS, whose settings are design-time decisions. At this time it is not recommended to use the request/reply model for JMS. If synchronous behavior is desired, a web service call should be employed instead. All properties at the message level are design-time decisions.
Header properties: CorrelationID, CorrelationIDAsBytes, DeliveryMode, Destination, Expiration, MessageID, Priority, Redelivered, ReplyTo, Timestamp, Type

Additional outbound properties: deliveryMode, priority, timeToLive, destination, MessageServerURL
4.13.2 Client Properties
Client properties are applicable either to a consumer or to a producer, but not both. With the exception of the redirect and redelivery properties, all other client properties are design-time decisions.
Root
- Durable Subscriber Name: naming convention; applies only to Topic subscribers.

Basic
- Concurrency:
  - Service Oriented Integration: Connection Consumer (asynchronous processing only). The Connection Consumer setting is a recommendation. For queues, it is also possible to use a connection consumer for concurrent processing on multiple CPUs (and application servers) on a system. It is used with the IQ Manager's delivery mode settings. The eGate Integrator JMS implementation enables you to configure topic subscribers as connection consumers to improve message throughput through concurrent processing.
  - Information Oriented Integration: design-time decision.
  - Business Process Oriented Integration: design-time decision.
- Delivery mode, Idle timeout, Maximum pool size, Maximum wait time, Message selector, Priority, Steady pool size, Transaction mode: design-time decisions.
Redelivery
- Delay: 2:1000 ; 3:3000 ; 5:move($_DeadLetter). The progressive delay setting should be set at the JMS manager level (???). You can specify the actions to be taken after a message has been rolled back by appending a redelivery-handling string to the message server URL when you configure the JMS IQ Manager. These actions then override the default actions for all JMS clients interacting with the JMS IQ Manager.
- Move/Delete After N Times: 5
- Action: move
- Move to Queue/Topic: Auto
- Move to Destination Name: $_DeadLetter
Advance
- Durability: applies only to Topic subscribers.
- Server session batch size: 1
- Server session pool size
4.13.3 IQ Manager Runtime Configurations

The IQ Manager properties listed below are runtime configurations for the JMS manager residing within an integration server. Properties include message delivery order, tuning configurations, journaling options, and diagnostic options. Some properties work in conjunction with client properties and affect the behavior of consumers and producers. For detailed information about the properties listed below, see the eGate JMS Reference Guide.

Please note that modifying some of these properties requires that the integration server be stopped and restarted for the new values to take effect. Some changes may require that dependent or underlying objects be deleted and recreated.

4.13.3.1 Stable Storage
The stable storage configurations are used to specify the JMS database location, database size and record size.
It is preferable that the database be located on a high-performance storage device, separate from the software components.
Stable Storage
- Data Directory: for now, use the default.
- Block Size
- Segment Size: design-time decision. Be aware that if a message is larger than the segment size, the message will be discarded, and no error will be returned to the JCD. The Alert option must be enabled so that any alert message reporting this issue is trapped and a system administrator is notified immediately.
- Minimum Number of Segments: design-time decision.
- Maximum Number of Segments: resource capacity.
- Sync To Disk
4.13.3.2 Journaling and Expiration

These configurations control message journaling and message expiration.
Journaling & Expiration
- Enable Message Expiration: Enabled
- Maximum Lifetime: the default is 30 days; the hardware resources must be capable of supporting this.
- Enable Journal
- Journaling Maximum Lifetime
- Journal Directory
4.13.3.3 Throttling Properties
These properties are used to manage the memory and disk resources needed by the IQ Manager. They should be set based on system capacity and readjusted if system capacity increases or decreases.
Throttling
- Per-Destination Throttling Threshold: 100,000
- Server Throttling Threshold: 100,000 x number of producers. Ensure that disk space is sufficient to support this requirement.
- Throttling Lag: 25,000
4.13.3.4 Special FIFO Mode Properties
The default values for these properties should be used. They should only be modified if message sequence is a key requirement.

FIFO
- Fully Serialized Queues
- Fully Concurrent Queues
- FIFO Expiration Time
4.13.3.5 Time Dependencies
These properties are used to group queues and topics together.
Time Dependencies
- Time Dependency Topics
- Time Dependency Queues
4.13.3.6 Security
Security must be enabled at all times.

Security
- Require Authentication: Enabled
- Default Realm
- Enable File Realm
- Enable Sun Java System Directory Server
- Enable Microsoft System Directory Server: Enabled
- Enable Generic LDAP Server
4.13.3.7 Diagnostics Page
Only log errors and above.

Diagnostics
- Logging Level: ERROR
- Logging Level of Journaler
- Maximum Log File Size
- Number of Backup Log Files
4.13.3.8 Miscellaneous Page
Alerting is always enabled.
Miscellaneous
- Enable Alert: Enabled
4.13.3.9 Backing Up Topics and Queues
Backups must be performed daily on JMS Managers.
ENGT559_20
EAI - Unit Test Case Identification
Version No.:
Date:
Project Name:
Project Code:
Copyright Notice
This document contains proprietary information of HCL Technologies Ltd. No part of this document may be reproduced, stored, copied, or transmitted in any form or by means of electronic, mechanical, photocopying or otherwise, without the express consent of HCL Technologies. This document is intended for internal circulation only and not meant for external distribution.
Revision History
Version No
Date
Prepared by / Modified by
Significant Changes
Glossary
Abbreviation
Description
1 Project Information
Client:
Project:
Version:
Project Code:
Test Scenario Name:
Test Scenario ID:
Unit / Integration Test Plan ID:
Assumptions / Dependencies: Existing Customer Details
Interface Name
Connection Model Name
Testing Date
Tested by
Type of Test
F - First Time; R - Retest
**Flow/Activity Name
Input Event
Output
Event
Result
Reason If failed
Resolution
Prepared By
Date
Reviewed By
Date
Approved By
Date
** Flow/Activity applies not only to each Flow and Activity of a connection model, but also to a uni-directional connection model/adapter as a whole.
HCLT Confidential
Page 3 of 5
[Embedded attachment: PSPF018 code review workbook, containing "Review summary" and "Offline Defect Log" worksheets. The review summary captures project details, work product type and size, review objectives (completeness, correctness, compliance to standards), issue counts by severity (major/minor/trivial) and status (open/closed/deferred), review and rework effort, and sign-off. The defect log records each issue's location, description, validity, severity, phase introduced, type, class, cause of defect, and status.]
XML Guidelines
Version No. : 2.0
Date: 30-Dec-2005
Corporate Quality Team HCL Technologies Ltd
PM Towers 37, Greams Road
Chennai 600 006.
Copyright Notice
This document contains proprietary information of HCL Technologies Ltd. No part of this document may be reproduced, stored, copied, or transmitted in any form or by means of electronic, mechanical, photocopying or otherwise, without the express consent of HCL Technologies. This document is intended for internal circulation only and not meant for external distribution.
ENGG010_20 XML Guidelines
HCLT Confidential Page 2 of 40
Revision History
Version No. Date Prepared by / Modified by Significant Changes
2.0 30-Dec-2005 CTWG HCL Logo modified.
1.1 07-Jul-2003 Corporate Quality Modified the address of Corporate Quality Team in the first page.
1.0 15-May-2003 Engg. TWG OMS Phase 2 release.
Glossary
Abbreviation Description
HCLT HCL Technologies Ltd.
VER Version
DTD Document Type Definition
XML Extensible Markup Language
XSL Extensible Style Sheet Language
XSLT Extensible Style Sheet Language Template
SAX Simple API for XML
DOM Document Object Model
Table Of Contents
1 Introduction.......................................................................................................................................5
1.1 Purpose ....................................................................................................................................5
1.2 Scope .......................................................................................................................................5
1.3 Intended Audience ...................................................................................................................5
2 Data Description Technology..........................................................................................................5
2.1 XML DTD..................................................................................................................................5
2.1.1 Attributes vs Elements................................................................................................5
2.1.2 Miscellaneous.............................................................................................................6
2.2 XML Schema............................................................................................................................9
2.2.1 Naming Convention....................................................................................................9
2.2.2 Declaration .................................................................................................................9
2.2.3 Namespaces ............................................................................................................10
2.2.4 Versioning Schemas ................................................................................................10
2.2.5 Miscellaneous...........................................................................................................10
3 Processor Technologies Parsers Types ...................................................................................11
3.1 DOM .......................................................................................................................................11
3.1.1 Guidelines.................................................................................................................12
3.2 SAX ........................................................................................................................................12
3.2.1 Guidelines.................................................................................................................12
4 Processor Technology - Framework ............................................................................................13
4.1 .NET .......................................................................................................................................13
4.1.1 GuideLines ...............................................................................................................14
4.2 Java........................................................................................................................................18
4.2.1 Parsers .....................................................................................................................18
4.2.2 API............................................................................................................................18
4.2.3 Serialization..............................................................................................................20
4.2.4 Access Methods .......................................................................................................20
5 Transformation Technologies .......................................................................................................21
5.1 GuideLines .............................................................................................................................21
5.2 XPath......................................................................................................................................30
5.2.1 Guide Lines ..............................................................................................................30
6 Linking technologies......................................................................................................................33
6.1 XPointer..................................................................................................................................33
6.1.1 Guidelines.................................................................................................................33
7 XML Database .................................................................................................................................35
7.1 Data versus Documents.........................................................................................................35
7.1.1 GuideLines ...............................................................................................................35
7.1.2 Example....................................................................................................................35
7.2 Mapping Document Schemas to Database Schemas ...........................................................37
7.2.1 GuideLines ...............................................................................................................37
7.3 Query Languages...................................................................................................................37
7.3.1 GuideLines ...............................................................................................................37
7.4 Native XML Database ............................................................................................................37
7.4.1 GuideLines ...............................................................................................................37
7.5 Storing data from XML documents in traditional databases ..................................................38
7.5.1 GuideLines ...............................................................................................................38
ENGG010_20 XML Guidelines
HCLT Confidential Page 5 of 40
1 Introduction
1.1 Purpose
The purpose of programming standards is to support the development of applications that are consistent and well written. Standards and guidelines help developers to create a code base with a uniform presentation, which leads to code that is easy to understand, easy for other developers to use, and easy to maintain. Standards and guidelines also help developers to avoid the common pitfalls of XML, leading to code that is robust, consistent, reliable and portable.
1.2 Scope
Implementation standards for XML are essential and should be adopted to achieve the following goals.
Facilitate joint development
Avoid common pitfalls
Maintainability
Reliability
Understandability
Seamless Coexistence of modules
1.3 Intended Audience
Developers and reviewers working with XML.
2 Data Description Technology
2.1 XML DTD
2.1.1 Attributes vs Elements
Guidelines
Put metadata into attributes and put content into elements. (One way to distinguish metadata from content is to ask: "If I remove this data, would my understanding of the content change?" If the answer is "no", it is metadata, i.e. an attribute or descriptive information.)
Attributes are more suitable for enumerated data and can be used for computer-manipulated values.
Entities (nodes) are expressed as elements.
Properties (edges) and relations are expressed as attributes.
Attributes are atomic characteristics of an element/object that have no identity of their own; their meaning may vary with the element they describe.
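A small instance document sketching this split (the element and attribute names are illustrative, not taken from this document):

```xml
<!-- Content goes in elements; descriptive metadata goes in attributes. -->
<Article id="a42" lang="en" status="published">
  <Title>Designing XML Vocabularies</Title>
  <Body>Removing the id, lang, or status attribute would not change a
  reader's understanding of this text, so they belong in attributes.</Body>
</Article>
```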
2.1.2 Miscellaneous
Guidelines
TagNames
Use whole English words for tag names (e.g., <FirstName> rather than an abbreviation such as <FstNm>). Whole words allow for automated translation (XML is Unicode compliant).
Using Container Elements
If two or more different types of elements can appear at the same level in a tree, create 'container' elements
Without this rule, it is common to see documents in which unlike elements are interleaved at the same level; by following the rule, each group of like elements is instead wrapped in its own container element.
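A hedged sketch of the two shapes, with illustrative element names:

```xml
<!-- Without container elements: unlike siblings interleave at one level. -->
<Library>
  <Book>...</Book>
  <Magazine>...</Magazine>
  <Book>...</Book>
</Library>

<!-- With container elements: each element type has its own wrapper. -->
<Library>
  <Books>
    <Book>...</Book>
    <Book>...</Book>
  </Books>
  <Magazines>
    <Magazine>...</Magazine>
  </Magazines>
</Library>
```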
Avoid 'mixed' content.
If mixed content (elements and text) is wanted, the DTD must declare the element with a mixed content model. Such a declaration is ambiguous: it says the element can contain many child elements, and it can also contain multiple interleaved blocks of character data. 'Mixed' content is useful for marking up content and works well with free-form text. However, unless this is exactly what is wanted, the document should not be defined so liberally. Instead, create a new child element declared as #PCDATA to hold the text.
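A minimal DTD sketch of the two styles (element names are assumed):

```xml
<!-- Mixed content: text and child elements may interleave in any order. -->
<!ELEMENT Message (#PCDATA | Emphasis)*>

<!-- Preferred alternative: a dedicated #PCDATA child keeps the model unambiguous. -->
<!ELEMENT Message (Text, Emphasis?)>
<!ELEMENT Text (#PCDATA)>
```

(The two Message declarations are alternatives; a real DTD would contain only one.)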
Plan for DTD maintenance
DTDs can be changed once they have been published, as long as certain guidelines are followed:
1. Elements cannot be removed
2. Attributes cannot be removed
3. Attributes cannot be changed from "implied" to "required"
4. Default values should not be modified (generally)
5. A "value" cannot be removed from an attribute "value list"
6. The required structure of a document cannot be changed. For example, "?" cannot become "+", and a new element cannot be required to appear inside an existing element. Only "?" and "*" can be used when changing the document structure.
7. #PCDATA can't be removed from an element
If these guidelines can't be followed, a new type of document must be created.
Another way to manage change is to plan for it. For example, a top-level element could have a "version" attribute. A document that conforms to the first version of the DTD would have version="1" and a document that conforms to the second version of the DTD would have version="2". The version number would only change if the DTD were modified in such a way that violated the above guidelines. However, without diligent coding, this method will fail. Any code that "forgets" to check the version will break when a new version is introduced to the system.
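As a sketch, such a versioned root element might look like the following (PurchaseOrder is an illustrative name):

```xml
<!-- The instance declares which version of the DTD it conforms to. -->
<PurchaseOrder version="2">
  ...
</PurchaseOrder>
```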
Use entities to encapsulate repetition
As an example, a traveler uses a vehicle to get to a destination, and the same choice of vehicle elements recurs in several content models. A parameter entity can declare that choice once, and each content model can then reference the entity.
The VEHICLE entity is handy because it can be reused. It also makes the DTD easier to maintain. For example, if you add another type of vehicle, only the entity needs to be changed. The rest of the DTD is unaffected.
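A hedged DTD sketch of the pattern (the element names are assumptions, not recovered from the original example):

```xml
<!-- Declare the choice of vehicles once, as a parameter entity. -->
<!ENTITY % VEHICLE "CAR | PLANE | TRAIN">

<!-- Every content model that needs a vehicle references the entity. -->
<!ELEMENT TRAVELER (NAME, (%VEHICLE;), DESTINATION)>
<!ELEMENT TRIP ((%VEHICLE;), ITINERARY)>

<!-- Adding BOAT later means editing only the entity declaration. -->
```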
By the way, XML Schemas address this need via two mechanisms: substitution groups and inheritance.
2.2 XML Schema
2.2.1 Naming Convention
Guidelines
Use upper camel case (UCC), with no spaces or hyphens between words, for all XML element names.
Use lower camel case (LCC), with no spaces or hyphens between words, for all XML attribute names.
Enumeration values should use names only (not numbers) and the names used for enumeration values must conform to the guidelines for element or attribute names.
Example
Examples of UCC camel case names are: PublisherName or TransactionSequenceNumber.
Examples of LCC camel case names are: attributeTypeId or processId.
2.2.2 Declaration
Guidelines
Type Definitions
If a type definition is likely to be reused, define it globally in the namespace (as a simpleType or a complexType) instead of defining the type anonymously inside the element declaration.
Global And Local Element Declarations
Global element declarations should be used for elements that will be reused from the target schema as well as from other schema documents. Local elements are to be favored when element declarations only make sense in the context of the declaring type and are unlikely to be reused.
Global And Local Attribute Declarations
Global attribute declarations should be used for types that will be reused from the target schema as well as from other schema documents. Local attributes should be used when attribute declarations only make sense in the context of the declaring type and are unlikely to be reused. Since attributes are usually tightly coupled to their parent elements, local attribute declarations are typically favored by schema authors.
Nested Elements
Schemas should use nested elements that use the type attribute or an inline type definition (simpleType or complexType), instead of the ref attribute that references a global element declaration.
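A sketch of the preferred style, with illustrative names:

```xml
<!-- Nested local element with an inline type definition ... -->
<xs:element name="Book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Price" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<!-- ... rather than a reference to a global element declaration: -->
<!-- <xs:element ref="Title"/> -->
```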
2.2.3 Namespaces
Guidelines
Default Namespace - targetNamespace or XMLSchema
Make the targetNamespace the default namespace, and explicitly qualify all components from the XMLSchema namespace.
Namespace Design
When managing multiple schemas, three design patterns apply.
Heterogeneous Namespace Design: Give each schema a different targetNamespace
Homogeneous Namespace Design: Give all schemas the same targetNamespace
Chameleon Namespace Design: Give the main schema a targetNamespace and give no targetNamespace to the supporting schemas (the no-namespace supporting schemas will take-on the targetNamespace of the main schema, just like a Chameleon)
When reusing schemas that someone else created, import those schemas, i.e. use the Heterogeneous Namespace Design.
When a schema contains components that have semantics only in the context of an including schema, use the Chameleon Namespace Design. As a rule of thumb, if a schema contains only type definitions (no element declarations), it is probably a good candidate for being a Chameleon schema. In general, though, avoid Chameleon schemas.
When all of your schemas are conceptually related, use Homogeneous Namespace Design.
Use URLs in preference to URNs as namespace names.
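The reuse mechanisms behind these patterns can be sketched as follows (the namespace URL and file names are illustrative):

```xml
<!-- Heterogeneous: the reused schema keeps its own targetNamespace; use import. -->
<xs:import namespace="http://example.com/address" schemaLocation="Address.xsd"/>

<!-- Homogeneous or Chameleon: the supporting schema shares (or takes on) the
     including schema's targetNamespace; use include. -->
<xs:include schemaLocation="CommonTypes.xsd"/>
```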
2.2.4 Versioning Schemas
Guidelines
Change the (internal) version attribute on the schema element, or
create a version attribute on the root element of instance documents.
2.2.5 Miscellaneous
Guidelines
All schemas must use a consistent value for elementFormDefault when including/importing multiple schemas.
Use elementFormDefault="qualified" to expose namespaces. When there are multiple elements with the same name but different semantics, namespace-qualifying them lets them be differentiated.
Use elementFormDefault="unqualified" to hide namespaces from instance documents, when simplicity, readability, and understandability of instance documents are of utmost importance.
Note that when elementFormDefault="qualified", every element in an XPath expression must also be qualified.
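The effect on instance documents can be sketched like this (the po vocabulary is illustrative):

```xml
<!-- elementFormDefault="qualified": every element is namespace-qualified. -->
<po:PurchaseOrder xmlns:po="http://example.com/po">
  <po:Item>...</po:Item>
</po:PurchaseOrder>

<!-- elementFormDefault="unqualified": only the global root element is qualified. -->
<po:PurchaseOrder xmlns:po="http://example.com/po">
  <Item>...</Item>
</po:PurchaseOrder>
```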
Use model groups when the requirement is only to reuse a named group of elements. Use complex types when attributes must be included in the content model or when type derivation is important.
Avoid using substitution groups; prefer an explicit alternative such as a choice model group.
Avoid using notation declarations. Notations in W3C XML Schema are not compatible with notations in DTDs, because a Schema notation is a QName.
Use extension of complex types only if type-aware query languages such as XQuery or XSLT 2.0 are able to process the elements and attributes polymorphically.
Avoid using restriction of complex types.
Use abstract types with care; ensure that a concrete extension or restriction of the type exists.
Avoid using type redefinition. A major problem with redefinition is that, unlike type derivation, it cannot be prevented with the block or final attributes; any schema can thus have its types redefined pervasively, altering their semantics completely.
3 Processor Technologies - Parser Types
3.1 DOM
The Document Object Model (DOM) is a platform- and language-neutral interface that permits programs and scripts to access and update the content, structure, and style of a document. The DOM includes a model for how a standard set of objects representing HTML and Extensible Markup Language (XML) documents are combined, and an interface for accessing and manipulating them. The key advantages of the DOM are the abilities to access everything in the document, to make numerous content updates, and to work with content in separate document fragments. Working together with the Dynamic HTML (DHTML) Object Model available as of Internet Explorer 4.0, the DOM enhances a Web author's ability to build and manage complex documents and data. Many tasks, such as moving an object from one part of the document to another, are highly efficient and easy to perform using DOM members.
3.1.1 Guidelines
Usage of DOM for parsing XML documents is recommended in the following situations:
Performing XSLT transformations
The DOM works better for XSL Transformations (XSLT), where the source XML document is transformed based on the XSLT template applied. To create multiple views of the same data, it must be transformed using different style sheets. For such a transformation to take place, two instances of the DOM are needed: one stores the XML source; the other stores the transformed content.
Complex XPath filtering is required
DOM should be used if one must perform complex XML Path Language (XPath) filtering and retain complex data structures that hold context information. The tree structure of the DOM retains context information automatically. With SAX, one must specifically take care of retaining context information.
Modify and save XML files
The DOM allows creating or modifying a document in memory, as well as reading a document from an XML source file. SAX is designed for reading, not writing, XML documents. The DOM is the better choice for modifying an XML document and saving the changed document to memory.
Random access to data
If random access to information is crucial, it is better to use the DOM to create a tree structure for the data in memory.
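The DOM strengths listed above (random access and in-place modification) can be sketched in Java with the standard JAXP API; the Catalog/Book vocabulary and the helper names are illustrative, not taken from this document:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Main {
    // Parse an XML string into an in-memory DOM tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Random access: any Price element can be read, in any order.
    static String price(Document doc, int index) {
        return doc.getElementsByTagName("Price").item(index).getTextContent();
    }

    // In-place modification, which a forward-only SAX stream cannot do.
    static void setPrice(Document doc, int index, String value) {
        doc.getElementsByTagName("Price").item(index).setTextContent(value);
    }

    public static void main(String[] args) {
        Document doc = parse("<Catalog>"
                + "<Book><Title>XML Basics</Title><Price>10</Price></Book>"
                + "<Book><Title>Schema Design</Title><Price>15</Price></Book>"
                + "</Catalog>");
        System.out.println(price(doc, 1)); // second book, read before the first
        setPrice(doc, 0, "12");            // edit the tree after parsing
        System.out.println(price(doc, 0));
    }
}
```

Because the whole tree stays in memory, elements can be visited in any order and the document can be edited after parsing, at the cost of memory proportional to document size.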
3.2 SAX
SAX offers a lightweight alternative to the DOM. It facilitates searching through a huge XML document to extract small pieces of information, and it allows the parse to be aborted early once the desired piece of information is found. SAX was designed for any task where the overhead of the DOM is too expensive.
However, the performance benefits of SAX come at a price. In many situations such as advanced queries, SAX becomes quite burdensome because of the complexities involved in managing context while processing. When this is the case, most developers either turn back to the DOM or some combination of SAX and DOM together.
3.2.1 Guidelines
Usage of SAX for parsing XML documents is recommended in the following situations:
Large XML documents
The biggest advantage of SAX is that it requires significantly less memory to process an XML document than the DOM. With SAX, memory consumption does not increase with the size of the file. For example, a 100-kilobyte (KB) document can occupy up to 1 megabyte (MB) of memory using the DOM; the same document requires significantly less memory using SAX. If one needs to process large documents, SAX is the better alternative, particularly if changing the contents of the document is not required.
Abort parsing
Because SAX allows aborting of processing at any time, it can be used to create applications that fetch particular data. For example, one can create an application that searches for a part in inventory. When the application finds the part, it returns the part number and availability, and then stops processing.
Retrieving small amounts of information
For many XML-based solutions, it is not necessary to read the entire document to achieve the desired results. For example, if one wants to scan data for relevant news about a particular stock, it's inefficient to read the unnecessary data into memory. With SAX, the application can scan the data for news related only to the given stock symbols, and then create a slimmed-down document structure to pass along to a news service. Scanning only a small percentage of the document results in a significant savings in system resources.
Creating a new document structure
In some cases, one might want to use SAX to create a data structure using only high-level objects, such as stock symbols and news, and then combine the data from this XML file with other news sources. Rather than build a DOM structure with low-level elements, attributes, and processing instructions, one can build the document structure more efficiently and quickly using SAX.
DOM overhead is not affordable
For large documents and for large numbers of documents, SAX provides a more efficient method for parsing XML data. For example, consider a remote procedure call (RPC) that returns 10 MB of data to a middle-tier server to be passed to a client. Using SAX, the data can be processed using a small input buffer, a small work buffer, and a small output buffer. Using the DOM, the data structure is constructed in memory, requiring a 10 MB work buffer and at least a 10 MB output buffer for the formatted XML data.
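The early-abort pattern described above can be sketched with the standard SAX API; the inventory vocabulary and the findPart helper are illustrative. Throwing an exception from the handler is the conventional way to stop a SAX parse:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class Main {
    // Scan an inventory stream and stop as soon as the wanted part is seen.
    static String findPart(String xml, String wantedId) {
        final String[] found = {null};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) throws SAXException {
                if ("Part".equals(qName) && wantedId.equals(atts.getValue("id"))) {
                    found[0] = atts.getValue("name");
                    // Throwing aborts the parse; the rest of the stream is never read.
                    throw new SAXException("found; stop parsing");
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    handler);
        } catch (SAXException stopped) {
            // Expected: the parse was aborted early once the part was found.
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return found[0];
    }

    public static void main(String[] args) {
        String xml = "<Inventory>"
                + "<Part id=\"p1\" name=\"bolt\"/>"
                + "<Part id=\"p2\" name=\"washer\"/>"
                + "<Part id=\"p3\" name=\"nut\"/>"
                + "</Inventory>";
        System.out.println(findPart(xml, "p2"));
    }
}
```

Only a small, constant amount of memory is used regardless of document size, and any input after the matching Part element is never processed.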
4 Processor Technology - Framework
4.1 .NET
In .NET, XmlNode-based trees are built via underlying XmlReader streams. These trees, however, remain in memory until the client is finished with them. DOM trees are completely
dynamic and can be traversed in a variety of ways. Most developers working with XML are familiar with the DOM API since it has been around the longest. The DOM is considered the "easiest to use" by many developers but that simplicity comes with a cost.
4.1.1 Guidelines
XmlReader Guidelines
If the data is in XML 1.0 format, there is no choice but to use XmlTextReader to process the byte stream. XmlReader is faster than the DOM. Related loading methods include ReadXmlSchema(), Load(), and LoadXml().
Writing XML Documents
For generating XML documents there are two API choices: XmlTextWriter and the DOM.
Unlike XmlTextReader, XmlTextWriter is actually more intuitive for many developers than the DOM, especially for those used to working with SAX. Using XmlTextWriter to generate a document feels a lot like the SAX ContentHandler interface since a sequence of method calls represents a logical XML document. As with SAX, the key benefit to XmlTextWriter is that the resulting document doesn't need to be buffered in memory. Instead it can be written directly into the output stream as the document is generated. This makes XmlTextWriter much more efficient than the DOM and is quite easy to use.
Choices, pros, and cons:

XmlTextWriter
  Pros: fastest; most efficient (output is not buffered); familiar to SAX developers
  Cons: the document cannot be manipulated (forward-only stream)

DOM
  Pros: more flexibility (in-memory); familiar to DOM developers
  Cons: slower and less efficient, since the document is buffered in memory
Load Data into Memory
There are only two in-memory structures available in .NET: XmlDocument (the standard DOM implementation) and XPathDocument (a tree optimized for XPath/XSLT). Deciding between them mostly depends on performance, programming ease, and future extensibility.
In terms of performance, XPathDocument offers XPath and XSLT optimizations. XPathDocument implements the XPathNavigator interface over its more efficient tree structure. XmlDocument, however, is typically easier to use since it implements both the standard DOM interfaces as well as XPathNavigator.
Also, writing code against XPathNavigator is generally a good way to take advantage of future extensions. Since XPathNavigator is much easier to implement than the DOM API, new and improved custom implementations are more likely down the road.
XPathDocument load performance does not scale well for very large documents.
Use XML readers/writers, not an XmlDocument, for simple manipulations of very large documents.
Unnecessary use of CDATA sections increases the memory footprint.
Use an XmlNameTable when doing extensive name comparisons.
Use XmlDataDocument for XML/DataSet integration.
If the source of the XML is known and trusted, do not use a validating reader unless normal parsing fails.
Choices, pros, and cons:

XmlTextReader
  Pros: fastest; most memory-efficient; extensible
  Cons: forward-only; read-only; requires manual validation

XmlValidatingReader
  Pros: automatic validation; run-time type information; relatively fast and efficient compared to the DOM
  Cons: 2 to 3x slower than XmlTextReader; forward-only; read-only

XmlDocument (DOM)
  Pros: full traversal; read/write; XPath expressions
  Cons: 2 to 3x slower than XmlTextReader/XmlValidatingReader; more overhead than XmlTextReader/XmlValidatingReader

XPathNavigator
  Pros: full traversal; XPath expressions; XSLT integration; extensible
  Cons: read-only; not as familiar as the DOM

XPathDocument
  Pros: faster than XmlDocument; optimized for XPath/XSLT
  Cons: slower than XmlTextReader
4.2 Java
4.2.1 Parsers
Guidelines
The following are some of the parsers available for parsing of XML files:
1. Crimson
2. Xerces
Of the two, Crimson performs better. Crimson is a straightforward implementation of an XML parser with a small footprint: around 200 KB (jar file size). Xerces is more sophisticated and includes many additional features such as XML Schema support; it also comes with support for WML and HTML DOMs, which increases the size of the jar file to 1.5 MB. The Apache organization is in the process of refactoring Xerces and addressing performance in Xerces2.
4.2.2 API
Guidelines
Among the several existing Java APIs for processing XML files, JDOM is often the best choice. The API supports both the DOM and SAX processing models. JDOM has been optimized for Java and, through its use of the Java Collections API, is straightforward for Java developers. JDOM documents can be built directly from, and converted to, SAX events and DOM trees, allowing JDOM to be seamlessly integrated into XML processing pipelines, in particular as the source or result of XSLT transformations.
Use the interface, not the implementation
In interface-based models like the W3C DOM, each interface has an implementation in the vendor's release. For example, the DOM interface org.w3c.dom.Document is implemented by the Xerces class org.apache.xerces.dom.DocumentImpl. However, all too often, one sees code like:
DocumentImpl doc = new org.apache.xerces.dom.Doc