The Future of XML at the IRS and Building Partnerships for Collaboration … · 2009. 9. 25. · 1...

1 The Future of XML at the IRS and Building Partnerships for Collaboration Dynamic Data @ The Internal Revenue Service John A. Triplett [email protected] FTA Technology Conference August 8 – 10, 2005 Sol Safran [email protected] Agenda IRS Enterprise Architecture IRS XML Transition Strategy IRS XML Framework XML Registry XML Components Data Exchanges


  • 1

    The Future of XML at the IRS and Building Partnerships for Collaboration

    Dynamic Data @ The Internal Revenue Service

    John A. Triplett [email protected]

    FTA Technology Conference, August 8 – 10, 2005

    Sol Safran [email protected]

    Agenda

    • IRS Enterprise Architecture
    • IRS XML Transition Strategy
    • IRS XML Framework
    • XML Registry
    • XML Components
    • Data Exchanges

  • 2

    Much has been accomplished...

    • The IRS has deployed several high impact applications that make life easier for American Taxpayers
        – “Where’s My Refund” Internet Refund, Fact of Filing (IRFOF)
        – CADE version 1
            • 1040EZ filers moved to new system
        – e-Services
            • Self service applications for practitioners
        – Modernized e-File
            • Electronic filing of business returns
            • Web Services interface being introduced
    • Use of the Internet to enable a new value proposition for taxpayers and software developers and meet rising expectations

    …Much Remains To Be Done

    • We need to…
        – Shorten our development cycles
        – Reduce our cycle times
        – Simplify our infrastructure and the relationships between systems
        – Modernize our data strategies
    • So we can…
        – Enhance self service opportunities
        – Enable taxpayer account access
        – Publish XML based Application Programming Interfaces
        – Retire the Master File

  • 3

    It’s all About the Data

    • The lack of ability to move data hobbles an organization’s flexibility
        – It’s not lack of data, but lack of ability to get the data where it’s needed
        – Data needs to be moved and combined to turn it into information and get it to authorized consumers
    • The ultimate goal:
        – The “zero latency enterprise”
    • Latency is our enemy
        – Move from weekly cycles to daily, then instantaneous settlement

    The IRS Service Based Architecture (IRS/SBA)

    • The IRS/SBA is the IRS’ implementation of a Service Oriented Architecture
    • Loosely coupled
    • Places XML translators in front of existing IRS systems
        – Provides quicker time to market as software resources are reused
        – Shortens maintenance cycles by using XML as the wire and interface formats
        – XML will protect software developer investment
    • Event driven
        – Publish and subscribe
    • Provides data convergence
        – Data from multiple sources is combined on the fly to turn it into information
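The idea of placing an XML translator in front of an existing system can be illustrated with a minimal sketch. It assumes a fixed-width legacy record; the layout, field names, and the `TaxpayerRecord` element are hypothetical and are not taken from any IRS format.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed-width layout: (element name, start offset, end offset).
# These names and offsets are illustrative assumptions, not an IRS layout.
LEGACY_LAYOUT = [
    ("TaxpayerId", 0, 9),
    ("Name", 9, 29),
    ("City", 29, 44),
    ("State", 44, 46),
]


def legacy_to_xml(record: str) -> str:
    """Translate one fixed-width legacy record into an XML document."""
    root = ET.Element("TaxpayerRecord")
    for field, start, end in LEGACY_LAYOUT:
        # Slice the column range and drop the padding used by the flat format.
        ET.SubElement(root, field).text = record[start:end].strip()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    legacy = (
        "123456789"
        + "Jane Q Public".ljust(20)
        + "Springfield".ljust(15)
        + "VA"
    )
    print(legacy_to_xml(legacy))
```

In this sketch the legacy system is untouched; only the translator knows the column offsets, so consumers depend on element names rather than byte positions.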

  • 4

    IRS/SBA

    [Figure: IRS/SBA architecture diagram. Labels: Web Services Interface; Integration Broker; Orchestration; Adapters; Applications; XML; Native formats]

    IRS/SBA

    [Figure: IRS/SBA repository diagram. Labels: Web Services Interface; Integration Broker; IRS ELDM; Schema; Doc Types; Service Input/Output DocTypes (IRS XML XSD); Flow Services / Business Processes; Repository; Simple & Complex Elements (common components). Ex: TaxpayerAddressCreate, TaxpayerAddressUpdate, etc.]
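A publish-and-subscribe broker of the kind the diagram labels suggest can be sketched in a few lines. This is a toy in-memory illustration, not the IRS Integration Broker; the `IntegrationBroker` class and its methods are hypothetical, and treating `TaxpayerAddressUpdate` as the root element of a document type is an assumption.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List
import xml.etree.ElementTree as ET

Handler = Callable[[ET.Element], None]


class IntegrationBroker:
    """Toy in-memory broker that routes XML documents by root element name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, doc_type: str, handler: Handler) -> None:
        """Register a handler for one document type."""
        self._subscribers[doc_type].append(handler)

    def publish(self, xml_text: str) -> int:
        """Parse the document, deliver it, and return the number of deliveries."""
        doc = ET.fromstring(xml_text)
        handlers = self._subscribers.get(doc.tag, [])
        for handler in handlers:
            handler(doc)
        return len(handlers)


if __name__ == "__main__":
    broker = IntegrationBroker()
    received = []
    broker.subscribe(
        "TaxpayerAddressUpdate",
        lambda doc: received.append(doc.findtext("City")),
    )
    broker.publish(
        "<TaxpayerAddressUpdate><City>Springfield</City></TaxpayerAddressUpdate>"
    )
    print(received)
```

The publisher never names its consumers, which is the loose coupling the architecture aims for: a new subscriber can be added without changing the publishing application.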

  • 5

    Benefits of the Service Based Architecture

    • Services allow developers to focus on business logic and user friendly displays instead of where to get the data from and how to get it
    • Hides the complexity of the infrastructure
        – One entry point into the IRS
        – External interfaces governed by standards
        – Each application or service has exactly one interface to everything else
    • Serves up authoritative information
        – Developers deal with “business information” instead of “data”
    • Reliable service delivery
        – Stable interfaces through XML
        – Faster time to market – less code to write
        – Each service can be upgraded and scaled independently (loose coupling)

    Transition Strategy to Modernized Data Stores

    [Figure: transition chart. Axis labels: IRS Internal Data Architecture; Technology and Standards]

    • Legacy Data Stores
        – Majority of IRS data centric systems and infrastructure components have not transitioned to the Modernized Architecture
        – Initial Modernized Data replicated in Legacy Format (e.g., CADE => IMF)
    • Modernized Data Stores
        – Majority of IRS systems and infrastructure components have transitioned to the Modernized Architecture
        – Remaining Legacy Data replicated in Modernized Format

  • 6

    IRS External Data Exchange Drivers

    [Figure: chart. Axis label: Technology and Standards]

    • Modernized Data Exchange
        – XML Standards and Technology stable (though still evolving), and
        – Tax Administration sector adoption of Standards and Common Vocabularies mature (and growing)
    • Legacy Data Exchanges
        – XML Standards and Technology maturing, and/or
        – Tax Administration sector adoption of Standards and Common Vocabularies not yet mature

    Transition Strategy to XML Data Exchange

    [Figure: transition chart. Axis labels: IRS External Data Exchange; IRS Internal Data Architecture. Chart labels: Current State; Target State]

    • Use Adaptors to translate Legacy formats to Modernized format
    • Use Adaptors to translate from Modernized format to Legacy

    IRS Transition Strategy

    [Figure: transition strategy chart. Axis labels: Legacy Data, Modernized Data; Legacy Format, Modernized Format. Chart labels: Optimal Path; Outpace Standards; Trail Standards; Where we are Today]

  • 7

    What This Means

    • The vertical line, IRS modernization progression, is a driver for how the IRS manages data internally
    • The horizontal line, maturity of XML technology and standards in the industry, is a driver for how the IRS will exchange data with our external partners
    • This is a principle; some data exchanges may transition prior to others. Prototypes/pilots will be strategies for discussion.

    Overview – IRS XML Framework

    • Objective:
        – Identify the target state of XML technology implementation for the IRS
        – Assess and adopt appropriate standards, practices, and tools for the IRS and its trading partners
        – Strategically transition current systems and legacy formats to modernized XML data exchange formats

    Adapted from a Department of Education, Federal Student Aid briefing, “XML: A Beginners Guide”, presented at the 2003 Electronic Access Conference in San Diego, CA.

  • 8

    Current IRS XML Standards and Governance Initiatives

    • IRS XML Community of Practice (Chartered)
        – Organize internal activity
        – Collaborate on adoption of industry standards
        – Form and vet IRS responses to external communities of practice
    • Naming and Design Rules (NDR)
        – Draft UBL based NDR circulating for comment
    • XML Registry, Repository and Registration process
        – Concept of Operations and alternatives analysis
    • Common component schema
        – Building blocks for composing IRS schema
        – Integral to IRS/SBA
        – Tax Filing vocabulary derived from existing MeF schema

    XML Registry Concept

  • 9

    Common Component Building Blocks

    The Common Component Approach is in a design phase. This is one concept, borrowed from Federal Student Aid.
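As a rough illustration of the building-block idea, the sketch below reuses one address component inside two different document types. The element names (`Address`, `Street`, `PostalCode`, `TaxpayerId`) and both helper functions are hypothetical; the actual common component schema was still in design at the time of this presentation.

```python
import xml.etree.ElementTree as ET


def address_component(street: str, city: str, state: str, postal_code: str) -> ET.Element:
    """Build a reusable Address element (hypothetical common component)."""
    address = ET.Element("Address")
    ET.SubElement(address, "Street").text = street
    ET.SubElement(address, "City").text = city
    ET.SubElement(address, "State").text = state
    ET.SubElement(address, "PostalCode").text = postal_code
    return address


def build_document(doc_type: str, taxpayer_id: str, address: ET.Element) -> str:
    """Compose a document of the given type from shared components."""
    root = ET.Element(doc_type)
    ET.SubElement(root, "TaxpayerId").text = taxpayer_id
    root.append(address)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for doc_type in ("TaxpayerAddressCreate", "TaxpayerAddressUpdate"):
        # A fresh Address instance per document; the structure is shared.
        addr = address_component("1 Main St", "Springfield", "VA", "22150")
        print(build_document(doc_type, "123456789", addr))
```

Both documents carry an identically structured address, so a consumer that understands the component once can read it wherever it appears.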

    IRS NDR, Common Components, and MeF

    • IRS NDR is in a comment and review period
    • Federal NDR is in an earlier comment phase
    • UBL 2.0 is close to draft release
    • IRS NDR will align with Federal NDR and UBL 2.0
    • Approach to MeF
        – Capture vocabulary into common component schema from existing MeF schema
        – Prototype NDR conformant schema for MeF
        – Assess impact and feedback to Federal WG and UBL TC
        – Publish common component schema for Tax Filing
        – Target MeF Release 5.0 for NDR conformance

  • 10

    Some xmlCoP & Working Group Issues

    • XML Performance and Size
    • Standards Interoperability
    • Naming & Design Rules
        – Schema Versioning and Governance
    • XML Registry
    • Transition
    • XBRL
    • Many Others!

    IRS Partners for XML Policy and Standards

    • Federal CIO’s Council Working Groups
        – XML Community of Practice (xmlCoP)
        – Semantic Interoperability CoP
    • Federation of Tax Administrators (FTA)
        – TAG
        – TIGERS
    • OASIS
        – Tax XML TC
        – UBL TC
            • Universal Business Language (UBL 1.0)
        – ebXML Registry TC

  • 11

    Data Exchange Scope

    Current IRS/State Exchanges

    • No Format Changes Pending!
        – There are no immediate plans to change the format of existing files.
    • The future state and how we get there will be designed with
        – close communication and
        – collaboration between the IRS and our external partners.
    • New exchanges will be standards driven.

  • 12

    Beyond MeF

    • The IRS has worked closely with the TIGERS on MeF.
    • MeF is just one of many taxpayer data exchanges. As a new system, it was one of the first to implement XML on a large scale.
    • As legacy systems transition and new data exchanges are created, the IRS will continue to partner with the FTA and TIGERS to plan and manage this change. For example: E-Services, GLDEP & EOAD.
    • XML vocabularies, policy and standards will be harmonized with Federal, state and local, industry and international communities.

    Contact

    John Triplett, Enterprise Data Management Office (EDMO), [email protected]

    Doug Peterson, Electronic Tax Administration (ETA), [email protected]

    Sol Safran, Prime Enterprise Data Management (EDM), [email protected]

  • 13

    Back-Up Slides

    • Some slides with examples of issues under discussion

    IRS Governmental Liaison Data Exchange Program (GLDEP)

    • Data Extracts from various IRS master files and databases
    • Shared with state and local tax administrations
    • Currently 15 GLDEP data extracts
    • Agencies enroll annually and select which extracts they wish to receive

  • 14

    Current Extracts

    • 1099 Misc
    • BMF/BRTF
    • Corporate Affiliations
    • CP 2000
    • Exam/Appeals
    • FEIN (Federal Employer ID Number)
    • IMF/IRTF
    • IRMF
    • Levy
    • Military Combat Zone
    • Non-Itemizer
    • PTIN (Preparer Tax ID Number)
    • TAR (Taxpayer Address Request)

    Examination Operational Automation Database

    • EOAD is a relational database using different “files” within one database for different purposes
    • Data is separated first by type of return (Form 1040, 1120, 1065, and 1120S) and then by type/purpose of records (Entity, Issue, Other, and Partners)

  • 15

    Compressibility of XML

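    [Figure: a text file and its XML translation, each compressed with WinZip]

    Format    Original size       After WinZip
    txt       674,062 bytes       148,294 bytes
    XML       11,421,822 bytes    94,369 bytes

    Ref: Dan Winkowski, MITRE, XML Intro for Managers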

    The compressed version of the XML document is smaller than the compressed version of the original document!
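A comparison of this kind can be reproduced on synthetic data with a short sketch. It uses `zlib` as a stand-in for WinZip, and the record layout, element names, and helper functions are all illustrative assumptions; the sizes it prints depend on the data and will not match the figures on the slide.

```python
import zlib
import xml.etree.ElementTree as ET


def make_rows(count: int):
    """Generate deterministic sample rows: (id, name, amount)."""
    return [(i, f"Taxpayer {i}", (i * 7919) % 100000) for i in range(count)]


def to_flat(rows) -> bytes:
    """Render rows as fixed-width text, one 40-byte record per line."""
    lines = [f"{i:09d}{name:<20}{amount:010d}\n" for i, name, amount in rows]
    return "".join(lines).encode("ascii")


def to_xml(rows) -> bytes:
    """Render the same rows as an XML document."""
    root = ET.Element("Records")
    for i, name, amount in rows:
        record = ET.SubElement(root, "Record")
        ET.SubElement(record, "Id").text = str(i)
        ET.SubElement(record, "Name").text = name
        ET.SubElement(record, "Amount").text = str(amount)
    return ET.tostring(root, encoding="utf-8")


def sizes(raw: bytes):
    """Return (raw size, compressed size) using zlib at maximum level."""
    return len(raw), len(zlib.compress(raw, 9))


if __name__ == "__main__":
    rows = make_rows(2000)
    for label, raw in (("txt", to_flat(rows)), ("XML", to_xml(rows))):
        raw_size, packed_size = sizes(raw)
        print(f"{label}: {raw_size:,} bytes raw, {packed_size:,} bytes compressed")
```

Repeated tag names are exactly the kind of redundancy a dictionary-based compressor removes, which is why the size penalty of XML markup tends to shrink sharply after compression. Whether the compressed XML ends up smaller than the compressed flat file depends on the data.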
