Open XML Formats Jessica Gruber Consultant Microsoft Corporation.


Open XML Formats

Jessica Gruber, Consultant, Microsoft Corporation


Agenda

Introduction to the new formats (demos)

Developer view of the Open XML Formats

Tools for working with Open XML files (demos)

Server development scenarios (demos)

Question and Answer


Session Goals

Understand the benefits of Office Open XML solutions for developers

Introduction to Microsoft tools for Open XML development

Learn basic developer patterns for:
  Working with packages and parts (ZIP)
  Navigating through relationships (XML)
  Manipulating data within parts (XML)

Kick off the conversation on OpenXMLDeveloper.org


Open XML Formats

New XML file formats for Word, Excel and PowerPoint
  New formats will be the default file formats, with new file type extensions (.docx, .pptx, .xlsx)
  Fully compatible with existing formats

Open, transparent format improves interoperability
  XML: a transparent XML format enables new integration scenarios for documents and line-of-business (LOB) systems
  ZIP container: allows standard compression on all files without user effort
  Licensing: no license is needed; a covenant states that Microsoft will not enforce its IP against anyone implementing the format (100% royalty free)

Standardization
  Ecma International created TC45 to fully document the Open XML formats
  Members include: Apple, Barclays Capital, BP, the British Library, Essilor, Intel Corporation, NextPage Inc., Statoil ASA and Toshiba
  Current spec is already over 2000 pages


Open XML Formats (cont’d)

Added benefits: compact and robust
  ZIP container allows standard compression on all files without user effort (dramatic file size improvements)
  Significantly more robust files help minimize data loss

Backward compatible: Office 2000, Office XP and Office 2003 will all support the new formats
  Patches for compatibility available by launch
  Open, edit and save new formats

Legacy support: current Office 97-2003 binary file formats supported
  Support for the XML formats from Office 2003 and Office XP continues

Developers: endless potential
  Build solutions to read, write and modify Office files (without the need to run Office APIs)


Evolution of File Formats

Office 97: existing binary file formats, designed in 1994 and launched in Office 97

Office 2000: early innovation
  XML document properties

Office XP: first XML format
  Spreadsheet XML

Office 2003: breakthrough XML support
  WordML, SpreadsheetML
  Custom-defined schema

Office 2007 (“Wave 12”): new XML formats
  XML file format default
  XML PowerPoint format


Components of the new formats

Package: the ZIP container

Part: the “files” inside the ZIP

Content Types: each part has a content type that is enforced on open

Relationships: any part that references another part must do so via a relationship

[Diagram: an example workbook package. Parts shown: Document Properties, Application Properties, Custom Doc. Props., Workbook, Sheet 1, Sheet 2, Sheet 3, Styles, Chart, Strings; the parts are connected by relationships.]
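The four components above can be seen in a small sketch. It uses Python's standard zipfile module as a stand-in for any ZIP implementation and builds a minimal package in memory. The helper names (build_package, content_type_of) are made up for this illustration, and the namespace and content-type strings are those of the published ECMA-376 standard, which may differ from the beta-era files shown in the talk.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import Optional

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Content types: declared once per package in [Content_Types].xml.
CONTENT_TYPES = f"""<Types xmlns="{CT_NS}">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

# Relationships: the package-level relationship part points at the main document.
RELS = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

# Part: a tiny WordprocessingML body.
DOCUMENT = """<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body>
</w:document>"""


def build_package() -> bytes:
    """Package: a ZIP container holding the parts above."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        package.writestr("word/document.xml", DOCUMENT)
    return buffer.getvalue()


def content_type_of(package: zipfile.ZipFile, part_name: str) -> Optional[str]:
    """Look up a part's content type: Override entries first, then Default by extension."""
    types = ET.fromstring(package.read("[Content_Types].xml"))
    for override in types.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == part_name:
            return override.get("ContentType")
    extension = part_name.rsplit(".", 1)[-1].lower()
    for default in types.findall(f"{{{CT_NS}}}Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None


package = zipfile.ZipFile(io.BytesIO(build_package()))
for name in package.namelist():
    print(name, "->", content_type_of(package, "/" + name))
```

The lookup ignores details such as case-insensitive part-name matching, so it is a reading aid rather than a conforming implementation.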


Introduction to the new File Format


Document Analyzer

Office Open XML Formats


Open XML Formats Architecture

User view: single Office “file”
  Questionaire.docx

Developer view: modular file
  File container
    Document properties
    Comments
    Charts
    Embedded code / macros
    Images, video, sound
    Custom-defined XML
    WordML / SpreadsheetML, etc.

Document parts
  Most parts are XML
  Each XML part is a discrete, compressed component
  Individual parts can be added, extracted and modified using any ZIP implementation
  Corruption or absence of any part would not prohibit the file from being opened
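As a rough illustration of modifying an individual part with an ordinary ZIP library, the sketch below copies a package and swaps the bytes of one part. It is an assumption-laden example: replace_part and make_sample are hypothetical helpers, and the sample parts contain placeholder XML rather than valid WordprocessingML. Python's zipfile cannot rewrite an entry in place, so the archive is rebuilt entry by entry.

```python
import io
import zipfile


def replace_part(package_bytes: bytes, part_name: str, new_data: bytes) -> bytes:
    """Return a copy of the package in which one part has new contents."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as source, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            # Copy every part unchanged except the one being replaced.
            if item.filename == part_name:
                data = new_data
            else:
                data = source.read(item.filename)
            target.writestr(item.filename, data)
    return output.getvalue()


def make_sample() -> bytes:
    """Build a two-part stand-in package (placeholder XML, not a real document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<doc>old</doc>")
        package.writestr("docProps/app.xml", "<props/>")
    return buffer.getvalue()


updated = replace_part(make_sample(), "word/document.xml", b"<doc>new</doc>")
with zipfile.ZipFile(io.BytesIO(updated)) as package:
    document = package.read("word/document.xml")
    untouched = package.read("docProps/app.xml")
print(document, untouched)
```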


Benefits of Open XML Solutions

No longer need to automate client applications to open and modify files
  Unsupported solution on server
  Clients not designed for this scenario

Reliability
  Access directly the parts you need
  Avoid corruption

Transparency
  Direct access to your data!


Tools for Accessing Data in Office Open XML Files

XML editing
  Notepad?
  System.Xml makes this easier

ZIP manipulation
  Compressed Folders in Windows?
  Third-party ZIP libraries
  Microsoft’s Packaging APIs

Office Open XML Resource Kit
  Code snippets (Beta 2): C# and VB.NET
  Validation library: parses a file and reports on schema and relationship errors and warnings
  Schemas: already in the Ecma document
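To give a feel for what a relationship check might involve, here is a minimal sketch that reports internal relationships whose target part is missing from the package. It is not the Resource Kit's validation library; missing_targets, source_folder and make_sample are hypothetical names, the relationship types in the sample are placeholder URNs, and the check ignores schema validation, case-insensitive part names and percent-encoding.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def source_folder(rels_name: str) -> str:
    """Folder of the part that owns a .rels file ('' for the package root)."""
    return posixpath.dirname(posixpath.dirname(rels_name))


def missing_targets(package: zipfile.ZipFile) -> list:
    """List (rels part, relationship Id, resolved target) for targets not in the ZIP."""
    names = set(package.namelist())
    problems = []
    for rels_name in sorted(names):
        if not rels_name.endswith(".rels"):
            continue
        base = source_folder(rels_name)
        root = ET.fromstring(package.read(rels_name))
        for rel in root.findall(f"{{{REL_NS}}}Relationship"):
            if rel.get("TargetMode") == "External":
                continue  # external targets such as hyperlinks are not parts
            target = rel.get("Target", "")
            if target.startswith("/"):
                resolved = target.lstrip("/")
            else:
                resolved = posixpath.normpath(posixpath.join(base, target))
            if resolved not in names:
                problems.append((rels_name, rel.get("Id"), resolved))
    return problems


def make_sample() -> zipfile.ZipFile:
    """Sample package whose second relationship points at a part that does not exist."""
    rels = (
        f'<Relationships xmlns="{REL_NS}">'
        '<Relationship Id="rId1" Type="urn:example:main" Target="word/document.xml"/>'
        '<Relationship Id="rId2" Type="urn:example:props" Target="docProps/core.xml"/>'
        '<Relationship Id="rId3" Type="urn:example:link" '
        'Target="http://openxmldeveloper.org" TargetMode="External"/>'
        "</Relationships>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("_rels/.rels", rels)
        package.writestr("word/document.xml", "<doc/>")
    return zipfile.ZipFile(io.BytesIO(buffer.getvalue()))


print(missing_targets(make_sample()))
```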


System.IO.Packaging

Part of Windows Presentation Foundation (WinFX)

Ships with Vista; betas available now

Requires .NET 2.0

Enables package manipulation for:
  Office Open XML File Formats
  XML Paper Specification files
  Any Open Packaging Conventions files


System.IO.Packaging API functionality

Classes: Package, PackagePart, PackagePartCollection, PackageRelationship, PackageRelationshipCollection, PackUriHelper

Create/open packages

Create and delete parts and relationships

Read and write part streams

Iterate through collections of parts and relationships


Reading Data From Files

Microsoft Office PowerPoint 2007


System.IO.Packaging.Package

Package class provides methods to create, enumerate and delete the following entities:
  Package
  Package relationships
  PackageProperties
  Parts

[Diagram: a package whose package relationships point to the common package parts (Core Properties, Thumbnail, Digital Signatures) and to the officeDocument part; each part carries its own rels, which link to further format-specific XML parts.]


System.IO.Packaging.Relationship

Relationships tie the parts together

Required to find parts (part names are not guaranteed)

Iterate through RelationshipCollection by Type or ID

Relationship properties: Id, Package, RelationshipType, SourceUri, TargetMode, TargetUri

[Diagram: the same package layout, highlighting the relationships that lead from the package to the officeDocument part and from there to its XML parts.]
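Because part names are not guaranteed, code should locate parts through relationships rather than hard-coded paths. The sketch below shows the idea with Python's standard library instead of System.IO.Packaging: it reads the package-level relationship part and looks up targets by type or by Id. The function names and the sample package (whose main document deliberately sits at an unusual path) are invented for this example; the officeDocument relationship type is the one from the published ECMA-376 standard.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import Optional

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_DOCUMENT = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
)


def package_relationships(package: zipfile.ZipFile) -> list:
    """Return the Relationship elements of the package-level relationship part."""
    root = ET.fromstring(package.read("_rels/.rels"))
    return root.findall(f"{{{REL_NS}}}Relationship")


def targets_by_type(package: zipfile.ZipFile, rel_type: str) -> list:
    """ZIP entry names of all package-level targets with the given relationship type."""
    return [
        rel.get("Target", "").lstrip("/")
        for rel in package_relationships(package)
        if rel.get("Type") == rel_type
    ]


def target_by_id(package: zipfile.ZipFile, rel_id: str) -> Optional[str]:
    """ZIP entry name of the target with the given relationship Id, if any."""
    for rel in package_relationships(package):
        if rel.get("Id") == rel_id:
            return rel.get("Target", "").lstrip("/")
    return None


def make_sample() -> zipfile.ZipFile:
    """Sample package whose main document is NOT at the usual word/document.xml."""
    rels = (
        f'<Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{OFFICE_DOCUMENT}" Target="content/main.xml"/>'
        "</Relationships>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("_rels/.rels", rels)
        package.writestr("content/main.xml", "<doc/>")
    return zipfile.ZipFile(io.BytesIO(buffer.getvalue()))


sample = make_sample()
print(targets_by_type(sample, OFFICE_DOCUMENT))
print(target_by_id(sample, "rId1"))
```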


System.IO.Packaging.PackagePart

Parts are the objects of data within the Package

PackagePart provides support to create, enumerate and delete part relationships

Get Part data as Stream

PackagePart properties:
  CompressionOption
  ContentType
  Package
  Uri

[Diagram: the same package layout, with two XML parts expanded to show their content.]

XML part:

<w:body>
  <w:p w:rsidR="001B7EF4" w:rsidRDefault="001B7EF4">
    <w:r>
      <w:t>The Quick Brown Fox jumped over the river.</w:t>
    </w:r>
  </w:p>
  …

XML part:

<w:body>
  <w:p w:rsidR="001B7EF4" w:rsidRDefault="001B7EF4">
    <w:r>
      <w:t>The Cow jumped over the moon.</w:t>
    </w:r>
  </w:p>
  …

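Reading a part as a stream and pulling text out of its w:t elements could look roughly like the following. This is a sketch using Python's zipfile and ElementTree rather than PackagePart; paragraph_texts and make_sample are hypothetical names, the part name is passed in by the caller, and the namespace is the one from the published ECMA-376 standard. The sample places the two sentences from the slide in one document as two paragraphs.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts(package: zipfile.ZipFile, part_name: str) -> list:
    """Open a part as a stream and return the text of each w:p paragraph."""
    with package.open(part_name) as stream:
        root = ET.parse(stream).getroot()
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over one or more w:t elements inside runs.
        pieces = [t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs


def make_sample() -> zipfile.ZipFile:
    """Sample package with a two-paragraph WordprocessingML part."""
    document = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>The Quick Brown Fox jumped over the river.</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>The Cow jumped over the moon.</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("word/document.xml", document)
    return zipfile.ZipFile(io.BytesIO(buffer.getvalue()))


texts = paragraph_texts(make_sample(), "word/document.xml")
print(texts)
```

Nested paragraphs (for example inside text boxes) would need extra handling that this sketch leaves out.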

System.IO.Packaging.PackUriHelper

Helper class to aid working with URIs

URIs are required to get parts

Create or get URIs for:
  Packages
  Parts
  Relationship parts

Resolve relative URIs for parts from source to target part
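The two URI tasks mentioned here, resolving a relative target against its source part and finding the relationship part that belongs to a part, can be approximated with plain path arithmetic. The sketch below is a Python illustration only; resolve_part_uri and relationship_part_uri are hypothetical functions written for this example, and they skip the validation and escaping a real helper would perform.

```python
import posixpath


def resolve_part_uri(source_part: str, target: str) -> str:
    """Resolve a relationship target against the part that owns the relationship."""
    if target.startswith("/"):
        return posixpath.normpath(target)  # already absolute within the package
    folder = posixpath.dirname(source_part)
    return posixpath.normpath(posixpath.join(folder, target))


def relationship_part_uri(part: str) -> str:
    """Name of the relationship part for a part ('/' stands for the package itself)."""
    folder, name = posixpath.split(part)
    return posixpath.join(folder, "_rels", name + ".rels")


print(resolve_part_uri("/word/document.xml", "media/image1.png"))
print(resolve_part_uri("/word/document.xml", "../customXml/item1.xml"))
print(relationship_part_uri("/word/document.xml"))
print(relationship_part_uri("/"))
```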


The Role of XML with Documents

Scenario: Document Assembly
  Server-based or user-assisted construction of documents from archived content or database content
  Example: Create sales reports from financial and forecast data stored in a CRM system

Scenario: Content Reuse
  Much easier to move content between documents, including different document types
  Example: Apply content stored in Word documents to Web pages quickly and efficiently

Scenario: Content Tagging
  Add domain-specific metadata to document content to enable custom solutions
  Example: Tag presentations using a specific taxonomy to improve knowledge management efficiency

Scenario: Document Interrogation
  Query document repositories based on custom data, content types or document metadata
  Example: Search for all documents containing a specific company name or sales contact

Scenario: Document Sanitization
  Remove unwanted content like comments or embedded code from your document when appropriate
  Example: Remove all tracked changes and comments from a Word document before it is published


Document Interrogation Scenarios

When you need metadata about Office files on a server

Building reports from data in files

Workflow and Content Management scenarios

Validate compliance
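A server-side metadata query might, for instance, read the core properties part without starting any Office application. The following is a minimal sketch under stated assumptions: core_properties and make_sample are hypothetical helpers, the sample title and author are made-up values, and the relationship type and namespaces come from the published ECMA-376 standard.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
CORE_PROPS_REL = (
    "http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties"
)
CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_NS = "http://purl.org/dc/elements/1.1/"


def core_properties(package: zipfile.ZipFile) -> dict:
    """Find the core properties part via its relationship and return its fields."""
    rels = ET.fromstring(package.read("_rels/.rels"))
    for rel in rels.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == CORE_PROPS_REL:
            part = ET.fromstring(package.read(rel.get("Target", "").lstrip("/")))
            # Strip the namespace from each tag: '{ns}title' -> 'title'.
            return {child.tag.split("}")[-1]: (child.text or "") for child in part}
    return {}


def make_sample() -> zipfile.ZipFile:
    """Sample package with made-up metadata values."""
    rels = (
        f'<Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{CORE_PROPS_REL}" Target="docProps/core.xml"/>'
        "</Relationships>"
    )
    core = (
        f'<cp:coreProperties xmlns:cp="{CP_NS}" xmlns:dc="{DC_NS}">'
        "<dc:title>Sample Report</dc:title>"
        "<dc:creator>Example Author</dc:creator>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("_rels/.rels", rels)
        package.writestr("docProps/core.xml", core)
    return zipfile.ZipFile(io.BytesIO(buffer.getvalue()))


properties = core_properties(make_sample())
print(properties)
```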


Document Analyzer

Office Open XML Formats


Document Assembly Scenarios

Useful when documents need to be generated from structured data
  Auto-generate reports in Excel from data in a database
  Create documents for users from form data
  Repurpose existing data (slide libraries)

Recommendation: Start from a template


Document Sanitization Scenarios

Security
  Remove active content (VBA, ActiveX)

Privacy
  Remove comments, revisions, hidden text
  Remove or alter document properties

Legal
  Insert copyrights, watermarks, images

Run as part of workflows, publishing and compliance scenarios
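As one small example of the privacy case, the sketch below blanks the author fields in the core properties part while copying every other part unchanged. It is only an illustration: scrub_author and make_sample are hypothetical names, the sample values are made up, and the part name docProps/core.xml is assumed as a default for brevity, whereas robust code would locate the part through its relationship. The namespaces are those of the published ECMA-376 standard.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Keep readable prefixes when the part is written back out.
ET.register_namespace("cp", CP_NS)
ET.register_namespace("dc", DC_NS)


def scrub_author(package_bytes: bytes, core_part: str = "docProps/core.xml") -> bytes:
    """Return a copy of the package with creator and lastModifiedBy emptied."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as source, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == core_part:
                root = ET.fromstring(data)
                for tag in (f"{{{DC_NS}}}creator", f"{{{CP_NS}}}lastModifiedBy"):
                    for element in root.iter(tag):
                        element.text = ""
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            target.writestr(item.filename, data)
    return output.getvalue()


def make_sample() -> bytes:
    """Sample package with made-up personal metadata."""
    core = (
        f'<cp:coreProperties xmlns:cp="{CP_NS}" xmlns:dc="{DC_NS}">'
        "<dc:title>Sample Report</dc:title>"
        "<dc:creator>Example Author</dc:creator>"
        "<cp:lastModifiedBy>Example Editor</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core)
        package.writestr("word/document.xml", "<doc/>")
    return buffer.getvalue()


clean = scrub_author(make_sample())
with zipfile.ZipFile(io.BytesIO(clean)) as package:
    cleaned_core = ET.fromstring(package.read("docProps/core.xml"))
print(ET.tostring(cleaned_core, encoding="unicode"))
```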


Document Sanitization

Office Open XML Formats


New community formed to help bring developers together

Currently sponsored by almost 40 institutions from around the world

Community and website for information exchange

Free of charge: available to everyone who wants to participate; encourages development on all platforms

Be one of the first to join the community!

http://openxmldeveloper.org


Resources

OpenXMLDeveloper.org

Kevin Boske’s Blog: http://blogs.msdn.com/kevinboske

Brian’s Blog: http://blogs.msdn.com/brian_jones

WinFX Developer Center: http://msdn.microsoft.com/winfx/

Latest CTP: http://msdn.microsoft.com/windowsvista/getthebeta/default.aspx

XPS Blog: http://blogs.msdn.com/xps


© 2006 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.