Open XML Formats Jessica Gruber Consultant Microsoft Corporation.
Transcript of Open XML Formats Jessica Gruber Consultant Microsoft Corporation.
Open XML Formats
Jessica Gruber, Consultant, Microsoft Corporation
Agenda
Introduction to the new formats (demos)
Developer view of the Open XML Formats
Tools for working with Open XML Files (demos)
Server development scenarios (demos)
Question and Answer
Session Goals
Understand the benefits of Office Open XML solutions for developers
Introduction to Microsoft tools for Open XML development
Learn basic developer patterns for:
Working with Packages and Parts (Zip)
Navigating through relationships (XML)
Manipulating data within parts (XML)
Kick off the conversation on OpenXMLDeveloper.org
Open XML Formats
New XML file formats for Word, Excel and PowerPoint
New formats will be the default file formats, with new file type extensions (.docx; .pptx; .xlsx)
Fully 100% compatible with existing formats
Open, transparent format improves interoperability
XML: Transparent XML format enables new integration scenarios for documents and LOB systems
ZIP container: Allows for standard compression on all files without user effort
Licensing: Removed need for license by providing a Covenant that says we won’t enforce IP against folks implementing the format (100% royalty free)
Standardization
Ecma International created TC45 to fully document the Open XML formats
Members include: Apple, Barclays Capital, BP, the British Library, Essilor, Intel Corporation, NextPage Inc., Statoil ASA and Toshiba
Current spec is already over 2000 pages
Open XML Formats (cont’d)
Added Benefits: compact and robust
ZIP container allows for standard compression on all files without user effort (dramatic file size improvements)
Significantly more robust files to help minimize data loss
Backward Compatible: Office 2000, Office XP, Office 2003 will all support the new formats
Patches for compatibility available by launch
Open, edit and save new formats
Legacy support: Current Office 97-2003 binary file formats supported
Support for XML formats from Office 2003, Office XP continued
Developers: Endless potential for developers
Build solutions to read, write, and modify Office files (without the need to run Office APIs)
Evolution Of File Formats
Office 97: Existing binary file formats, designed in 1994, launched in Office 97
Office 2000: Early Innovation (XML document properties)
Office XP: First XML Format (Spreadsheet XML)
Office 2003: Breakthrough XML Support (WordML, SpreadsheetML; custom-defined schema)
Office 2007 (“Wave 12”): New XML Formats (XML file format default; XML PowerPoint format)
Components of the new formats
Package: ZIP container
Part: The “files” inside the ZIP
Content Types: Each part has a content type that is enforced on open
Relationships: Any part that references another part must do so via a relationship
[Diagram: example workbook package. Parts shown: Document Properties, Application Properties, Custom Doc. Props., Workbook, Sheet 1, Sheet 2, Sheet 3, Styles, Chart, Strings, connected by relationships.]
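The four components above can be explored with nothing more than a ZIP library and an XML parser. The following is a minimal sketch in Python, not taken from the session: it builds a tiny illustrative package in memory (the part names and the helper names `build_sample_package` and `content_types` are hypothetical) and maps each part to its content type by reading `[Content_Types].xml`. The namespace and content type strings are those of the standardized format and may differ in beta-era files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
RELS_TYPE = "application/vnd.openxmlformats-package.relationships+xml"
MAIN_TYPE = ("application/vnd.openxmlformats-officedocument."
             "wordprocessingml.document.main+xml")


def build_sample_package() -> bytes:
    """Build a tiny illustrative package in memory (not a complete Word file)."""
    types_xml = (
        f'<Types xmlns="{CT_NS}">'
        f'<Default Extension="rels" ContentType="{RELS_TYPE}"/>'
        '<Default Extension="xml" ContentType="application/xml"/>'
        f'<Override PartName="/word/document.xml" ContentType="{MAIN_TYPE}"/>'
        "</Types>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("[Content_Types].xml", types_xml)
        z.writestr("_rels/.rels", "<Relationships/>")
        z.writestr("word/document.xml", "<document/>")
        z.writestr("docProps/core.xml", "<coreProperties/>")
    return buf.getvalue()


def content_types(package_bytes: bytes) -> dict:
    """Map every part name to its content type (Override wins over Default)."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as z:
        root = ET.fromstring(z.read("[Content_Types].xml"))
        names = z.namelist()
    defaults = {d.get("Extension").lower(): d.get("ContentType")
                for d in root.findall(f"{{{CT_NS}}}Default")}
    overrides = {o.get("PartName"): o.get("ContentType")
                 for o in root.findall(f"{{{CT_NS}}}Override")}
    result = {}
    for name in names:
        if name == "[Content_Types].xml":
            continue  # the content type stream itself is not a part
        part = "/" + name
        extension = name.rsplit(".", 1)[-1].lower()
        result[part] = overrides.get(part, defaults.get(extension))
    return result


for part, ctype in content_types(build_sample_package()).items():
    print(part, "->", ctype)
```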
Introduction to the new File Format
Document Analyzer
Office Open XML Formats
Open XML Formats Architecture
User view: single Office “file”
Questionaire.docx
File Container
Document Properties
Comments
Charts
Embedded code / macros
Images, video, sound
Custom-defined XML
WordML / SpreadsheetML, etc.
Developer view: modular file
Document Parts
Most parts are XML
Each XML part is a discrete, compressed component
Can add, extract and modify individual parts using any Zip implementation
Corruption or absence of any part would not prohibit the file from being opened
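To illustrate the point that individual parts can be added, extracted and modified with any Zip implementation, here is a minimal sketch using Python's standard `zipfile` module. The sample package and the helpers `read_part` and `write_part` are hypothetical names for illustration. Because ZIP entries cannot be edited in place with this module, the sketch copies every untouched part into a new archive.

```python
import io
import zipfile


def build_sample_package() -> bytes:
    """Create a small in-memory ZIP standing in for a real .docx file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("word/document.xml", "<document>original</document>")
        z.writestr("docProps/app.xml", "<Properties/>")
    return buf.getvalue()


def read_part(package_bytes: bytes, part_name: str) -> bytes:
    """Extract a single part without touching the rest of the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as z:
        return z.read(part_name)


def write_part(package_bytes: bytes, part_name: str, data: bytes) -> bytes:
    """Return a new package in which one part is replaced or added."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            if name != part_name:
                dst.writestr(name, src.read(name))  # copy untouched parts
        dst.writestr(part_name, data)
    return out.getvalue()


package = build_sample_package()
updated = write_part(package, "word/document.xml", b"<document>edited</document>")
print(read_part(updated, "word/document.xml").decode())
```

In a real package, adding a brand-new part usually also means registering its content type and adding a relationship that points to it; the sketch leaves that out.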
Benefits Of Open XML Solutions
No longer need to automate client applications to open and modify files
Unsupported solution on server
Clients not designed for this scenario
Reliability
Access directly the parts you need
Avoid corruption
Transparency
Direct access to your data!
Tools for Accessing Data In Office Open XML files
XML Editing
Notepad?
System.Xml makes this easier
ZIP Manipulation
Compressed Folders in Windows?
Third-Party Zip Libraries
Microsoft’s Packaging APIs
Office Open XML Resource Kit
Code Snippets - Beta 2
C# and VB.NET
Validation Library
Parses a file and reports on schema and relationship errors and warnings
Schemas
Already in the ECMA document
System.IO.Packaging
Part of Windows Presentation Foundation (WinFX)
Ships with Vista; betas available now
Requires .NET 2.0
Enables package manipulation for:
Office Open XML File Formats
XML Paper Specification Files
Any Open Packaging Conventions files
Classes: Package, PackagePart, PackagePartCollection, PackageRelationship, PackageRelationshipCollection, PackUriHelper
System.IO.Packaging API functionality
Create/Open packages
Create and delete parts and relationships
Read and write part streams
Iterate through collections of parts and relationships
Reading Data From Files
Microsoft Office PowerPoint 2007
System.IO.Packaging.Package
Package class provides methods to create, enumerate and delete the following entities
Package
Package Relationships
PackageProperties
Parts
[Diagram: a Package and its Package Relationships, which point to Common Package Parts (Core Properties, Thumbnail, Digital Signatures) and Specific Format Parts (officeDocument, XML Parts, etc.). Each Part can have its own Rels.]
System.IO.Packaging.Relationship
Relationships tie the parts together
Required to find parts (part names are not guaranteed)
Iterate through RelationshipCollection by Type or ID
Relationship Properties:
ID
Package
RelationshipType
SourceUri
TargetMode
TargetUri
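Since part names are not guaranteed, the main document part is best located through its relationship type instead of a hard-coded path. The following minimal Python sketch shows the idea outside of .NET; it is an analogue, not the System.IO.Packaging API. The sample package deliberately uses an unusual part name, the second relationship type is made up, and the officeDocument type URI is the one from the standardized format (beta-era files may use a different URI).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_DOCUMENT = ("http://schemas.openxmlformats.org/officeDocument/2006/"
                   "relationships/officeDocument")
OTHER_TYPE = "http://example.com/hypothetical/other"  # made up for the demo


def build_sample_package() -> bytes:
    """Illustrative package whose main part has a non-standard name."""
    rels = (
        f'<Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{OFFICE_DOCUMENT}" Target="content/main.xml"/>'
        f'<Relationship Id="rId2" Type="{OTHER_TYPE}" Target="extra/info.xml"/>'
        "</Relationships>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("_rels/.rels", rels)
        z.writestr("content/main.xml", "<document/>")
        z.writestr("extra/info.xml", "<info/>")
    return buf.getvalue()


def package_relationships(package_bytes: bytes) -> list:
    """Return the package-level relationships as a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as z:
        root = ET.fromstring(z.read("_rels/.rels"))
    return [
        {"Id": r.get("Id"), "Type": r.get("Type"), "Target": r.get("Target"),
         "TargetMode": r.get("TargetMode", "Internal")}
        for r in root.findall(f"{{{REL_NS}}}Relationship")
    ]


def find_by_type(relationships: list, rel_type: str) -> list:
    """All relationships of the given type."""
    return [r for r in relationships if r["Type"] == rel_type]


def find_by_id(relationships: list, rel_id: str):
    """The relationship with the given ID, or None."""
    return next((r for r in relationships if r["Id"] == rel_id), None)


rels = package_relationships(build_sample_package())
main = find_by_type(rels, OFFICE_DOCUMENT)[0]
print("Main part target:", main["Target"])
```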
System.IO.Packaging.PackagePart
Parts are the objects of data within the Package
PackagePart provides support to create, enumerate and delete part relationships
Get Part data as Stream
PackagePart Properties:
CompressionOption
ContentType
Package
Uri
XML Part
<w:body>
  <w:p w:rsidR="001B7EF4" w:rsidRDefault="001B7EF4">
    <w:r>
      <w:t>The Quick Brown Fox jumped over the river.</w:t>
    </w:r>
  </w:p>
  …
XML Part
<w:body>
  <w:p w:rsidR="001B7EF4" w:rsidRDefault="001B7EF4">
    <w:r>
      <w:t>The Cow jumped over the moon.</w:t>
    </w:r>
  </w:p>
  …
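The two XML parts above differ only in the text of one run, which suggests the basic pattern for manipulating data within a part: parse the XML, change the `w:t` text, serialize it again. Below is a minimal Python sketch of that pattern. It assumes the WordprocessingML namespace URI of the standardized format, wraps the fragment in a `w:document` root so that it is well-formed, and uses hypothetical helper names. Real documents often split a sentence across several runs, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Namespace of the standardized format; beta-era files may differ.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)  # keep the familiar "w:" prefix on output

DOCUMENT_XML = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p w:rsidR="001B7EF4" w:rsidRDefault="001B7EF4">
      <w:r>
        <w:t>The Quick Brown Fox jumped over the river.</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>"""


def paragraph_texts(xml_text: str) -> list:
    """Return the concatenated w:t text of every paragraph."""
    root = ET.fromstring(xml_text)
    return ["".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
            for p in root.iter(f"{{{W_NS}}}p")]


def replace_text(xml_text: str, old: str, new: str) -> str:
    """Replace `old` with `new` inside individual w:t elements."""
    root = ET.fromstring(xml_text)
    for t in root.iter(f"{{{W_NS}}}t"):
        if t.text and old in t.text:
            t.text = t.text.replace(old, new)
    return ET.tostring(root, encoding="unicode")


edited = replace_text(DOCUMENT_XML,
                      "The Quick Brown Fox jumped over the river.",
                      "The Cow jumped over the moon.")
print(paragraph_texts(edited))
```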
System.IO.Packaging.PackUriHelper
Helper class to aid working with URIs
URIs are required to get parts
Create or get URIs for:
Packages
Parts
Relationship parts
Resolve relative URIs for parts from source to target part
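The URI handling described here follows simple path rules, which can be sketched in plain Python. This is an analogue for illustration and not the .NET helper class: the function names are hypothetical, and the rules shown (a part's relationships live in a `_rels` folder next to it, and relative targets are resolved against the source part's folder) follow the packaging conventions as commonly described.

```python
import posixpath


def relationship_part_name(part_name: str) -> str:
    """Name of the relationships part for a part ("/" means the package)."""
    if part_name == "/":
        return "/_rels/.rels"
    folder, name = posixpath.split(part_name)
    return posixpath.join(folder, "_rels", name + ".rels")


def resolve_target(source_part: str, target: str) -> str:
    """Resolve a relationship target relative to its source part."""
    base = "/" if source_part == "/" else posixpath.dirname(source_part)
    # join() keeps absolute targets as they are; normpath() folds ".." segments
    return posixpath.normpath(posixpath.join(base, target))


print(relationship_part_name("/word/document.xml"))
print(resolve_target("/word/document.xml", "../customXml/item1.xml"))
```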
The Role Of XML With Documents
Scenario | Example
Document Assembly: Server-based or user-assisted construction of documents from archived content or database content | Create sales reports from financial and forecast data stored in a CRM system
Content Reuse: Much easier to move content between documents, including different document types | Apply content stored in Word documents to Web pages quickly and efficiently
Content Tagging: Add domain-specific metadata to document content to enable custom solutions | Tag presentations using a specific taxonomy to improve knowledge management efficiency
Document Interrogation: Query document repositories based on custom data, content types or document metadata | Search for all documents containing a specific company name or sales contact
Document Sanitization: Remove unwanted content like comments or embedded code from your document when appropriate | Remove all tracked changes and comments from a Word document before it is published
Document Interrogation Scenarios
When you need meta-data about Office files on a server
Building reports from data in files
Workflow and Content Management scenarios
Validate compliance
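As a rough illustration of interrogation on a server, the sketch below reads the core properties part of several packages and finds the ones written by a given author, without starting any Office application. It is a minimal Python sketch under stated assumptions: the part is read from `docProps/core.xml` (the usual location, although a robust tool would find it through its relationship), the file names and property values are invented, and the helper names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_NS = "http://purl.org/dc/elements/1.1/"
CORE_PART = "docProps/core.xml"  # usual location; an assumption in this sketch


def make_package(title: str, creator: str) -> bytes:
    """Build an illustrative package that holds only core properties."""
    core = (
        f'<cp:coreProperties xmlns:cp="{CP_NS}" xmlns:dc="{DC_NS}">'
        f"<dc:title>{title}</dc:title><dc:creator>{creator}</dc:creator>"
        "</cp:coreProperties>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr(CORE_PART, core)
    return buf.getvalue()


def core_properties(package_bytes: bytes) -> dict:
    """Return the core properties as {local element name: text}."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as z:
        root = ET.fromstring(z.read(CORE_PART))
    return {child.tag.split("}", 1)[1]: (child.text or "") for child in root}


def find_by_creator(library: dict, creator: str) -> list:
    """Names of all packages whose dc:creator matches."""
    return sorted(name for name, data in library.items()
                  if core_properties(data).get("creator") == creator)


# Invented sample repository: file name -> package bytes
library = {
    "forecast.docx": make_package("Forecast", "A. Author"),
    "summary.docx": make_package("Summary", "B. Writer"),
    "plan.docx": make_package("Plan", "A. Author"),
}
print(find_by_creator(library, "A. Author"))
```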
Document Analyzer
Office Open XML Formats
Document Assembly Scenarios
Useful when documents need to be generated from structured data
Auto generate reports in Excel from data in database
Create documents for users from form data
Repurpose existing data (slide libraries)
Recommendation: Start from a template
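One way to read the "start from a template" recommendation is sketched below in Python: keep a template package, copy all of its parts, and fill placeholders in the main part from a data record. The `{{name}}` placeholder convention, the sample record and the helper names are all hypothetical, and the main part is assumed to be `word/document.xml`. Values are XML-escaped before insertion so that the part stays well-formed.

```python
import io
import zipfile
from xml.sax.saxutils import escape

MAIN_PART = "word/document.xml"  # assumed location of the main part
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
TEMPLATE_XML = (
    '<w:document xmlns:w="' + W_NS + '"><w:body><w:p><w:r>'
    "<w:t>Report for {{customer}}: forecast {{forecast}}</w:t>"
    "</w:r></w:p></w:body></w:document>"
)


def build_template() -> bytes:
    """Create an illustrative template package in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr(MAIN_PART, TEMPLATE_XML)
        z.writestr("docProps/app.xml", "<Properties/>")
    return buf.getvalue()


def assemble(template_bytes: bytes, values: dict) -> bytes:
    """Copy the template and fill {{key}} placeholders in the main part."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            data = src.read(name)
            if name == MAIN_PART:
                text = data.decode("utf-8")
                for key, value in values.items():
                    # escape() protects characters such as & and <
                    text = text.replace("{{" + key + "}}", escape(str(value)))
                data = text.encode("utf-8")
            dst.writestr(name, data)
    return out.getvalue()


record = {"customer": "Smith & Sons", "forecast": 1200}  # invented data
report = assemble(build_template(), record)
with zipfile.ZipFile(io.BytesIO(report)) as z:
    print(z.read(MAIN_PART).decode("utf-8"))
```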
Document Sanitization Scenarios
Security
Remove active content (VBA, ActiveX)
Privacy
Remove comments, revisions, hidden text
Remove or alter document properties
Legal
Insert copyrights, watermarks, images
Run as part of Workflows, publishing, compliance scenarios
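The privacy item "remove or alter document properties" is the simplest of these to sketch. The Python example below strips the author-related elements from the core properties part and copies every other part unchanged. It is a minimal sketch, not a complete sanitizer: the part location `docProps/core.xml`, the choice of elements to remove, the sample values and the helper names are assumptions. Removing comments, revisions or macros takes more work, because the related parts, relationships and references also have to be cleaned up.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_NS = "http://purl.org/dc/elements/1.1/"
CORE_PART = "docProps/core.xml"  # usual location; an assumption in this sketch

ET.register_namespace("cp", CP_NS)  # keep readable prefixes on output
ET.register_namespace("dc", DC_NS)

# Elements treated as personal data here (an illustrative choice)
PERSONAL_TAGS = {f"{{{DC_NS}}}creator", f"{{{CP_NS}}}lastModifiedBy"}


def build_sample_package() -> bytes:
    """Create an illustrative package with invented property values."""
    core = (
        f'<cp:coreProperties xmlns:cp="{CP_NS}" xmlns:dc="{DC_NS}">'
        "<dc:title>Quarterly report</dc:title>"
        "<dc:creator>A. Author</dc:creator>"
        "<cp:lastModifiedBy>R. Reviewer</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr(CORE_PART, core)
        z.writestr("word/document.xml", "<document/>")
    return buf.getvalue()


def sanitize(package_bytes: bytes) -> bytes:
    """Return a copy of the package without author-related core properties."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            data = src.read(name)
            if name == CORE_PART:
                root = ET.fromstring(data)
                for child in list(root):
                    if child.tag in PERSONAL_TAGS:
                        root.remove(child)
                data = ET.tostring(root, encoding="unicode").encode("utf-8")
            dst.writestr(name, data)
    return out.getvalue()


clean = sanitize(build_sample_package())
with zipfile.ZipFile(io.BytesIO(clean)) as z:
    print(z.read(CORE_PART).decode("utf-8"))
```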
Document Sanitization
Office Open XML Formats
New community formed to help bring developers together
Currently sponsored by almost 40 institutions from around the world
Community and website for information exchange
Free of Charge: Available to everyone that wants to participate; encourage development on all platforms
Be one of the first to join the community!
http://openxmldeveloper.org
Resources
OpenXMLDeveloper.org
Kevin Boske’s Blog: http://blogs.msdn.com/kevinboske
Brian’s Blog: http://blogs.msdn.com/brian_jones
WinFX Developer Center: http://msdn.microsoft.com/winfx/
Latest CTP: http://msdn.microsoft.com/windowsvista/getthebeta/default.aspx
XPS Blog: http://blogs.msdn.com/xps
© 2006 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.