OpenDocument Scripting
Transcript of OpenDocument Scripting
ODF Scripting: how ODF makes it easy to tell your computer to do your office work for you
Marco Fioretti OOOCon 2010 http://mfioretti.com 2010/9/2 Budapest
Author intro
● Marco Fioretti
● Freelance writer, trainer, activist
● Linux Journal Contributing Editor, contributor to Pc Professionale, Linux Format and other magazines
● Author of the Family Guide to Digital Freedom (http://digifreedom.net)
● Co-author of the O'Reilly book on Open Government, 2010
● Member of the ODF Fellowship and Digistan.org
● Advanced ODF user, but not a programmer!
What is ODF Scripting?
● a simple and effective way to quickly write scripts to:
  ● generate, filter or process ODF texts, presentations and spreadsheets
● particularly productive on low-volume, but boring and repetitive tasks
● made possible by the openness and simplicity of the OpenDocument Format
● ...but still unknown to most (potential!) users of ODF and OO.o
How does ODF Scripting work?
What I call ODF Scripting is based on two facts:
• any ODF file is just a ZIP archive containing plain text files and other objects (e.g. images), normally in standard formats
• there are lots of FOSS utilities and scripting tools made just to process plain text, and they work on every platform
What's inside an ODF file?
Every OpenDocument file is simply a compressed ZIP archive containing various elements. Some of them are listed here:
● content.xml: the actual textual content of the document. Complex XML markup, but still readable by humans
● meta.xml: metadata like author name, word count, language, date of last modification, etc.
● styles.xml: style information like font size, colour, page width, for pages, characters, paragraphs...
● Separate folders for binary objects: images, macros, ...
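The package layout described above can be checked with a few lines of code. The sketch below uses Python purely for illustration (the talk itself relies on shell and Perl). The demo archive built in memory, its dummy file contents, and the Pictures/ folder name are assumptions made so that the example runs without a real document.

```python
import io
import zipfile


def build_demo_package() -> bytes:
    """Build a tiny ODF-like ZIP archive in memory (stand-in for a real .odt)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        # ODF packages keep the "mimetype" entry first and uncompressed.
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text",
                         compress_type=zipfile.ZIP_STORED)
        for name in ("content.xml", "meta.xml", "styles.xml"):
            package.writestr(name, "<placeholder/>")  # dummy content
        # Assumption: binary objects such as images sit in a Pictures/ folder.
        package.writestr("Pictures/logo.png", b"not a real image")
    return buffer.getvalue()


def list_members(odf_bytes: bytes) -> list[str]:
    """Return the names of all files stored inside an ODF package."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        return package.namelist()


if __name__ == "__main__":
    for member in list_members(build_demo_package()):
        print(member)
```

For a real document, the bytes would come from something like open("letter.odt", "rb").read(), where letter.odt is a hypothetical file name.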
ODF Scripting document generation
● create (ONCE per project!) an empty document with placeholder strings
● write simple, short, ad-hoc scripts (shell, Perl, whatever) that:
  ● unzip that empty document
  ● replace placeholder strings in content.xml with actual values from text files or database queries
  ● zip everything together and give the new file the right extension
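The unzip, replace, zip cycle of this slide could look roughly like the following. This is a minimal sketch in Python, not the author's shell script; the function names, the demo template and the placeholder names are assumptions. It also assumes each placeholder appears as one unbroken string in content.xml.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def fill_template(template: bytes, values: dict[str, str]) -> bytes:
    """Copy an ODF package, replacing placeholder strings in content.xml."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template)) as source, \
            zipfile.ZipFile(output, "w") as target:
        for info in source.infolist():
            data = source.read(info.filename)
            if info.filename == "content.xml":
                text = data.decode("utf-8")
                for placeholder, value in values.items():
                    # Escape &, < and > so the XML stays well-formed.
                    text = text.replace(placeholder, escape(value))
                data = text.encode("utf-8")
            # Keep "mimetype" uncompressed, as ODF packages expect.
            method = (zipfile.ZIP_STORED if info.filename == "mimetype"
                      else zipfile.ZIP_DEFLATED)
            target.writestr(info.filename, data, compress_type=method)
    return output.getvalue()


def demo_template() -> bytes:
    """Hypothetical template: a package whose content.xml holds placeholders."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr(
            "content.xml",
            "<text:p>Date: INVOICE_DATE</text:p><text:p>DESCRIPTION</text:p>")
    return buffer.getvalue()
```

Writing the returned bytes to a file whose name ends in .odt (or .ods, .odp) corresponds to the "give the new file the right extension" step.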
Document analysis and processing with ODF scripting
● study the structure of the ODF file(s) to process, to find out which XML fields and/or text values are interesting
● write ad-hoc scripts (shell, Perl, whatever) that:
  ● unzip the ODF file
  ● extract relevant strings via Perl, grep, awk, whatever
  ● process them as needed
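As an illustration of the "extract relevant strings" step, the sketch below pulls the text of every paragraph out of content.xml. It uses Python and an XML parser instead of grep or awk, which is a substitution made here for robustness, not the method of the talk. The sample document and the function names are assumptions; the namespace URIs are the standard ODF ones.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hypothetical, heavily simplified content.xml used only for the demo.
SAMPLE_CONTENT = (
    '<office:document-content'
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    '<office:body><office:text>'
    '<text:p>Question 1: <text:span>B</text:span></text:p>'
    '<text:p>Question 2: C</text:p>'
    '</office:text></office:body></office:document-content>'
)


def demo_package() -> bytes:
    """Wrap the sample content.xml in an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", SAMPLE_CONTENT)
    return buffer.getvalue()


def paragraphs(odf_bytes: bytes) -> list[str]:
    """Return the plain text of every <text:p> element in content.xml."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    # itertext() joins the text of nested elements such as <text:span>.
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]
```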
Example (1): Invoice generation
marco => cat my_data.sh
INVOICE_DATE='2010/05/15'
VENDOR_CODE='007'
PO_NUMBER='Purchase Order #1'
TOTAL=10
ISSUE=150
DESCRIPTION='Here is your invoice'
1 ASCII data file + a 35-line shell script
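The 35-line shell script itself is not reproduced in the slides. As a rough idea of its first step, the Python sketch below reads a data file in the NAME='value' style shown above into a dictionary of placeholder values. The function name is an assumption, and treating the file as plain assignments (rather than sourcing it in a shell) is a simplification.

```python
import shlex

# Contents of my_data.sh as shown on the slide.
MY_DATA = """\
INVOICE_DATE='2010/05/15'
VENDOR_CODE='007'
PO_NUMBER='Purchase Order #1'
TOTAL=10
ISSUE=150
DESCRIPTION='Here is your invoice'
"""


def parse_data_file(text: str) -> dict[str, str]:
    """Parse shell-style NAME=value assignments into a dictionary."""
    values = {}
    for line in text.splitlines():
        # shlex handles the single quotes the same way a POSIX shell would.
        for token in shlex.split(line):
            name, separator, value = token.partition("=")
            if separator:  # ignore tokens that are not assignments
                values[name] = value
    return values


if __name__ == "__main__":
    for name, value in parse_data_file(MY_DATA).items():
        print(f"{name} -> {value}")
```

Each key can then serve as the placeholder string to look for in content.xml, and each value as its replacement.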
Example (2): Spreadsheets with graphs and formulas from log files
Time-of-day   BW1    BW2
Midnight      4.5    6.4
              6.3    6.3
              3.1    6.1
              1.85   5.87
Still editable formulas!!!
How were the spreadsheets generated?
A 27-line shell script calling a 66-line Perl script that substitutes placeholders like this in the XML files:
</table:table-row>MY_DATA_GO_HERE</table:table>
with snippets of XML code (copied from those same files) that describe table cells, but whose numeric contents were loaded from the ASCII input file
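The placeholder substitution described here can be sketched as follows, in Python instead of the original shell and Perl. The cell snippets are hand-written, simplified assumptions (real snippets copied from a spreadsheet would also carry style attributes), and the function names are hypothetical. Only the MY_DATA_GO_HERE placeholder comes from the slide.

```python
from xml.sax.saxutils import escape

PLACEHOLDER = "MY_DATA_GO_HERE"


def text_cell(label: str) -> str:
    """Simplified XML for a spreadsheet cell holding a text label."""
    return ('<table:table-cell office:value-type="string">'
            f'<text:p>{escape(label)}</text:p></table:table-cell>')


def float_cell(value: float) -> str:
    """Simplified XML for a spreadsheet cell holding a number."""
    return (f'<table:table-cell office:value-type="float" office:value="{value}">'
            f'<text:p>{value}</text:p></table:table-cell>')


def rows_xml(rows) -> str:
    """Build one <table:table-row> per (label, number, number, ...) tuple."""
    parts = []
    for label, *numbers in rows:
        cells = text_cell(label) + "".join(float_cell(n) for n in numbers)
        parts.append(f"<table:table-row>{cells}</table:table-row>")
    return "".join(parts)


def insert_rows(content_xml: str, rows) -> str:
    """Replace the placeholder in content.xml with the generated rows."""
    return content_xml.replace(PLACEHOLDER, rows_xml(rows))
```

Because only data rows are injected, any formulas and charts already present in the template spreadsheet are left untouched.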
Example (3): Slideshow drafts from plain text outlines
Q: how did I generate the first version of this slideshow?
A: with the same trick used in the spreadsheet example, and scripts of the same complexity
Other ODF scripting recipes I plan to write
● Image processing with ImageMagick:
  ● unzip the ODF file
  ● process every file in the image folder in a shell loop, using composite, convert or similar ImageMagick utilities
  ● zip everything together, assign the proper extension
● Practical uses:
  ● add a watermark or caption to each image in a collection of ODF files
  ● reduce resolution, to save disk space
  ● replace company logos or other clipart
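A possible shape for this recipe, sketched in Python: every file in the image folder is passed through a caller-supplied function, and the package is rebuilt. The actual image work (for example a call to an ImageMagick utility) is deliberately left to that function and not shown. The Pictures/ folder name, the function names and the demo package are assumptions.

```python
import io
import zipfile
from typing import Callable

IMAGE_FOLDER = "Pictures/"  # assumption: where the office suite stored images


def transform_images(odf_bytes: bytes,
                     transform: Callable[[str, bytes], bytes]) -> bytes:
    """Rebuild an ODF package, passing every image through transform()."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as source, \
            zipfile.ZipFile(output, "w") as target:
        for info in source.infolist():
            data = source.read(info.filename)
            if info.filename.startswith(IMAGE_FOLDER) and not info.is_dir():
                data = transform(info.filename, data)
            # Keep "mimetype" uncompressed, as ODF packages expect.
            method = (zipfile.ZIP_STORED if info.filename == "mimetype"
                      else zipfile.ZIP_DEFLATED)
            target.writestr(info.filename, data, compress_type=method)
    return output.getvalue()


def demo_package() -> bytes:
    """Hypothetical package with one fake image, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", "<placeholder/>")
        package.writestr("Pictures/logo.png", b"old logo")
    return buffer.getvalue()
```

Replacing a company logo is then a matter of passing a function that returns the bytes of the new logo for the matching file name.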
Other ODF scripting recipes I plan to write (2)
● Metadata processing:
  ● unzip the ODF file
  ● use grep, Perl, sed, whatever to replace or update the current values of
    ● phone numbers, addresses, names
    ● author name or any other metadata
  ● zip everything together, assign the proper extension
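The replace step of this recipe might be sketched like this, working on the text of meta.xml once it has been unzipped. It is a sed-style regular expression substitution written in Python. It assumes the author is stored in a dc:creator element written with exactly that prefix and with separate opening and closing tags. The function name and sample XML are hypothetical.

```python
import re
from xml.sax.saxutils import escape

CREATOR = re.compile(r"(<dc:creator>).*?(</dc:creator>)", re.DOTALL)


def set_creator(meta_xml: str, new_name: str) -> str:
    """Replace the text of every <dc:creator> element in meta.xml."""
    safe_name = escape(new_name)  # keep the XML well-formed
    # A function is used as replacement so backslashes in the name stay literal.
    return CREATOR.sub(lambda m: m.group(1) + safe_name + m.group(2), meta_xml)


if __name__ == "__main__":
    sample = ("<office:meta><dc:creator>Old Name</dc:creator>"
              "<dc:date>2010-09-02T10:00:00</dc:date></office:meta>")
    print(set_creator(sample, "Marco Fioretti"))
```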
Other ODF scripting recipes I plan to write (3)
● Add data from ODF files to databases or generate graphs:
  ● unzip the ODF file
  ● grep interesting strings from meta.xml or content.xml
  ● add them to a database, generate graphs with gnuplot...
● Example: extract answers to multiple choice tests from .odt files received via email, to calculate student grades, averages...
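The grading example could be sketched along these lines, starting from paragraph strings that have already been extracted from each student's content.xml. Everything specific here is an assumption made for illustration: the "Q1: B" answer format, the table layout, the function names, and the use of Python with SQLite.

```python
import re
import sqlite3

# Hypothetical answer format: one paragraph per question, e.g. "Q1: B".
ANSWER_LINE = re.compile(r"^Q(\d+):\s*([A-Z])\s*$")


def score(paragraphs, answer_key) -> int:
    """Count the paragraphs whose answer matches the answer key."""
    correct = 0
    for line in paragraphs:
        match = ANSWER_LINE.match(line)
        if match and answer_key.get(int(match.group(1))) == match.group(2):
            correct += 1
    return correct


def store_scores(connection, scores) -> None:
    """Save a {student: score} mapping in a 'grades' table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS grades (student TEXT PRIMARY KEY, score INTEGER)")
    connection.executemany(
        "INSERT OR REPLACE INTO grades VALUES (?, ?)", scores.items())
    connection.commit()


def average_score(connection) -> float:
    """Return the average score over all stored students."""
    return connection.execute("SELECT AVG(score) FROM grades").fetchone()[0]
```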
Other ODF scripting recipes I plan to write (4)
● Courseware!
  ● Generate didactical DVDs from the same notes used to generate ODP slideshows
  ● Generate multiple choice tests from sources in the same language used in Moodle to run the same tests online (http://docs.moodle.org/en/GIFT)
ASCII source with GIFT markup:
Thanksgiving is celebrated on the {
~second
~third
=fourth
} Thursday of November.
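To turn a GIFT source like the one above into test material, a script first has to split it into prompt, options and correct answer. The Python sketch below handles only the simple subset shown on the slide (one answer block, ~ for wrong options, = for the right one). Feedback, escapes and other GIFT features are ignored, and the function name and the blank marker are assumptions.

```python
import re

QUESTION = re.compile(
    r"^(?P<before>.*?)\{(?P<answers>.*?)\}(?P<after>.*)$", re.DOTALL)
OPTION = re.compile(r"([~=])\s*([^~=]+)")

SAMPLE = """Thanksgiving is celebrated on the {
~second
~third
=fourth
} Thursday of November."""


def parse_gift(source: str):
    """Split a simple GIFT question into (prompt, options, correct answer)."""
    match = QUESTION.match(source.strip())
    if match is None:
        raise ValueError("no {...} answer block found")
    options, correct = [], None
    for marker, text in OPTION.findall(match.group("answers")):
        text = text.strip()
        options.append(text)
        if marker == "=":  # "=" marks the right answer, "~" a wrong one
            correct = text
    prompt = f"{match.group('before').strip()} _____ {match.group('after').strip()}"
    return prompt, options, correct
```

The resulting prompt and options could then be dropped into placeholders of an ODF test template.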
Other ODF scripting example, from Rob Weir
● Courseware again
  ● Use the same approach to generate math exercises automatically
  ● import into ODF files formulas created with Mathematica and saved as MathML
Pros of ODF Scripting (1)
● Very simple method, easy to learn and use whenever one needs to save time
● Flexible: everybody can use his or her preferred scripting or source markup language: it's a way of working, not a program!!!
● Very portable, with the smallest possible number of dependencies
Pros of ODF Scripting (2)
● Huge time saver in the many cases where an industrial-strength ODF/XML processor cannot be installed or would be overkill (to learn, at least)
● SIMPLE and perfectly adequate to the real needs of many home, school or SME users: save time on boring, simple, repetitive, never-ending modifications or analyses of files that have the same structure
● useful for webmasters who need to assemble and serve ODF stuff on the fly
Pros of ODF Scripting (3)
● can mix, reuse or generate on the fly the "source code", that is, the strings from plain text files, databases and whatnot that must be inserted in the ODF files
● Does not need OpenOffice.org or any other office suite! Perfect for servers or very limited systems
● integrates well with any other command line data processing tool (including OpenOffice.org, when necessary)
Pros of ODF Scripting (4)
● Most important advantage:
  ● ODF Scripting is much easier and much more acceptable from the psychological point of view!
  ● easier than LaTeX and, unlike LaTeX:
    ● accepts standard office documents as "input"
    ● produces stuff DIRECTLY readable and EDITABLE with "normal" office suites
Pros of ODF Scripting (5)
● The ODF Scripting way of working is "real-world-office-ready", that is:
  • compatible with secretaries and existing material (e.g. corporate templates in ODF format)
  • results can even be converted (even by running OO.o in a script) to MS Office formats if really, really necessary (but why??? It's already ODF!)
Cons of ODF Scripting
● not really scalable to flexible processing of complex ODF documents (it's a way to quickly write single-purpose, throw-away tools)
● Slower than other solutions
● but... who cares??? No, really!
  ● those users for whom these are serious issues already have real XML parsers and other similar tools
  ● ODF scripting is for all the others
Cons of ODF Scripting (2)
● the biggest obstacles are the psychological barriers:
  ● generic fear or hate of the command line (you need to learn simple shell scripting to do ODF scripting)
  ● fear of messing with/inside objects (office files) that aren't believed to be touchable by mere humans:
    ● "if it were really that simple, why on Earth would we need very expensive/complex software just to write a letter with justified paragraphs?"
Cons or opportunity?
A most important advantage of ODF Scripting is proving that those taboos are wrong and that office files are something that normal people CAN handle by themselves, and over which they can have complete control without asking "permission" from anybody
Feedback from users
Wow! This opens windows of opportunities!
Somehow I've totally missed out on the fact that odt documents are just zip files... Use OpenOffice as a mark-up language... that is just frigging awesome!
How can ODF Scripting help to promote OpenOffice?
● flood the world with ODF files!!!
● stimulate cultural change:
  ● prove the openness, freedom and robustness of ODF and its ecosystem
  ● encourage migrations: it proves that converting legacy collections of old corporate templates, reports and such isn't NECESSARILY something that requires expensive consulting
Summary
● ODF Scripting is not an application, it's an attitude:
  ● it's just being aware that, thanks to ODF, there is a SIMPLE way to write quick and dirty scripts to save lots of time
  ● it is nothing new, really. It's just that too few people know how cool and easy it is
  ● can make more people love ODF and OO.o
What's next?
● make it easy to use for Windows users: bundle example scripts with Cygwin live/virtual environments
● optimize existing scripts? Probably not worth it.
● Write simple GUIs for them? Hmm... what do you think?
● Prove its potential for schools
● (for me) figure out how to mix whole (groups of) ODP slides from existing ODP slideshows
Conclusion
● Resources
  ● the "OASIS OpenDocument Essentials" book http://books.evc-cit.info/
  ● My ODF Scripting pages (which will also gladly host 3rd party recipes!) http://freesoftware.zona-m.net/odf-scripting
● Questions?
● Contact info: [email protected] or http://mfioretti.com
Thanks!