A National Archive Moves to ODF
Andy Updegrove writes "The National Archives of Australia (NAA) has announced that it will move its digital archives program to OpenOffice 2.0, an open source implementation of ODF. Unlike Massachusetts or the City of Bristol (which announced it would convert to save on total cost of ownership), the NAA will deal almost exclusively with documents created elsewhere in multiple formats. As a result, it provides a "worst possible case" for testing the practicality of using ODF in a still largely non-ODF world. If successful, the NAA example would therefore demonstrate that the use of ODF is reasonable and feasible in more normal situations, where the percentage of documentation that is created and used internally is much larger."
Get some PRIORITIES! (Score:-1, Offtopic)
Beginning of the Revolution! (Score:2, Insightful)
Re:Beginning of the Revolution! (Score:2)
Re:Beginning of the Revolution! (Score:1)
Even though they have a stack of cash, the change is happening quite quickly, and sending people out to talk to governments, businesses etc around the world costs time and money. For small businesses, it's only worth it so that they don't become poster boys for others. But all efforts so far are not stopping the interest in it.
I know some non-geeky business guys using OOo. One is an IT project consultant, one is a Financial Advisor, and the other is a writer. These are certainly not "compile a distro" guys.
Re:Beginning of the Revolution! (Score:2)
For a more on-topic note, I'm not sure why an office format would be the best thing to use for archives of final documents; why not use something like pdf? Readers are widely available, it will always produce the same results when printed, and it's been around for a while. Plus it's very straightforward to produce a pdf from absolutely any document that can be printed on at least Windows and Unix-like machines (in fact I bet even weird computers like Macs, Be-Boxes and NeXT cubes can produce pdfs from any print output with a bit of prodding).
Re:Beginning of the Revolution! (Score:3, Insightful)
ODF is an electronic document format, not an "office" format, whatever that means. Its advantage in this context is that any document in ODF can be disassembled into its component parts easily. Text, images and formatting can all be extracted and used separately if needed. PDFs are hard to convert back to the raw data.
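To make the "disassembled into its component parts" point concrete, here is a minimal sketch in Python. It assumes only what the format defines: an ODF file is a zip package whose body lives in content.xml. The function names, the in-memory sample document, and the assumption that pictures sit under a Pictures/ folder (the layout OpenOffice.org conventionally writes) are illustrative, and paragraphs nested inside frames would be reported twice by this simple walk.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Tiny hand-written content.xml, used only to make the sketch runnable.
SAMPLE_CONTENT = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<office:document-content '
    'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
    'xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    '<office:body><office:text>'
    '<text:h text:outline-level="1">Report</text:h>'
    '<text:p>First <text:span>paragraph</text:span>.</text:p>'
    '</office:text></office:body></office:document-content>'
)


def make_sample_odt():
    """Build a minimal ODF-like package in memory (hypothetical sample)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", SAMPLE_CONTENT)
        package.writestr("Pictures/logo.png", b"not a real image")
    buf.seek(0)
    return buf


def extract_paragraphs(odf_file):
    """Return the text of every heading and paragraph in content.xml."""
    with zipfile.ZipFile(odf_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    wanted = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}
    return ["".join(el.itertext()) for el in root.iter() if el.tag in wanted]


def list_images(odf_file):
    """Return package paths of embedded pictures (assumes a Pictures/ folder)."""
    with zipfile.ZipFile(odf_file) as package:
        return [n for n in package.namelist() if n.startswith("Pictures/")]


if __name__ == "__main__":
    sample = make_sample_odt()
    print(extract_paragraphs(sample))
    print(list_images(sample))
```

Formatting lives in the same package (styles.xml and the automatic styles inside content.xml), so it can be pulled out the same way.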
Re:Beginning of the Revolution! (Score:2)
Of course, you can extract text and images from a pdf; just because Adobe doesn't include the functionality in its reader (a totally artificial restriction that eases the minds of people creating PDFs that they don't want text copied from) doesn't mean it can't be done. Google "pdf extract text" if you don't believe me. Many pdfs even have structural information embedded in them (so you can view a document index and select a section of the document to read, which is really useful for technical specs). Of course, only quality-made pdfs have this, just as quality-made ODFs would.
Although ODF is an XML format, the documents may not be created in a way that takes advantage of that to provide any more-structured information than a flat page of text.
Now if I'm going to have to deal with files in an intermediate format, I'd hope it would be an open and well thought-out standard like ODF. But for final documents that will not have to be re-edited, a "final" or "print" format is the best choice in my opinion (I am not an archival expert).
Re:Beginning of the Revolution! (Score:2)
Yes, but not as straight-forwardly as from a word-processing document. Sometimes the font subsetting makes copying text problematic (uncommon characters come out as a blank when copied). And there is no distinction between line wraps and deliberate line breaks, "real" or soft hyphens, and similar classes of information are obfuscated simply because they're not important to just viewing or printing.
I'm sure the Archive is looking to allow useful searching of the files, which again is possible with PDF (Google does it), but is much easier and more reliable with a text-based format.
Re:Beginning of the Revolution! (Score:3, Informative)
They had 150 year old documents going back to wherever but they had trouble reading 25 year old floppy disks in weird formats and converting them to the raw text-only format they used back then.
If they standardize on an XML based format like the ODF ones and convert all of their old stuff to this it will make archiving the current documents much easier. It may even in a few years prod the Australian government to standardise on a product that saves to ODF...
First (Score:1, Interesting)
Re:First (Score:-1, Offtopic)
Re:First (Score:5, Informative)
Quality control? (Score:0)
more than that (Score:2)
First, open it in the original app. Use "Save As" to export the file in every possible way. (txt, rtf, ps, pdf, html...)
Second, open the original in OpenOffice 2. Do as above, for every format that OpenOffice can create.
Third, open the original in KWrite...
When done, save the data on many different types of media. Be sure to use long-term-stable storage formats like GNU tar with GNU zip. Be sure to choose media from different manufacturers. Store the data at several different sites, preferably on opposite sides of the Earth.
As the years go by, spot check the data for errors. Keep statistics. If you find that a particular type of media is failing, make new copies.
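The spot-checking step could be sketched roughly like this in Python. This is an illustration under stated assumptions, not anyone's actual archival procedure: the function names are hypothetical, SHA-256 is simply one reasonable checksum, and the manifest is a plain dict that you would store separately from the media being checked.

```python
import hashlib
import random
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def spot_check(root, manifest, sample_size, rng=random):
    """Re-hash a random sample of files; return the names that fail."""
    root = Path(root)
    names = sorted(manifest)
    chosen = rng.sample(names, min(sample_size, len(names)))
    failures = []
    for name in chosen:
        target = root / name
        # A missing file counts as a failure, just like a changed one.
        if not target.is_file() or sha256_of(target) != manifest[name]:
            failures.append(name)
    return sorted(failures)
```

Run build_manifest once when the copy is made, then spot_check against each copy on a schedule. Counting failures per media type gives you the statistics mentioned above.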
Re:more than that (Score:1)
Re:more than that (Score:2)
got a better way? (Score:2)
If I had to convert lots of documents I'd write a command-line tool... which ships the file off to a foreign land where I hire poor people to click "Save As..." all day long.
I'd also use other tools, each time converting from the original document. I can then be fairly sure that at least one of the documents, original or converted, will be readable far into the future.
Re:got a better way? (Score:2)
You couldn't be more wrong. MSword is an ActiveX control that Microsoft Word presents a GUI for.
You can script it in VBA or C++ or Python or whatever has bindings for COM in Windows.
Antiword is the best tool I have found for exporting Word docs into plain text.
Using MS Office formats would be much worse (Score:1, Interesting)
With the open, fully-documented ODF formats, any problems down the road can be analyzed, and corrected, but with the secret, proprietary MS Office formats, when a problem occurs, you're stuck!
Thus, if you store your documents in MS Office formats, it means that you have to re-examine your entire archive, every time you update your MS Office software, or add a patch release.
OpenOffice.org 2.0? (Score:0, Flamebait)
Wow, OOo 2.0 supports ODF? That's great news, I've been using one of the other myriad programs* that support ODF.
*Note: Said software doesn't exist.
Doesn't exist? (Score:2, Informative)
Wrong (Score:0)
Re: "other myriad programs*" (Score:-1, Flamebait)
(Pure FUD... are you proudly clueless, or are you 0wn3d by criminal monopolists? If you're only a TROLL, please get off the computer and go eat some donuts.)
Re:OpenOffice.org 2.0? (Score:5, Informative)
Note: Said software doesn't exist.
Get with the times. That hasn't been true for a while. The current list includes: Abiword 2.4, eZ publish, IBM Workplace Documents 2.6+, KWord 1.4+, NeoOffice 1.2 Writer, OpenOffice.org Writer, Scribus 1.2.2+ , StarOffice 8 Writer, TEA text editor , TextMaker 2005, Visioo Writer 0.6, and Writely for the word processor portion of the format, with similar lists for the other components. There are a lot more that have announced support on the way.
Re:OpenOffice.org 2.0? (Score:3, Informative)
Re:OpenOffice.org 2.0? (Score:2)
On the contrary... (Score:-1, Offtopic)
Re:On the contrary... (Score:0)
Re:On the contrary... (Score:2)
I have just sent a letter to Episoft's Chief Software Architect and offered him venture capital. I tell ya, this is all going to make us richer than the Sultan of Bahrain.
Re:OpenOffice.org 2.0? (Score:1)
Get with the times. That hasn't been true for a while. The current list [of apps supporting ODF] includes:
Besides OpenOffice.org and its commercial distribution called StarOffice, which apps on the list [wikipedia.org] 1. run on Microsoft Windows operating systems (so that they don't require re-buying hardware) and 2. are promoted in print or on television across North America or across Europe?
Re:OpenOffice.org 2.0? (Score:2)
I don't get it - since when did you have to re-buy hardware to slap a new OS on it?
Re:OpenOffice.org 2.0? (Score:5, Informative)
ODT
ODS
ODP
Same note on KPresenter as on KWord
ODG
Driver issues; marketing (Score:2, Insightful)
Those which run under Linux probably wouldn't require new hardware either.
Find me a Linux driver for my paid-for yet unsupported [sane-project.org] Microtek Scanmaker 4850 flatbed scanner, which was purchased long before I thought of switching this computer to Linux, and I'll believe you. Unless you are working with a computer that was built from the ground up for Linux, including buying a printed copy of a distribution's hardware compatibility list to carry with you to the computer store, I am 90 percent sure that you will have issues with at least one piece of hardware if you switch a computer from Windows XP to a common Linux distribution.
And what about vertical-market proprietary software intended to run on the same computer, which is either available only for Windows or (if you're lucky) available for multiple platforms but priced such that using multiple platform versions in an organization is cost prohibitive? You would have to use Wine (significant overhead and less than full compatibility) to run your existing licensed software for Windows on a Linux box.
What does [promotion in traditional media read by management] have to do with anything?
It's the same reason most listeners prefer payola'd major label music to independent music: repeated exposure builds familiarity.
I have seen relatively few MS Office, OO.o, or Corel WordPerfect ads either.
Which magazines and which TV channels are you looking at? In the news magazines and cable news channels, I see a whole bunch of advertisements for Microsoft Office software.
People giving away software usually don't spend money to ensure you'll take it from them.
Then why doesn't Sun advertise its StarOffice software, the official commercial distribution of OpenOffice.org? Or by "giving away software" do you also mean "we're practically giving it away", that is, budget software?
Re:Driver issues; marketing (Score:1, Interesting)
This being said, there are OO.o bus ads [linux-watch.com], and I'm sure they've done ads in trade publications as well.
Re:Driver issues; marketing (Score:0)
Re:Driver issues; marketing (Score:1)
Why the hell would anyone pay money for an HCL? Most are freely available. Print it.
Unless you're buying a printer. In addition, I intended "buying a printed copy" to include the price of ink, paper, printer wear and tear, and (in the case of buying a printer) FedEx Kinko's markup on the above.
Or buy a desktop from a vendor that sells Linux desktops
Where can I find one in Fort Wayne, Indiana? I'm unfamiliar with which keywords I would use to find local PC shops whose x86 offerings aren't Windows-only. Your Mac analogy holds less than 100% because I can always look in the Yellow Pages for an Apple reseller logo.
Re:Driver issues; marketing (Score:1, Informative)
Re:Driver issues; marketing (Score:1)
Since you're such a cheap bastard, you could try memorizing the HCL. Of course, even that isn't free, especially if you pay for bandwidth by the byte, as I presume someone so miserly would.
This series of posts has to be the dumbest troll I've seen in weeks.
Re:Driver issues; marketing (Score:2, Funny)
Or, better, bring a Linux LiveCD & actually try it.
Oh, that sounds like fun. I'm going to take a Knoppix DVD with me the next time I go to Best Buy. Just imagine how the salespeople will freak out when they think I've reinstalled the OS on one of their display machines -- and then I can watch their heads explode as they try to get their minds around the idea of a LiveCD.
Re:Driver issues; marketing (Score:2)
Re:OpenOffice.org 2.0? (Score:4, Informative)
Some highlights according to wikipedia:
Plus StarOffice (maybe that's cheating), and IBM Workplace Documents (never used it)
Re:OpenOffice.org 2.0? (Score:3, Informative)
KOffice?
Not that that makes it a myriad, but there are also a few lesser-known programs that do, and I would guess that many others will implement support for it soon. AbiWord didn't last time I checked, but they did support SXW (StarOffice/OpenOffice.org Writer 1.x format), so it wouldn't surprise me to see them implement ODT. Actually ... oops ... I lied, looks like it does now: http://en.wikipedia.org/wiki/List_of_applications_supporting_OpenDocument [wikipedia.org]
Anyway, the OpenDocument Alliance also has a lot of companies behind it, among them IBM and, of course, Google. So it seems to be a pretty strong format to me, even if that one company from Redmond (what's their name again?) isn't particularly interested right now...
Worst Possible Case? (Score:2, Interesting)
Wouldn't this sort of test be a more or less good test case for switching to ODF and dealing with non-ODF outside documents? Maybe I just misunderstood the comment.
Re:Worst Possible Case? (Score:1)
Re:Worst Possible Case? (Score:3, Interesting)
Re:Worst Possible Case? (Score:5, Interesting)
What I think they meant to convey is that this will be a worst-case scenario they can use for testing the practicality of using ODF in a non-ODF world.
But I don't actually think so...
Rather, I think this will be great for ODF: the NAA will have to produce heaps of conversion software to convert many formats to ODF, but because they are an archiving operation, they won't ever have to convert back. Instead, I imagine that the common document format for outgoing files from the archive will most likely be PDF...
This scenario won't test the ability for ODF in collaborative work among entities, something that I would see as the worst case scenario needed to test the practicality of using this format.
Having said all of that - to hell with everyone else - I have been using non Microsoft formats (first Star Office formats and now ODF) for five years now and rarely come across a problem. Then again, I am a simple user so I wouldn't expect too much grief. From my experience advising other people I can see that the true hurdle is not the file format, rather the application. Word and Excel are automated from so much business and scientific software that people just expect the results of their query or analysis to be dumped directly into their spreadsheet or word processor. So until Quicken or MYOB support something other than MS software, or until alternative software is produced that does, business will largely use MS.
On the other hand I strongly recommend to people to use OOo at home and with the ever increasing compatibility that OOo has with MS formats, this is not a bad option.
Quicken already on OS X (Score:1)
So there's nothing in that regard keeping small businesses on Windows, unless they happen to like the extra maintenance.
It's also useful for recovering corrupted MS Office files, which you will get eventually. One thing that people tend to forget is that you can install OOo alongside MS Office or anything else you may already have. The 'rip and replace' theme is just a bunch of scaremongering from Redmond. Having both means you can swap between them as you like or even just keep one in reserve in case of problems.
OpenDocument is definitely the way to go, especially for spreadsheets. Being a zipped XML file means that you can massage large data sets alternately with a comfortable GUI or with home-grown perl/python/ruby/whatever scripts.
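As a rough illustration of the home-grown script side, here is a Python sketch that reads the rows of an OpenDocument spreadsheet straight out of the zip package. The element and attribute names come from the OpenDocument table namespace. The function names and the in-memory sample sheet are made up for the example, and the sketch deliberately ignores repeated-column and repeated-row attributes, row groups, and non-numeric typed values such as dates.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
OFFICE_VALUE = "{%s}value" % NS["office"]


def make_sample_ods():
    """Build a two-row spreadsheet package in memory (hypothetical sample)."""
    decls = " ".join('xmlns:%s="%s"' % item for item in NS.items())
    rows = "".join(
        "<table:table-row>"
        '<table:table-cell office:value-type="string">'
        "<text:p>%s</text:p></table:table-cell>"
        '<table:table-cell office:value-type="float" office:value="%s">'
        "<text:p>%s</text:p></table:table-cell>"
        "</table:table-row>" % (label, value, value)
        for label, value in [("apples", "3"), ("pears", "4.5")]
    )
    content = (
        "<office:document-content %s><office:body><office:spreadsheet>"
        '<table:table table:name="Sheet1">%s</table:table>'
        "</office:spreadsheet></office:body></office:document-content>"
        % (decls, rows)
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr("content.xml", content)
    buf.seek(0)
    return buf


def read_rows(ods_file, sheet_index=0):
    """Return one sheet as a list of rows; numeric cells become floats."""
    with zipfile.ZipFile(ods_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    sheet = root.findall(".//table:table", NS)[sheet_index]
    rows = []
    for row in sheet.findall("table:table-row", NS):
        cells = []
        for cell in row.findall("table:table-cell", NS):
            number = cell.get(OFFICE_VALUE)
            if number is not None:
                cells.append(float(number))
            else:
                cells.append("".join(cell.itertext()))
        rows.append(cells)
    return rows


if __name__ == "__main__":
    data = read_rows(make_sample_ods())
    print(data, sum(row[1] for row in data))
```

Writing changes back is the harder half, since the package has to be rebuilt with a valid manifest, so a script like this is best suited to pulling data out for analysis.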
Re:Quicken already on OS X (Score:1)
MS Office is horrible to do the same thing.
Re:Worst Possible Case? (Score:1)
Unfortunately, in that case, OOo didn't cut it. Does anybody know whether this is something that the OOo folks are working on? I wonder if this would have implications for some of the documents that are being imported and converted in the headline story...
Re:Worst Possible Case? (Score:1)
But how will people read these on their xBox360s? (Score:0, Troll)
Bristol, UK? (Score:2)
Re:Bristol, UK? (Score:2, Informative)
Yes, Bristol UK.
Re:Bristol, UK? (Score:3)
Re:Bristol, UK? (Score:2, Informative)
Re:Bristol, UK? (Score:1)
Well if they make enough money through their database work [opensourceacademy.gov.uk] with the Open Source Academy then you might.
Don't hold your breath though.
They may choose to make their website a bit better - but it is better than Cardiff's website which is dire to the extreme - capital of Wales and using front page?!? I resent every penny of my council tax spent on that dross! (Sorry had to get it out of my system :-)
Novell has great success with this (Score:5, Informative)
OOo shows how bad Java can be (Score:2, Funny)
Re:OOo shows how bad Java can be (Score:-1, Redundant)
Re:OOo shows how bad Java can be (Score:2)
Re:OOo shows how bad Java can be (Score:2, Informative)
Re:OOo shows how bad Java can be (Score:5, Informative)
No it isn't. I just ran OpenOffice writer V2.0 and checked my task list. No java was running at all!
OOo uses Java for some functions, but it is not "largely implemented using a Java VM-based" anything.
http://en.wikipedia.org/wiki/OpenOffice.org#Java_
OpenOffice is mostly a C++ or C program.
I have not run a profiler on OOo so I can not tell you 100% what makes OO slower than Office but I would guess that part of it is the XML format that OO uses.
Just from my own experience I have found that you can write a fast XML parser and you can write a "safe" XML parser. But a fast, safe XML parser is very hard.
Re:OOo shows how bad Java can be (Score:2)
Yep. Speed is the main advantage of binary formats (since we now compress the textual ones). But I'd like to add that I've been using OOo for a few months now without ever needing to enable the Java functionality (you can disable Java in the configuration window). Almost all of it is written in other languages.
Bingo! (Score:0)
Ha! Got you.
OOo is slow because it's still largely implemented using C++ with all that entails.
I just wanted to get you to admit that.
Re:Bingo! (Score:3, Insightful)
There is certainly no reason to believe it is slow BECAUSE C++ was used. One can write a slow app in any language. It is just a bit easier to do in an interpreted language like Java than in a compiled language like C++.
P.S. Don't tell me that Java compiles to bytecode. That just means that Java compiles to an interpreted language instead of a native language.
Re:Bingo! (Score:1)
Re:Bingo! (Score:2)
Yes, and that is done with a performance penalty like every other interpretation. Even though that is done completely in hardware. However, java is translated to bytecode, the bytecode in turn has to be translated at runtime to X86 machine code, that machine code in turn STILL has to be interpreted yet again by the chip itself to native opcodes. The X86 translation occurs with every language including ASM so it really is not relevant to the conversation at hand.
"The importance of "interpreted" versus "compiled" stopped being relevant some time ago."
It will stop being relevant when it stops carrying a performance overhead, this will occur sometime in the ballpark of never.
"The main issue now is just choosing the trade-off between speed and correctness (java VMs can be made reasonably faster if you turn off some of the safety checks; I'm not sure if that tuneability exists outside of the embedded VM implementations, though)."
Within the VM I am sure that is true. In reality if performance is an issue an interpreted language is generally not the correct choice. Java is however, one of the better performing interpreted languages.
Re:OOo shows how bad Java can be (Score:3, Funny)
Actually, OOo is so slow because they don't use a widget set. The display is hand-drawn by a bunch of monks in Germany, because the project started before Qt or Gtk.
Re:OOo shows how bad Java can be (Score:2)
I have never noticed OO.org was slow except when saving or loading a file.
I have worked with XML parsers and find them really slow when dealing with a file with a few hundred thousand elements.
Re:OOo shows how bad Java can be (Score:2)
Re:OOo shows how bad Java can be (Score:2)
Re:OOo shows how bad Java can be (Score:2)
Think of it as a gateway drug. Once you have them hooked on Firefox you get them to try OOo. Next thing you know you've got them hooked and they're installing Gentoo. After that the only way to get them back on Windows is a 12 step program.
Re:OOo shows how bad Java can be (Score:3, Interesting)
So does increasing the memory settings.
However it still takes about 3 or 4 seconds to start up on my desktop. As far as I remember from when I still used Windows this is not all that different from MS Office on XP on similar hardware. Does any one else who has done the same tweaks differ?
However Abiword or Lyx starts instantly. I mostly use Lyx (which I find more productive) and Gnumeric (faster, with some nice features) rather than OO.
Re:OOo shows how bad Java can be (Score:0)
Re:OOo shows how bad Java can be (Score:2)
Try saving as OOo format and then reopening. It's just as fast as MSO.
J.
Re:OOo shows how bad Java can be (Score:2)
If you turn off java, the entire program is tons faster and all you lose are macros.
Re:OOo shows how bad Java can be (Score:2)
My question for OpenOffice guys, why have this turned on by default?
Those who would need it can turn it on but I always thought OpenOffice was a dog because of this.
Maybe a pop-up window or something...
Re:OOo shows how bad Java can be (Score:2)
Good idea! Sounds much more open than this silly ODF format....
Apologies for sarcasm, but even if you're not into the political and social reasons for Open Standards, a closed, pervasive document format is A Bad Thing(TM). And when you get past the poor PR attempts, Office XML is still a closed, soon to be pervasive format. Hence it's A Bad Thing(TM).
Re:OOo shows how bad Java can be (Score:2)
Funny is what people mod when they would like to mod something "So wrong it's not even... funny".
You're either a really crap troll/shill or substantially misinformed about OOo.
Justin.
PS What does hemi-dramatically mean?
WTF? (Score:0)
Ahh, I see. If NAA can use ODF, would they continue and go the route of FOSS? Or should they stay with ODF-only for the time being and then migrate to FOSS from MS? Of course, if they're on BSD, the transition to ODF via FOSS would pose a problem with ASAP implementations unless they've hired IBM to implement their FOSS, ODF, BSD, Linux migration. OTOH, using SAP in conjunction with ODF and FOSS would possibly lead to ...oh, I'm cross-eyed.
Small experience (Score:5, Interesting)
All documents were made with a flavour of Word or another, from word for MacOS 6.0 to the latest (at the time) word XP for windows. As you'd have already guessed, the only word processor able to make sense of all the documents at once was Openoffice.org. Of course, I faced issues (bulleting appearing "funny", for instance), but as I was applying a style I created, that was not a problem as long as the text was there.
No single version of word in my possession was able to open all the documents, some documents even crashing word XP with thunder and lightning.
Re:Small experience (Score:2)
MS Office Macro's and such (Score:1)
Taken (Score:2, Informative)
OK why is the little o included in the name? It's just Open Office. OOo is a website that has OO. I don't get it.
If this Wikipedia article [wikipedia.org] is to be believed, then the name of the web site, project, and product is "OpenOffice.org" because "OpenOffice" was taken.
MS Macros rewrite MS template most work. (Score:0)
MS macros are loaded by Open Office but are remmed out because they contain calls not compatible with Open Office. Star Office has an interface layer.
I.e., buy Star Office, which will use the MS macros, and move the macros across over time: update the VBA (Visual Basic) macros to SBA (Star Basic) macros. Then switch to Open Office, since it runs SBA.
Re: templates (Score:3, Informative)
I don't think they match up to the beauty of (some) MS or Corel templates , but StarOffice has some templates you could steal from I bet. Would those be freely distributable under their license?
Anyway, http://ooextras.sourceforge.net/ [sourceforge.net]
that's the
Re: templates (Score:0, Troll)
Re: templates (Score:1)
Re:MS Office Macro's and such (Score:1)
I've seen people do it, and often they collect the data, which gets pasted to the word doc, printed and saved.
Which means that the data can't be analysed or transformed easily, and it's all over the place.
What you really need is a simple application, which has the functionality to produce a print.
That said, Macros can be done in OpenOffice.org too. But need some manual conversion.
Questions here (Score:5, Informative)
If anybody wants to ask any questions here I'll try and answer.
Re:Questions here (Score:1)
The current versions of Office for OS X can correctly read 5.x files but no open source app I've found so far can. Its file format is different from the Windows version.
12 years' worth backsupport sounds good until you realize the application's 20 years old. Are you going to do what OOo won't?
Old MS Word 5.x for Macintosh files (Score:1)
I was unable to find a bug report on the bug list [openoffice.org] requesting the ability to import those files. Though that may be my inability to use the database. Have you tried filing a description of the problem [openoffice.org] ? If it's not on the list of things to do, it can't be addressed. However, realize that this would mean reverse engineering the old MS formats. MS, despite court orders from courts on both sides of the Atlantic, has not turned over any documentation for its file formats. So it's not a clear cut task.
Re:Questions here (Score:2)
I very much doubt the NAA will do anything that OOo won't. They don't have enough resources.
Re:Questions here (Score:0)
Go SVG! (Score:0)
This could really be the thing that separates content from
implementation in business presentation software.
In my opinion, this is the last area to solve for the computing public to
break free of implementation handcuffs in the desktop and productivity tools
marketplace.
I would be so excited to see an SVG based desktop implementation in pure SVG
(when it matures). I know that Apple used a postscript implementation but this would be free and standardized.
Graphical content could be almost drag and drop onto the workspace!
I say again, Go SVG!
OOo2.0 is just one facet (Score:5, Informative)
Our use of the OpenDocument format will be quite important, but it's only one facet of what we do. The Xena software has been developed with a plugin architecture that lets us use various external helpers to 'normalise' or convert to open formats any data objects in our care. For each data object, we use Xena twice: once to create a base64 encoded copy so that we can embed some metadata with it, and separately to convert it to an open format. Much of the data ends up as XML, while images for example are png or jpg. We're currently investigating open audio formats. Xena is also used to 'present' data objects that it normalises.
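For readers wondering what a base64 encoded copy with embedded metadata might look like in general, here is a small Python sketch. It is emphatically not Xena's actual wrapper format: the element names (record, metadata, content), the choice of a SHA-256 checksum, and both function names are assumptions made up for illustration.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def wrap(data, name, source_format):
    """Wrap raw bytes in an XML record with metadata (hypothetical schema)."""
    record = ET.Element("record")
    meta = ET.SubElement(record, "metadata")
    ET.SubElement(meta, "name").text = name
    ET.SubElement(meta, "source-format").text = source_format
    ET.SubElement(meta, "sha256").text = hashlib.sha256(data).hexdigest()
    content = ET.SubElement(record, "content", encoding="base64")
    content.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(record, encoding="unicode")


def unwrap(xml_text):
    """Recover (name, bytes) from a record, verifying the stored checksum."""
    record = ET.fromstring(xml_text)
    data = base64.b64decode(record.findtext("content"))
    if hashlib.sha256(data).hexdigest() != record.findtext("metadata/sha256"):
        raise ValueError("checksum mismatch: payload was altered or damaged")
    return record.findtext("metadata/name"), data


if __name__ == "__main__":
    wrapped = wrap(b"\xd0\xcf\x11\xe0 original bytes", "letter.doc", "application/msword")
    print(wrapped)
    print(unwrap(wrapped))
```

The point of a wrapper like this is that the original bit stream survives unchanged inside a plain-text container, alongside whatever normalised copy is produced.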
Until now, Xena has made use of OOo 1.1.x for the normalising of office documents into flat XML. Other development priorities have kept the move to OOo2 in the background. I must stress that we have not yet released Xena with OOo2 support, there is more testing to be done and we feel that the release must be accompanied by good user and developer documentation.
The 'current' binary of Xena available at sourceforge is waaaaay out of date and will shortly be replaced by a much sleeker and more intuitive version. For the curious, anonymous cvs is pretty up to date. If you have a java 1.5 sdk and apache ant, check out a pile of modules and go nuts. Anyone who wishes to become involved in the development effort is more than welcome.
For anyone else, keep an eye on http://xena.sourceforge.net/ [sourceforge.net] for the upcoming binary release.
Re:OOo2.0 is just one facet +5 INFORMATIVE (Score:2)
Even PDF is not well-suited for archiving (Score:2)
But at least they get rid of MS Office....
Re:Even PDF is not well-suited for archiving (Score:1)
nonetheless, awesome job NAA!!!