Aussie Government Gives PDF the Thumbs Down 179

Posted by timothy
from the how-was-this-study-published? dept.
littlekorea writes "The central IT office of the Australian Government has advised its agencies to offer alternatives to Adobe's Portable Document Format to ensure folks with impaired vision are able to consume information on the Web. A Government-funded study found that PDFs can present themselves as image-only files to screen readers, rendering the information contained within them unreadable for the vision impaired."

  • by Tumbleweed (3706) * on Wednesday December 01, 2010 @02:49AM (#34401134)

    A thumbs down in the southern hemisphere is the same as a thumbs up in the northern hemisphere, as long as you name the file bruce.pdf. It saves confusion.

  • Plain text? (Score:3, Interesting)

    by inflex (123318) on Wednesday December 01, 2010 @02:51AM (#34401152) Homepage Journal

    Other than plain text, are there really many alternatives that don't bring their own levels of difficulty? The only other options I can see out there at the moment are ePub, simplified HTML or RTF, but of course they all fall short of the possibly desired 'fancy formatting'.

    As someone will likely also mention, why not just mandate that the PDF contents are actually text, as opposed to images (which are annoying to anyone!)?

    • by robbak (775424) on Wednesday December 01, 2010 @03:27AM (#34401366) Homepage

      And the rest of us say "Get rid of it". We do not access government documents to be blown away by their totally rad page style. We access them for information, and extracting the information from the glumph that encases it is sometimes hard for the best of us.

      html all the way. Any formatting you cannot fit in a simple stylesheet can get left out.

      • Re: (Score:2, Insightful)

        by drinkypoo (153816)

        Congratulations, you have just declared that you do not wish to be able to download forms over the internet.

        • by Abcd1234 (188840)

          Congratulations, you have just declared that you do not wish to be able to download forms over the internet.

          Woah woah! When did it become impossible to print HTML documents???

          • Re: (Score:3, Insightful)

            Most contracts and many forms require rendering with specific type sizes, specific layouts etc. That isn't currently possible with CSS / HTML, which is why PDF is such an important format to many industries where legal compliance with a national agency, standards body, regulated industry body, or governmental standard is necessary.
            • by fyngyrz (762201)

              Most contracts and many forms require rendering with specific type sizes, specific layouts etc.

              Instead of just stating this, you should be asking, "Why?" Because there is no good reason for it. If you need to reference a particular section, use section numbers. If you need to reference a particular sentence, then you could even number the sentences if you truly believe they are uncountable otherwise. Likewise the illustrations, tables, etc. You see, as it turns out, there is no good reason for rigid do

              • In the insurance industry, contracts and forms must be approved through various agencies. There are laws regarding minimum font sizes and each form must be re-approved each time it is changed. This is done because of regulatory controls and laws.

                The insurance industry is not alone in this regard. In the legal sector, specific page and font formatting is required for many court documents as well as correspondence. You need it to appear as intended, not changed based on the reader's capabilities.

                There are m
              • by drinkypoo (153816)

                Most contracts and many forms require rendering with specific type sizes, specific layouts etc.

                Instead of just stating this, you should be asking, "Why?" Because there is no good reason for it.

                Once you understand that there's no reason to ever have anyone fill out a form for anything ever but greed then questions like this become a big wankoff jerkfest of mental masturbation. The simple truth is that this kind of requirement exists, and software like Adobe Acrobat and formats like PDF have cropped up to fill it. If you want to go live in a dumpster someplace and let someone else meet project requirements, that's cool.

  • by arivanov (12034) on Wednesday December 01, 2010 @02:51AM (#34401156) Homepage

    That is the case with badly done PDFs where pages are rendered as images. PDFs done via the office plugin or OpenOffice or any other proper authoring package at the default settings have the text present and the fonts embedded instead, so they should work fine as far as accessibility goes.

    How about enforcing some computer literacy on document publishers instead?

    • by robbak (775424) on Wednesday December 01, 2010 @03:31AM (#34401386) Homepage

      Not necessarily. PDF does not preserve text flow. It breaks up paragraphs into lines (or less if kerning has been altered), and places them accurately on the page. If you have a multi-column layout, then a pdf-to-text algorithm (first step in screen reading) is likely to put column-2-line-1 between column-1-lines-{1 and 2}. Best of luck sorting that out.
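      A minimal sketch of the interleaving described above, assuming a naive extractor that only sees positioned line fragments and sorts them top-to-bottom, then left-to-right. The coordinates, fragment texts, and the function name `naive_reading_order` are made up for illustration; no real PDF library is involved.

```python
# Hypothetical line fragments as (x, y, text); y grows downward.
# Two columns: one at x=50, one at x=300. All values are invented.
fragments = [
    (50, 100, "col1-line1"),
    (50, 120, "col1-line2"),
    (300, 100, "col2-line1"),
    (300, 120, "col2-line2"),
]


def naive_reading_order(frags):
    """Sort by vertical position first, then horizontal, ignoring columns."""
    return [text for _, _, text in sorted(frags, key=lambda f: (f[1], f[0]))]


if __name__ == "__main__":
    # col2-line1 ends up between col1-line1 and col1-line2.
    print(naive_reading_order(fragments))
```

      With only positions to go on, the second column's first line sorts ahead of the first column's second line, which is the ordering problem a screen reader would inherit.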

      • by peppepz (1311345) on Wednesday December 01, 2010 @08:22AM (#34402792)

        Not necessarily. PDF does not preserve text flow. It breaks up paragraphs into lines (or less if kerning has been altered), and places them accurately on the page.

        This is not true. PDF is capable of preserving text flow if the document contains such information. See this as an example [hoboes.com]: if you open it in acrobat reader and move the text cursor using the down arrow, you'll see it travel correctly among columns and paragraphs.
        No page description format will help if the page has been generated in a broken way: for instance, try extracting text from the tables of an html page generated by javascript.

        If you have a multi-column layout, then a pdf-to-text algorithm (first step in screen reading) is likely to put column-2-line-1 between column-1-lines-{1 and 2}. Best of luck sorting that out.

        In this case it is the pdf-to-text algorithm that is broken, and it should be fixed.

        • by Taxman415a (863020) on Wednesday December 01, 2010 @10:51AM (#34404006) Homepage Journal

          This is not true. PDF is capable of preserving text flow if the document contains such information.

          Yes, this can be done, but it is almost universally not done. Of all the pdfs out there, almost all of them that have anything but single column text flow incorrectly. The answer is of course to include this information every time, but I don't see how you can mandate that if the standard doesn't include it and most or all current software creates pdfs that don't have it.

          If you have a multi-column layout, then a pdf-to-text algorithm (first step in screen reading) is likely to put column-2-line-1 between column-1-lines-{1 and 2}. Best of luck sorting that out.

          In this case it is the pdf-to-text algorithm that is broken, and it should be fixed.

          I'm not sure that you can always figure out the text flow correctly a posteriori. Once the correct text flow information hasn't been encoded in the document, it's a bit of a crap shoot in some cases to figure out what was intended. Where should that floating box go? Many pdfs have text flow broken up so badly that they appear to read randomly. A few bits from one sentence, then a few words or parts from the middle of another paragraph. Literally the best option for some pdfs is to export them as images and import those to an ocr program.
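          To illustrate why the after-the-fact guess is a crap shoot, here is a rough sketch of one possible heuristic: group fragments into columns by horizontal position, then read each column top-to-bottom. The fragment data, the `column_gap` threshold, and the function name `column_aware_order` are all assumptions for illustration, not how any particular PDF tool works.

```python
def column_aware_order(frags, column_gap=100):
    """Guess reading order for (x, y, text) fragments.

    A fragment starts a new column when its x is more than `column_gap`
    to the right of the current column's left edge.
    """
    columns = []  # list of (left_x, [fragments in that column])
    for frag in sorted(frags, key=lambda f: f[0]):
        if columns and frag[0] - columns[-1][0] <= column_gap:
            columns[-1][1].append(frag)
        else:
            columns.append((frag[0], [frag]))

    ordered = []
    for _, members in columns:
        # Within a column, read top-to-bottom (y grows downward).
        ordered.extend(text for _, _, text in sorted(members, key=lambda f: f[1]))
    return ordered


# Invented layout: two columns plus a floating box between them.
two_columns = [
    (50, 100, "col1-line1"),
    (300, 100, "col2-line1"),
    (50, 120, "col1-line2"),
    (300, 120, "col2-line2"),
]
with_floating_box = two_columns + [(180, 110, "floating-box")]

if __name__ == "__main__":
    print(column_aware_order(two_columns))
    print(column_aware_order(with_floating_box))
```

          The clean two-column case comes out in order, but the floating box is simply treated as a third column and read between the other two. Nothing in the positions says whether that is what the author intended.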

      • PDF supports streams, which -- in the context of text as opposed to audiovisual or other binary streams -- can be individual lines of text or entire paragraphs / columns / pages. The fact that a stream is usually a line of content is a problem in the PDF generation software, not the format per se.
  • Possibly not a bad thing given the vast number of security flaws and exploits that PDF has been hit with, especially over the last few years.

  • I really like PDF's ability to retain the font and display of the document without worrying about fonts and the application.
    Since I have to distribute documents that are read on a variety of systems, including Linux, OSX, iPhone/Pad and Windows, PDF really beats all other alternatives in compatibility.

    Adobe should really work on creating a text/image-only version of PDF without their fancy password protecting features and what-not.
    If they don't, perhaps an open source group can take on the challenge.

    • Many applications can already export directly to PDF on exactly the terms you've described, and there are things like CutePDF [cutepdf.com] that will allow you to "print" from any application to a PDF file with a couple of clicks under Windows. On Mac OS X and Linux platforms, you can typically just save any document as a PDF file, at least from most native apps. The capabilities you're describing are already in place, and there's no need to worry about strictly text and image-based docs you've created falling prey to an
  • by whoever57 (658626) on Wednesday December 01, 2010 @03:11AM (#34401270) Journal
    Look at this page [fremontpolice.org]. It's for a local police department in a city that has lots of blind people because of the presence of the California School for the Blind. This is the first page that Google lists for the site. I can't imagine that a screen reader can make anything of the front page and there are no navigation buttons.
    • by Yvan256 (722131)

      No navigation buttons? It's worse than that. Without plug-ins, all you get is a gradient in the background of an otherwise empty page.

    • Re: (Score:2, Funny)

      by noidentity (188756)
      Nonsense! When I visit that site, I see a HUGE button and some normal, selectable text ("Click here to get the plug-in"). A screenreader would do fine with that. Oh, wait...
  • What format (Score:2, Insightful)

    by bigtreeman (565428)

    Missing from the statement is what the preferred format is.

    I would expect a Microsoft format from our illustrious leaders.

    Reads like a fairly dumb statement which is what I always
    expect from our government.

    Sounds like a lead up to them locking themselves (us) into
    using a proprietary, expensive, unusable system.

    Who , me , negative ,
    yep

    • by headLITE (171240)

      Of course a Word document is better suited. So is anything else that preserves the text itself, as opposed to preserving its rendered form. HTML is pretty good for this too. With PDF it can be hard to even figure out where the next word in a sentence is. It doesn't have anything to do with proprietary or not, there are enough free or open formats that work, it's just that PDF is not one of them.

    • Re:What format (Score:4, Insightful)

      by Daniel Dvorkin (106857) on Wednesday December 01, 2010 @04:34AM (#34401710) Homepage Journal

      I would expect a Microsoft format from our illustrious leaders.

      Bingo. Anyone who doesn't see Microsoft's hand in this is hopelessly naive.

  • WTF? (Score:4, Funny)

    by zmollusc (763634) on Wednesday December 01, 2010 @04:31AM (#34401692)

    What does it matter that they can't read the text? PDFs aren't about content, they are about preserving the layout. At least that is what it seems like to me when I am foolish enough to try and read PDFs on a device with a different number of pixels than the person who made the PDF file.
    If the content matters at all, someone should invent a technology that allows text to be tagged somehow with indicators of the MEANING of that portion of text, like 'this is a title', and let the display device render the text according to how the reader can best view it. It sounds crazy, and it may take a few decades to do, but think of the benefits.
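    The technology being "proposed" here is, of course, roughly what markup like HTML already does. A minimal sketch of the idea, assuming a made-up document of `(role, text)` pairs and two hypothetical renderers (`render_for_speech` and `render_for_screen`), neither of which is a real screen-reader API:

```python
import textwrap

# Invented content tagged by meaning rather than by position on a page.
document = [
    ("title", "Accessible publishing"),
    ("paragraph", "Text tagged by meaning can be rendered to suit the reader."),
]


def render_for_speech(doc):
    """Produce lines for a speech device: announce titles, read the rest."""
    lines = []
    for role, text in doc:
        if role == "title":
            lines.append(f"Heading: {text}")
        else:
            lines.append(text)
    return lines


def render_for_screen(doc, width):
    """Produce lines for a display of `width` characters, rewrapping paragraphs."""
    lines = []
    for role, text in doc:
        if role == "title":
            lines.append(text.upper())
        else:
            lines.extend(textwrap.wrap(text, width=width))
    return lines


if __name__ == "__main__":
    print("\n".join(render_for_speech(document)))
    print("\n".join(render_for_screen(document, width=20)))
```

    The same tagged content reflows to a 20-character display or a spoken rendering without the author having fixed a page size in advance.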

    • If the content matters at all, someone should invent a technology that allows text to be tagged somehow with indicators of the MEANING of that portion of text, like 'this is a title', and let the display device render the text according to how the reader can best view it. It sounds crazy, and it may take a few decades to do, but think of the benefits.

      Yes, everyone, this is possibly the richest seam of sarcasm ever discovered on /.

      • by Bigjeff5 (1143585)

        Rich my ass, it's a mediocre jab with an unclear target followed by a solution the target is almost certainly considering.

        Is he making fun of Adobe? Or Australia? If it's Adobe, the initial jab is not sarcasm, it's simply accurate. If the target is the Aussie govt., it's not effectively making fun of their decision, because they are doing exactly what the sarcastic remark suggests before the remark is made, and thus the remark makes no sense at all.

        If by some chance he's actually mocking the decision to

    • by Bigjeff5 (1143585)

      What does it matter that they can't read the text? PDFs aren't about content, they are about preserving the layout.

      Just a guess here, but that is probably the exact reason they don't want government agencies to use PDFs for all their forms.

    • by afidel (530433)
      Selecting Text when viewing a PDF on my Blackberry works ~90% of the time though complex tables like a bill of sale can get munged.
  • The Aussie government failed to recommend a standard that supplants PDF in such a way that it handles all the cases one would expect to handle. So what's the point of this exercise that the OZ gov't did, other than basically to say without words... 'we should publish everything in XML documents since at least those can be parsed to some degree'?

    You know, there should be an industry-standard sheet of paper (Letter/AF) that meets the JAWS difficulty test, much in the same way there are test HTML pages that test w

  • Well, now here's a rich story. A story about lack of accessibility...on Slashdot. Surely this site is highly qualified to criticize others.
  • My reading of this is not so much that there is something inherently wrong with the PDF format itself, but rather with how it is used. If you are a government agency, producing documents for public consumption, you better know how the hell to produce a PDF with searchable, readable text, and not sequester it to image-only. If you can't get that single concept into your head, it won't matter what fucking format you use. You would think bureaucrats, with their stickler for regulation and procedure, would be
    • by ChipMonk (711367)

      you would think bureaucrats, with their stickler for regulation and procedure, would be able to understand that not every PDF is created equal

      You answer your own implied question: being sticklers for regulation and procedure means they don't have to, you know, think about what they're doing.

  • by Bert64 (520050) <bert@noSPam.slashdot.firenzee.com> on Wednesday December 01, 2010 @08:40AM (#34402884) Homepage

    So basically they are saying that *because* it is possible to produce a shoddy PDF file which is basically an image dump, that this is reason enough not to use the format?
    By this same reckoning, you could produce a really shoddy HTML page which also consists of images and no text... Virtually any format could be misused in this way.

    So what's the alternative? That we all revert back to ASCII text since it's incapable of holding graphics?

    Personally I hate seeing poorly designed websites or pdf files as I described here, where the text is actually an embedded image (or worse - a flash file) and there is no clickable index etc.
    We should probably start naming and shaming pdf creation software, and those who use (or misuse) such tools.

    • by JohnFen (1641097)

      So basically they are saying that *because* it is possible to produce a shoddy PDF file which is basically an image dump, that this is reason enough not to use the format?

      I think it's more a case of saying that PDFs shouldn't be used inappropriately. If you're producing something which really has to be viewed and/or printed in a visually consistent way analogous to a magazine page, it's hard to beat PDFs. If you're producing something that is to be used in any other way, PDFs blow.

      This has long been my beef with PDFs, this inappropriate use. If the document is intended as a reference, or is text-heavy and intended to be read more than viewed, PDFs are the second-worst choice

      • by Bigjeff5 (1143585)

        Exactly, it's a presentation format - it should be used for presentations.

        It shouldn't be used for documents that need to be used for anything other than presentation.

        It would be nice if PDF were a more all-around document format, but it wasn't designed that way and changing that is difficult at best.

    • by splerdu (187709)

      Ascii art to the rescue!

  • That's the one they choose? It wasn't the gaping security holes, the incessant patch requests (that are never even 6 steps behind the security holes) or the laborious installation/upgrade process? I'm sorry, I know blind people have it tough on the internet, but this is really the dumbest of the reasons I could imagine you would switch away from a nearly universally accepted format.
