Privacy Security

OCR Software Dev Abbyy Exposes 200,000 Customer Documents (bleepingcomputer.com) 25

A misconfigured MongoDB server belonging to Abbyy, an optical character recognition software developer, allowed public access to customer files. From a report: Independent security researcher Bob Diachenko discovered the database on August 19 hosted on the Amazon Web Services (AWS) cloud platform. It was 142GB in size and it allowed access without the need to log in. The sizeable database included scanned documents of the sensitive kind: contracts, non-disclosure agreements, internal letters, and memos. Included were more than 200,000 files from Abbyy customers who scanned the data and kept it at the ready in the cloud. "Some collection names like 'documentRecognition,' or 'documentXML' hinted that database would be part of a data recognition company infrastructure," Diachenko writes in a blog post today.
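The report does not say how Abbyy's server was actually configured. As a rough illustration of the two settings that usually matter for "access without the need to log in", here is a minimal sketch that audits a mongod configuration supplied as a nested dict. The function name `audit_mongod_config` and the sample values are hypothetical; `net.bindIp`, `net.bindIpAll` and `security.authorization` are the standard mongod configuration option names.

```python
import ipaddress


def audit_mongod_config(cfg):
    """Return a list of warnings for a mongod config given as a nested dict.

    Illustrative sketch only: it checks just the listening addresses and
    whether access control is switched on.
    """
    warnings = []
    net = cfg.get("net", {})
    security = cfg.get("security", {})

    # net.bindIp may hold a comma-separated list of addresses.
    bind_ips = [ip.strip() for ip in str(net.get("bindIp", "")).split(",") if ip.strip()]
    if net.get("bindIpAll"):
        bind_ips.append("0.0.0.0")

    for ip in bind_ips:
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            continue  # hostnames and socket paths are not checked in this sketch
        # 0.0.0.0 means "all interfaces"; a global address is publicly routable.
        if addr.is_unspecified or addr.is_global:
            warnings.append(f"listens on {ip}, which may be reachable from the Internet")

    if security.get("authorization") != "enabled":
        warnings.append("security.authorization is not enabled, so clients can connect without logging in")

    return warnings


# Hypothetical configs for illustration
exposed = {"net": {"bindIp": "0.0.0.0", "port": 27017}, "security": {}}
locked = {"net": {"bindIp": "127.0.0.1"}, "security": {"authorization": "enabled"}}

for name, cfg in (("exposed", exposed), ("locked", locked)):
    print(name, audit_mongod_config(cfg))
```

A check like this only covers the database's own settings. Firewall rules or cloud security groups can still block or expose the port regardless of what the config says.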
This discussion has been archived. No new comments can be posted.

  • Don't bother keeping anything onsite. Thanks, AWS.

  • No surprise here (Score:4, Insightful)

    by imidan ( 559239 ) on Wednesday August 29, 2018 @05:29PM (#57220886)
    I just assume that any online (cloud based or not) OCR or fax bridge site is going to store a copy of my document in an insecure way. I assume that employees of the service will have access to view my document. I haven't thought too much about them exposing my documents to the public, but it's not a huge step from what I already assumed about them. Anyway, the result is that I don't send anything sensitive or with information I wouldn't want publicly known through online OCR or fax. Because it would be crazy to upload my private sensitive documents to randos on the Internet and assume that they'll never be seen.
    • it would be crazy to upload my private sensitive documents to randos on the Internet and assume that they'll never be seen.

      ?? I thought if you uploaded it to the internet you WANTED it to be seen, that was the whole POINT. Otherwise what's it doing up there?

      Oh, you want security? Keep it directly under your control then and watch it. Better yet, encrypt it at rest and watch out for temp files and bad janitors and evil maids.

      (I thought that a bad janitor forgot to empty the trash and an evil maid put the horse head in the bed in Godfather. Live and learn.)

  • scanned documents of the sensitive kind: contracts, non-disclosure agreements, internal letters, and memos

    I have absolutely zero pity for the companies/people who uploaded such data to Abbyy's servers. They knew perfectly well what they were doing. You don't store private data unencrypted in the cloud unless you want to share it with the entire world.

  • the DB itself was on the web? and not under some kind of proxy?

  • by ffkom ( 3519199 ) on Wednesday August 29, 2018 @06:26PM (#57221138)
    Notice how those who decided to have the scanning process outsourced to "somewhere in the cloud" will consider this a confirmation of their success. Now all the blame is assigned to Abbyy, and no blame is assigned to them - exactly what they wanted to achieve, nothing less.
  • Couldn't a bunch of AWS customers band together to hire a security researcher to check their permissions? Or even Amazon itself on behalf of their clients?

    Granted, there are issues of what companies want public and what they want private. I'm guessing anything bigger than a gig might trigger a warning, as would anything with personal data.

    Then again, I've never used the cloud for anything more than transferring stuff from my phone to my PC, or vice versa, and have never used AWS. So I have no real
    • Why doesn't Amazon have a message that says "Your data is not protected, are you sure you want to do this?"
    • That's the crazy thing - AWS has the concept of a "VPC", and it has the concept of "public" and "private" subnets inside your VPC. If you put a VM in "private", it won't get an internet IP, and so instantly becomes inaccessible to the Internet. You don't need any fancy reviews or certifications for that - just a modicum of common sense. Hell, even if they'd used their app server as a jumpbox to get to their Mongo server, that would have been better than this.

      This wasn't an issue of an "incorrectly configure

  • My basic assumption for anything being OCR'd effectively free in the cloud through a software provider is that it's not safe. Could be sloppiness (as in this case), could be automated OCR+human verification.

    While I actually do have a couple of Abbyy programs installed (FineScanner Pro and Business Card Reader Pro), I've never actually made serious use of them. On the other hand, I do use Microsoft's Office Lens program which provides much of the same capabilities - but provided under the Office 365 bundling
