US Intelligence Seeks a Universal Translator For Text Search In Any Language (arstechnica.com)
An anonymous reader quotes a report from Ars Technica: The Intelligence Advanced Research Projects Agency (IARPA), the U.S. Intelligence Community's own science and technology research arm, has announced it is seeking contenders for a program to develop what amounts to the ultimate Google Translator. IARPA's Machine Translation for English Retrieval of Information in Any Language (MATERIAL) program intends to provide researchers and analysts with a tool to search for documents in their field of concern in any of the more than 7,000 languages spoken worldwide. The specific goal, according to IARPA's announcement, is an "'English-in, English-out' information retrieval system that, given a domain-sensitive English query, will retrieve relevant data from a large multilingual repository and display the retrieved information in English as query-biased summaries." Users would be able to search vast numbers of documents with a two-part query: the first giving the "domain" of the search in terms of what sort of information they are seeking (for example, "Government," "Science," or "Health") and the second an English word or phrase describing the information sought (the examples given in the announcement were "zika virus" and "Asperger's syndrome"). The system would be used in situations like natural disasters or military interventions in remote locations where the military has little or no local language expertise. Those taking on the MATERIAL program will be given access to a limited set of machine translation and automatic speech recognition training data from multiple languages "to enable performers to learn how to quickly adapt their methods to a wide variety of materials in various genres and domains," the announcement explained. "As the program progresses, performers will apply and adapt these methods in increasingly shortened time frames to new languages... 
Since language-independent approaches with quick ramp up time are sought, foreign language expertise in the languages of the program is not expected." The good news for the broader linguistics and technology world is that IARPA expects the teams competing on MATERIAL to publicly publish their research. If successful, this moonshot for translation could radically change how accessible materials in many languages are to the rest of the world.
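To make the two-part query concrete, here is a minimal sketch in Python. It is not IARPA's design: the `Query` and `Document` types, the `search` function, the sample languages, and the sample sentences are all hypothetical, and the `english_gloss` field simply stands in for whatever machine translation a real system would produce. It only shows the shape of the idea: filter by domain, match the English phrase, and return a short snippet centered on the match as a crude "query-biased summary."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    domain: str  # first part of the query, e.g. "Health"
    phrase: str  # second part, e.g. "zika virus"


@dataclass(frozen=True)
class Document:
    language: str       # source language of the document (illustrative)
    domain: str         # assumed to be labeled in advance
    english_gloss: str  # stand-in for machine-translated text


def search(query, corpus, window=30):
    """Return (language, snippet) pairs for in-domain documents
    whose English gloss contains the query phrase."""
    phrase = query.phrase.lower()
    hits = []
    for doc in corpus:
        if doc.domain != query.domain:
            continue  # wrong domain, skip
        text = doc.english_gloss
        pos = text.lower().find(phrase)
        if pos == -1:
            continue  # phrase not mentioned
        # Crude query-biased summary: a window of text around the match.
        start = max(0, pos - window)
        end = min(len(text), pos + len(phrase) + window)
        hits.append((doc.language, text[start:end].strip()))
    return hits


# Made-up corpus for illustration only.
CORPUS = [
    Document("Swahili", "Health", "Clinics report new zika virus cases near the coast."),
    Document("Tagalog", "Government", "The ministry discussed zika virus funding."),
    Document("Somali", "Health", "Vaccination schedule for measles was updated."),
]

results = search(Query("Health", "zika virus"), CORPUS)
for language, snippet in results:
    print(f"[{language}] {snippet}")
```

The hard parts the program is actually asking for (translation, speech recognition, and domain labeling for low-resource languages) are exactly what this sketch assumes away.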
Oxymoron (Score:2, Funny)
Re: (Score:2)
Easy to defeat... (Score:2)
Here's how:
How about writing, "The sheep are coming..."
...And this to mean something entirely different in the bad guys' minds?
Easy and effective. Isn't it?
Re: (Score:1)
Re: (Score:2)
In a Desmond Bagley spy thriller the automatic translator turned "hydraulic ram" into "water sheep".
the how and the why are unrelated (Score:2)
They expect people to publish research into how to take some English search terms and then search a pile of assorted documents in different languages. The public can see (some of) HOW one can search text. So we get to see some ideas about searching general text.
Which text they later search, for what reasons, is a completely separate issue. If they can get a system like this developed, they would be foolish if they didn't use it in their national security mission. In fact, most intelligence is from open sour
Re: (Score:2)
Will it be able to give meaning to poorly-translated newsfeeds like the ones this slashdot contributor's history [slashdot.org]?
Sample:
"Various framerates have been a warm theme before few years?"
It gets worse from there.
Different framerates have been a hot topic in recent years.
That'll be .001 BTC.
Re: (Score:2)
Different framerates have been a hot topic in recent years.
On Slashdot, different flamerates have been a hot topic in recent years.
Don't think that'll work (Score:2)
Re: (Score:3)
I don't speak but a handful of words in a very short list of languages, I'm certainly no expert in language, but aren't there some languages that are so nuanced that a slight change in inflection, or tone, or emphasis, or maybe even cadence changes the entire meaning of what's being said? Wouldn't that be rather difficult to code for?
Two things...
1. I think they are talking about a computer database of text, not an audio database.
2. Nobody really "codes" this stuff by hand anymore; a deep learning network is designed and configured in a framework and then trained with petabytes of data.
yo, Dawg (Score:1)
Isn't that...Google? (Score:4, Funny)
FTFY (Score:4, Insightful)
If successful, this moonshot for translation could radically change how accessible materials in many languages are to the rest of the English-speaking world.
This will end in tears (Score:1)
We need only to look at the BIBLE to see what happened the LAST TIME someone tried to create a Tower of Babel to see what will happen THIS time.
Re: (Score:2)
Yeah, but I'm pretty sure the machines are going to be rackmounts and/or blade servers. No one uses towers anymore.
Re:Babel (Score:1)
Re: (Score:2)
I would like to apply for the job (Score:4, Insightful)
Dear Sir,
My name is Mahindresh Jalabahamatra* from India. I would like to apply for the Universal Translator job that you are offering. I am very skilled in Universal Translation and have many years of experience. I have done Universal Translation for many clients in the past, and I consider your offered job as Universal Translator to fit my skills perfectly.
Hoping to hear from you soon.
Re: (Score:2)
Re: (Score:1)
> Dear Sir,
> My name is Mahindresh Jalabahamatra* from India. I would like to apply for the Universal Translator job that you are offering. I am very skilled in Universal Translation and have many years of experience. I have done Universal Translation for many clients in the past, and I consider your offered job as Universal Translator to fit my skills perfectly.
> Hoping to hear from you soon.
I'm Indian and I find this absolutely hilarious.. Should I be offended? Naa.. I've seen too many applicants use similar language... I will however laugh uncontrollably for the next 5 minutes at that name... Mahindresh Jalabahamatra
Re: (Score:2)
I propose ... (Score:1)
... Sheldon Cooper.
Bound to failure in natural context (Score:1)
Re:Much better actually (Score:1)
Most difficult part done (Score:2)
Text search has limitations WITHIN a language. (Score:2)
As you've no doubt experienced when you've done a Google Search on a word which has multiple meanings. For example, suppose you google "How do I get rid of a mole?" Are you worried about a skin condition or a small burrowing mammal? It so happens that Google tries to give you a mix of both answers, which I suspect may reflect the result of some ad hoc result tweaking.
So you do sometimes have to know how to rephrase a query, e.g. "pictures of a flying crane" to "pictures of an aerial crane".
The problem i
Lol. no, Google does it better than you, a human (Score:2)
> For example, suppose you google "How do I get rid of a mole?" Are you worried about a skin condition or a small burrowing mammal?
No, Google would already know that neither of those interpretations is correct. Google tracks your search history, it knows who is asking. So when the CIA asks how to get rid of a mole, Google knows they are talking about a https://en.m.wikipedia.org/wik... [wikipedia.org] mole.
Re: (Score:3)
"Out of sight, out of mind."
Translation:
"Invisible idiot."
Mmmm..k? Code: (Score:1)
Re: Mmmm..k? Code: (Score:1)
Cool (Score:2)
Now they'll drone-murder us based on what an algorithm mistranslated.
If at least a bilingual murderer had to listen to the Xbox recording of us joking in the living room, our chances would be better.