There's No Tiananmen Square In the New Chinese Image-Making AI (technologyreview.com)
An anonymous reader quotes a report from MIT Technology Review: There's a new text-to-image AI in town. With ERNIE-ViLG, a new AI developed by the Chinese tech company Baidu, you can generate images that capture the cultural specificity of China. It also makes better anime art than DALL-E 2 or other Western image-making AIs. But there are many things -- like Tiananmen Square, the country's second-largest city square and a symbolic political center -- that the AI refuses to show you. When a demo of the software was released in late August, users quickly found that certain words -- both explicit mentions of political leaders' names and words that are potentially controversial only in political contexts -- were labeled as "sensitive" and blocked from generating any result. China's sophisticated system of online censorship, it seems, has extended to the latest trend in AI. It's not rare for similar AIs to limit users from generating certain types of content. DALL-E 2 prohibits sexual content, faces of public figures, or medical treatment images. But the case of ERNIE-ViLG underlines the question of where exactly the line between moderation and political censorship lies.
The ERNIE-ViLG model is part of Wenxin, a large-scale project in natural-language processing from China's leading AI company, Baidu. It was trained on a data set of 145 million image-text pairs and contains 10 billion parameters -- the values that a neural network adjusts as it learns, which the AI uses to discern the subtle differences between concepts and art styles. That means ERNIE-ViLG has a smaller training data set than DALL-E 2 (650 million pairs) and Stable Diffusion (2.3 billion pairs) but more parameters than either one (DALL-E 2 has 3.5 billion parameters and Stable Diffusion has 890 million). Baidu released a demo version on its own platform in late August and then later on Hugging Face, the popular international AI community. The main difference between ERNIE-ViLG and Western models is that the Baidu-developed one understands prompts written in Chinese and is less likely to make mistakes when it comes to culturally specific words.
But ERNIE-ViLG will be defined, as the other models are, by what it allows. Unlike DALL-E 2 or Stable Diffusion, ERNIE-ViLG does not have a published explanation of its content moderation policy, and Baidu declined to comment for this story. When the ERNIE-ViLG demo was first released on Hugging Face, users inputting certain words would receive the message "Sensitive words found. Please enter again (...)," which was a surprisingly honest admission about the filtering mechanism. However, since at least September 12, the message has read "The content entered doesn't meet relevant rules. Please try again after adjusting it. (...)" In a test of the demo by MIT Technology Review, a number of Chinese words were blocked: names of high-profile Chinese political leaders like Xi Jinping and Mao Zedong; terms that can be considered politically sensitive, like "revolution" and "climb walls" (a metaphor for using a VPN service in China); and the name of Baidu's founder and CEO, Yanhong (Robin) Li. While words like "democracy" and "government" themselves are allowed, prompts that combine them with other words, like "democracy Middle East" or "British government," are blocked. Tiananmen Square in Beijing also can't be found in ERNIE-ViLG, likely because of its association with the Tiananmen Massacre, references to which are heavily censored in China. Giada Pistilli, a principal ethicist at Hugging Face, says it could be helpful for the developer of ERNIE-ViLG to release a document explaining the moderation decisions. "Is it censored because it's the law that's telling them to do so? Are they doing that because they believe it's wrong? It always helps to explain our arguments, our choices," says Pistilli.
"Despite the built-in censorship, ERNIE-ViLG will still be an important player in the development of large-scale text-to-image AIs," concludes the report. "The emergence of AI models trained on specific language data sets makes up for some of the limitations of English-based mainstream models. It will particularly help users who need an AI that understands the Chinese language and can generate accurate images accordingly."
"Just as Chinese social media platforms have thrived in spite of rigorous censorship, ERNIE-ViLG and other Chinese AI models may eventually experience the same: they're too useful to give up."
Sensitive words found (Score:5, Funny)
But there are many things -- like Tiananmen Square, the country's second-largest city square and a symbolic political center -- that the AI refuses to show you.
I guess the Hundred Acre Wood is out too then.
This will get slashdot blocked (Score:2)
Tank Man, Tank Man, Tank Man!
Re: (Score:2)
Two half gallon jugs of homo erectus.
Re: This will get slashdot blocked (Score:2)
Sounds a lot like Taiwan
Re: (Score:2)
We believe you, you're not Chinese.
Re: (Score:2)
Does it show an actual passenger airliner crashing into the pentagon instead of some drone-like much smaller "something" crashing at ridiculously low altitude yet dead on target?
You know a cruise missile can fly through a window right? Why would you think that it's difficult for "something" to approach at low altitude and then hit its target? Bought your cruise missile from aliexpress?
Re: (Score:3)
Bought your cruise missile from aliexpress?
Sure, it takes ages to ship and the quality is a gamble, but at those prices it's worth the risk.
Just don't order an ICBM. The way they keep their shipping costs down ...
I wonder what happens (Score:2)
Re: (Score:3)
google returns 90% of the internet
Re: I wonder what happens (Score:2)
Re: (Score:2)
I don't think it's all THAT strange. My parents live 2 towns and ~20 min away, and while they have all the same stores and services we do, they live right on the edge of a small metropolitan city that has 1000x more variation than us.
So if I search for shitbag I get Tractor Supply or Lowe's; if I search shitbag from their house I get accident and workers' comp lawyers.
Re: I wonder what happens (Score:2)
Re: (Score:2)
I tried it in DreamStudio.
The results were... less than stellar.
https://beta.dreamstudio.ai/dr... [dreamstudio.ai]
Re: (Score:2)
Unfortunately, that link doesn't work, but thanks for the recommendation of one of these AI image engines I had not previously heard of.
Re: (Score:2)
Weird, I have been using it for quite some time, as a matter of fact I had some fun with it minutes ago.
(I should have mentioned it indeed requires logging in, which is expected to prevent abuse)
Re: (Score:2)
A captcha would work just as well, but then they couldn't harvest your email address...
Pooh (Score:2)
"The pooh bear costs $0.50."
I wonder what kind of image that will turn up.
Instead ask for a (Score:1)
Tiananmen heptagon.
CCP ain't too bright about certain things. (Score:5, Interesting)
I'm a little puzzled how foreign the notion of co-option is to such soulless cynics. I would strongly bet the TS massacre being an imposed taboo has led to its perpetuation as a subject far more than would be the case if they just built a monument to Tank Man and bored school children with lectures about it once a year.
Re: (Score:2)
Pretty much this. Dump every kind of celebration and then some on that square and make sure that any search results for it come up with your propaganda pieces first so someone who really wants to see anything about the "problematic" stuff would first have to dig through mountains of your propaganda trash.
What? Did it work for the Christians when they redressed old faith customs or didn't it?
Re: (Score:2)
If it was me trying to block access to information on this incident, I would flood the search results for Tiananmen Square Massacre with information about a European Football match between the Tiananmen Square Titans and the Capitalist Pigs where the TST won by 10 to 0, it was a total massacre!
Disinformation would work to mask the truth pretty well too, then when people search for information when they hear the term, they find that it is something that isn't worth worrying too much about.
Also, blocking thin
Re: (Score:3)
Just do a shitload of stuff on the square. Make it the new entertainment district and have people do all sorts of celebrations there so the internet is flooded with various things that you WANT people to see about that square. It's way easier to squelch information you don't want to see if there's a metric ton of dung you can pile on top of it.
Re: (Score:3)
If you actually go to China and talk to Chinese people, you will find that most of them haven't heard of the Tiananmen Square protests/massacre, or Tank Man. There is no external interpretation, it's blocked by the Great Firewall. When you bypass it and show them the Chinese Wikipedia article on that event, they think you are one of those foreign bad guys that the government warned them about, trying to stir up trouble.
The CCP has it locked down hard.
Re: (Score:2)
Re: (Score:3)
I saw that too. That reluctance is because the government has warned them that foreign journalists like to spread lies. The moment you say "Tiananmen Square massacre" they realize you are trying to tell them something bad about their government and don't want to talk to you anymore.
Same with terms like "forced labour" and "re-education camps". The recognition is the attempt to "smear" the government, not the event.
Re: (Score:2)
Also, being forgotten doesn't require staying forgotten. The imposed ignorance of one generation is no barrier to educating the next.
Re: (Score:2)
Outside of Beijing, this is true. Inside of Beijing, it is known and discussed. Too many people actually saw what happened from their windows overlooking the square to completely "make it go away".
Best,
Re: (Score:2)
It's pointless to lie when we both know you're lying.
Does this link work where you are? [history.com]
Reporters and Western diplomats on the scene estimated that at least 300, and perhaps thousands, of the protesters had been killed and as many as 10,000 were arrested.
But nobody saw anything. Sure.
Re: (Score:2)
I'm pretty sure a lot of that is simply due to not wanting to talk to some stranger who you don't know about something that may land you in hot water if the wrong ears hear that you talk about it.
Re: (Score:2)
In which case, why prevent people making pictures of it? Surely they wouldn't know to superimpose some tanks and students on a picture of the square? Even if one free-thinking individual did it, surely "the masses" would call it out as "fake news".
They're blocking it because their people most definitely do know about it. They're also banking that the vast majority of those people will never actually visit the place themselves and take a few snaps of it.
As a foreigner there, talking about the event is indeed
Re: (Score:2)
It may well just be that the government has a standard list of things that must be blocked. "Whatever your service is, if it takes user input you must block these things."
Might not even be a government list, maybe just a popular list of bad things. Western developers use them all the time, they just google "offensive word list" and use whatever comes up.
Agreed that the Lockdown Mostly Worked (Score:2)
The CCP has it locked down hard.
This aired over 16 years ago, so I assume things have gotten worse since:
https://www.youtube.com/watch?... [youtube.com]
Summary: A reporter shows four middle (upper?) class Chinese students the tankman photo, but only one has the vaguest idea what it might be.
Sadly, these are the Chinese youth with the most access to travel and (likely) information, too. You could argue they're not naive and they understand they need to play dumb, but the reporter doesn't think that's what's happening.
I haven't seen the BB
Re: (Score:2)
If you actually go to China and talk to Chinese people, you will find that most of them haven't heard of the Tiananmen Square protests/massacre, or Tank Man
It depends on the age group. People who were college age at the time remember, and there were protests all across China. People in younger age groups grew up watching state TV. People in the older age group wish Mao were still alive to make things better.
A lot of people who supported the protests in those days have changed their mind. Over the last decade or so, they have become convinced that authoritarianism is more efficient, sometimes encouraged by western journalists.
Filtering (Score:3)
Some years ago, I built a chat system for a major Australian bank (which bank? Don't ask). Anyway, they sent an official list of words they deemed unacceptable. This was an actual piece of paper (it was a while ago) with a list of rude words on bank headed paper. Hilarious.
The filtering wasn't so hard, except that oddly "Dick Smith" - an electronics company, was acceptable, but "Dick" was not. Made for slightly interesting programming. I think I just **** the offending word.
Sadly, I no longer have this piece of paper. I should have framed it.
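For anyone curious how the "Dick Smith is fine, Dick is not" case can be handled, here is a minimal sketch in Python. It is not the code from the bank project, which isn't shown here; the names BLOCKED_WORDS, ALLOWED_PHRASES and mask_blocked are made up for illustration, and the one-word blocklist is just the example from the anecdote. The idea is to first mark the character ranges covered by allowlisted phrases, then star out any blocked word that falls outside those ranges.

```python
import re

# Hypothetical lists for illustration; a real deployment would load its own.
BLOCKED_WORDS = {"dick"}
ALLOWED_PHRASES = ["Dick Smith"]


def mask_blocked(text, blocked=BLOCKED_WORDS, allowed=ALLOWED_PHRASES):
    """Replace blocked words with asterisks unless they sit inside an allowed phrase."""
    # Collect the (start, end) spans of every allowlisted phrase in the text.
    protected = []
    for phrase in allowed:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            protected.append((match.start(), match.end()))

    def replace(match):
        word = match.group(0)
        # A word is protected if it lies entirely within an allowlisted span.
        inside_allowed = any(
            start <= match.start() and match.end() <= end
            for start, end in protected
        )
        if inside_allowed or word.lower() not in blocked:
            return word
        return "*" * len(word)

    # Check each whole word, so "Dickens" is not treated as a blocked word.
    return re.sub(r"\w+", replace, text)


print(mask_blocked("Dick Smith sells radios, Dick."))
```

Matching whole words avoids flagging longer words that merely contain a blocked string, at the cost of missing deliberate misspellings; a simple list-based filter like this can't do much about those.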
Rewriting history (Score:2)
This is just the start of it. Give it a generation or two and nothing ever happened at Tiananmen Square which is just the name of a sleazy strip mall in China.
Re: (Score:1)
Denial doesn't make reality go away.
Re: Rewriting history (Score:2)
Sir, please follow the orange line to room 120 for your prescribed reeducation as directed by the State.
And this is news? (Score:2)
I bet there is no Taiwan (as R.o.C.) too.
Re: (Score:2)
... or dancing Mao [artnet.com]
Ok so they filter "Xi Jinping" from the input (Score:1)
Re: (Score:2)
Yeah it's just a matter of phrasing. Try "students being run over by a tank on a large square"...
Unstable diffusion (Score:1)
Censorship is bad, but if I had to choose between an AI maimed into not showing a handful of political topics from the other side of the world, vs an AI maimed into not showing anything unsuitable for children, I'm going with the former any time.
https://i.imgur.com/hZ1r6rv.pn... [imgur.com]
On this, China is freer than the west. Fortunately unstable diffusion exists.
Re: (Score:2)
Censorship is bad, but if I had to choose between an AI maimed into not showing a handful of political topics from the other side of the world, vs an AI maimed into not showing anything unsuitable for children, I'm going with the former any time.
And then you post the image to a fake-ass website that doesn't function without javascript so it can spy on people. I'm going with the "not allowing scripts from websites that could function just fine without them" personally
Re: (Score:1)
Here's a better link: https://i.imgur.com/hZ1r6rv.pn... [imgur.com]
Re: (Score:1)
Wait a second, what are you even on about? At first I thought I messed up and linked to the toplevel imgur page that contains scripting, but nope, I am directly linking the .png.
Re: (Score:2)
When I click it, I just get redirected to https://imgur.com/hZ1r6rv [imgur.com] because imgur doesn't allow people who block scripts to view images. It's a very poor match for Slashdot.
Re: (Score:1)
Live and learn.
Here's a different link: https://a.uguu.se/DNsrbLPQ.png [a.uguu.se]
Re: (Score:2)
heh heh, Firefox rejected that one "Error code: MOZILLA_PKIX_ERROR_REQUIRED_TLS_FEATURE_MISSING"
I wonder if this behavior makes sense or not. looks like it defaults on in most browsers though [digicert.com].
I have bad news for China (Score:3)
Re: (Score:2)
They don't care about that, that's what the great firewall is for. As long as they can keep their own people from hearing about it, and they largely have succeeded at doing that, then they don't have to worry about anyone else becoming emboldened and opposing them in the same fashion.
Re: (Score:2)
I wouldn't be so sure. They will market this AI as being superior because the customer can configure it to block things they don't like.
Right now in the UK the police are harassing and arresting people who protest the monarchy and members of the royal family. These are people calling for full democracy and removal of the monarch as head of state. I'm sure the government would love to have some AI that replaced the messages they are displaying with support for the new king, when the queen's funeral takes pla
"Tiananmen Square 1989" is a magical incantation (Score:2)
"Tiananmen Square 1989" is a magical incantation. When you say it to Chinese spammers they mysteriously disappear. Thanks, Great Firewall of China!
Re: (Score:2)
Are you truly so stupid as to think Chinese spammers can't use a VPN?
Why would they do that? They're not doing anything illegal in China by spamming us everywhere. Shit, there are fucking Chinese spammers in GTAV. Who the fuck is buying shit advertised in GTAV spam?
I've heard this song before (Score:2)
"The purpose of Newspeak was not only to provide a medium of expression for the world-view and mental habits proper to the devotees of Ingsoc, but to make all other modes of thought impossible." - George Orwell, The Principles of Newspeak
Why'd they give it a Western name? (Score:2)
what we need is an AI ... (Score:2)
... that can make actual summaries
Anagrams don't lie! (Score:2)
ERNIE-ViLG --> e-EvilGrin
Re: (Score:2)
REIGN-ViLE /s
Mmm, censorware. (Score:2)
Western AIs are censored as well (Score:1)
This makes it seem as though this is unusual. Western AIs are censored as well, for names, body parts, and certain themes. Whether we want to think of these themes as political or not, they certainly represent a very specific brand of morality of the United States. In that sense, how is what China is doing any different, except that (for good reason) we don't agree with the actions of the Chinese regime on certain issues? The fact is there is very rigorous censorship at work in Dall-E, Stable Diffusion and
Say! (Score:2)