The Internet Censorship

Russia Weighs Going Cyrillic For DNS 223

Posted by kdawson
from the ru-serious dept.
An anonymous reader writes "The Guardian reports that the Kremlin may start an alternate top-level domain, .rf. According to the story, .ru in Cyrillic translates to .py, the top-level domain for Paraguay, which the Russian government claims leads to confusion. This is similar to a move by China, which has their own .net and .com top-level domains in their native character set along with .cn, .com, and .net in ASCII." Hindering Paraguayan hackers may matter less to the Russian government than establishing greater control over a walled-off Internet.
This discussion has been archived. No new comments can be posted.

  • Great!!! (Score:2, Interesting)

    by Anonymous Coward
    It's great that nations can use their own languages instead of being forced to use alien Latin-English characters.
    • Re: (Score:2, Informative)

      by AKAImBatman (238306)

      It's great that nations can use their own languages instead of being forced to use alien Latin-English characters.

      In this case, the characters are exactly the same. It's just that 'p' (pronounced 'pee' in English) is the letter 'er' in Russian, and 'y' (pronounced 'why' in English) is the letter 'oo' in Russian. So .ru to us is literally .py to them.

      Cyrillics has a number of Greek letters sprinkled in, but in this instance it is of no help.

      • Re: (Score:3, Interesting)

        by Arthur B. (806360)
        they are not the same, they just look very similar
          != py
        • I don't think you understand what I mean. I'm not saying that they are the same computer code. I'm saying that they are literally the same characters. Just used differently between Cyrillics and English. The fact that computers have different character codes for the languages is beside the point. In an international environment py is going to equal py. Which can create a bit of a problem. Did I just receive a legitimate email from bankofrussia.py [google.com] or a phishing attempt from bankofrussia.py [google.com]?

          Can you tell the d
          • Re: (Score:2, Informative)

            by Arthur B. (806360)
            The characters are not displayed in the same way, I cannot paste cyrillic in slashdot, but the difference between the y and the russian u is visible, the tail of the y is rounder.

            Of course, it leaves room for phishing attacks, but they are not the same character. Not historically, not linguistically, not in encoding, not in display.
            • Re: (Score:3, Informative)

              by AKAImBatman (238306)

              The characters are not displayed in the same way

              As I said, it depends on your font. In Arial, they are pixel for pixel. In Courier, they have slightly different shapes. Either way, it doesn't really matter. Very few people will notice the font differences. Why? Because they are the same characters. The fact that a computer provides two copies of the same character actually causes as many problems as it solves.
              • Re: (Score:2, Insightful)

                by oldhack (1037484)
                You guys are failing to communicate because you have different premises. Batman defines character by the appearance, Arthur by its semantic (as does Unicode). Semantic definition is clearer than the visual one, especially since the appearance of the same character varies depending on the font used. The possible problem due to similar appearances remains, although I don't know how big of a problem it is/will become.
            • Re: (Score:3, Interesting)

              by AKAImBatman (238306)

              they are not the same character. Not historically

              And yes, they are the same character, historically speaking. Both characters were borrowed from a common Greek/Semitic ancestry. Cross-pollination of Latin and Cyrillic languages has led to Cyrillic renderings of the letter that are more or less the same as the Latin rendering.

              http://en.wikipedia.org/wiki/Y [wikipedia.org]
              http://en.wikipedia.org/wiki/%D0%A3 [wikipedia.org]

              http://en.wikipedia.org/wiki/P [wikipedia.org]
              http://en.wikipedia.org/wiki/%D0%A0 [wikipedia.org]

              • by Arthur B. (806360)
                The fact that they share ancestry does not mean they are historically the same character. Historically French and Spanish have always been distinct, although they both come from latin.
          • by Cyberax (705495)
            BTW, it's amusing that you translated the word 'bank' as 'kren'. It means 'list, careen, roll', and not 'place where you put money' :)
            • I was actually wondering about that. I'm well aware that "Bank" is "baHK" (bahnk) in Russian, but Babelfish insisted that Kren Russee was a proper translation for "the bank of Russia". I wonder which "bank" it was thinking of. Stupid Babelfish. :-/
          • by alexhs (877055)
            Can we all agree that they have similar glyphs [wikipedia.org] but are different characters / graphemes [wikipedia.org] ?
            • Y and Y are fully equal in the historical sense. They derive from the same letter, and even convey the same sound in their respective languages. English overloads Y with a few different allographs, but pronouncing Y as "oo" is a common form.

              P and P are different graphemes due to the evolution of the modern glyph by way of the Latin language. The Romans had already evolved a P symbol from Greek/Semitics, so when faced with P "Er" they chose to add a line to differentiate it; thus forming the modern letter R.
              • Actually, to correct myself, I'm confusing Y with U. (I don't know what the heck I'm smoking on that one.) U gets overloaded with an "oo" sound. Y is overloaded with an "ee" sound. Both U and Y have a common ancestry, having evolved from the same character (Upsilon). So Y and P sort of have the same compatibility status.

                Sorry about the brainfart.
        • Re:Great!!! (Score:4, Informative)

          by Maimun (631984) on Thursday January 03, 2008 @03:27PM (#21899060)
          They ARE the same. Trust me, I am Bulgarian and we also use the Cyrillic alphabet. The Cyrillic alphabet was created in the 9th century by Constantine, a Byzantine friar (I dunno if this is the correct term) serving the emperor in Constantinople. The church name of Constantine was Cyrill, that is where the name of the alphabet came from. At that time, both Rome and Constantinople were trying to convert the Slavic states to Christianity. The Eastern Roman Empire, a.k.a. Byzantium, was more flexible than the Catholics: she offered Christianity in the native Slavic languages, while the Catholics insisted on using Latin. The Cyrillic alphabet was introduced precisely for that purpose. It was a modified Greek alphabet (Greek was, of course, the language of the Eastern Roman Empire) with symbols added for those Slavic sounds that had no Greek equivalent. Initially it was adopted in Bulgaria and after about a century or two it was adopted by the Russian proto-state -- in contrast to the Russian myths that the Cyrillic alphabet was first introduced in Russia and even invented in Russia.

          The initial Cyrillic alphabet looked quite different from what is used today in Russia and Bulgaria; the appearance of the modern Cyrillic alphabet is due to a reform by Tzar Peter I of Russia. Peter I imposed visual style similar to the one of the Roman font.

          BTW, the Cyrillic alphabet was not the only creation of Constantine-Cyrill. He had invented another alphabet to be used by the Slavs which was called "glagolitsa" and visually was totally different from the Cyrillic one. This radical design was not very successful, although I've heard it had been used in Croatia until 2-3 centuries ago.

          Here is a four-column table of the original Cyrillic alphabet [wikimedia.org] and the Glagolic one ("glagolitsa"). The first column is the name of each letter (yes, each one had a name; if the names are read sequentially they form a saying, quite deep and meaningful at that), the second is the cyrillic glyph, the third is the glagolic glyph, the fourth is the numeric value.

          • by Arthur B. (806360)
            The initial Cyrillic alphabet looked quite different from what is used today in Russia and Bulgaria; the appearance of the modern Cyrillic alphabet is due to a reform by Tzar Peter I of Russia. Peter I imposed visual style similar to the one of the Roman font.

            You just proved my point. If he needed to impose a visual style *similar* to the one of the Roman font, it means precisely that the characters are different. A character is more than its different representation with different fonts, a character is
            • by Maimun (631984)
              I hardly proved your point. You said:

              The characters are not displayed in the same way, I cannot paste cyrillic in slashdot, but the difference between the y and the russian u is visible, the tail of the y is rounder.

              As someone said, that depends on the font you use. In the fonts I use, most of the glyphs are the same pixel for pixel as their visual counterparts, though they may be treated differently by the displaying system. I mean the spaces between them may be (very slightly) different.

          • Re: (Score:3, Interesting)

            I live in Poland, more specifically in Przemysl on the Ukrainian border so I'm exposed to both alphabets more or less daily. I must confess, I envy the Easterners! The Latin alphabet is really not suited to Slavic tongues and I think the Cyrillic one is a far superior way to render them. For example, in Cyrillic you get one nice little letter looking like w with a tail, whereas we get szcz... if you're an English speaker, it'd be something like the sh ch between freSH CHeese. Anyway, the inadequacies of the
          • Re: (Score:3, Interesting)

            by CRCulver (715279)
            Sts Cyril and Methodius did not invent the Cyrillic alphabet. They invented only the Glagolitic alphabet. The Cyrillic alphabet was invented in the Kingdom of Bulgaria nearly a century later.
      • Re: (Score:2, Informative)

        by Anonymous Coward
        No, the characters only look the same to a human eye. To a computer they would look quite different:

        English "py" is code points U+0070, U+0079
        Russian "py" is code points U+0440, U+0443

        Of course, the whole internationalization issue wouldn't be an issue if ICANN didn't have their head up their collective ass.
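
The code points quoted in the comment above can be checked with Python's standard `unicodedata` module. This is only a minimal sketch; the helper name `describe` is made up for illustration and is not something from the thread.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

latin = "py"                # Latin p, y
cyrillic = "\u0440\u0443"   # Cyrillic er, u (written as escapes to avoid ambiguity)

for label, text in (("Latin", latin), ("Cyrillic", cyrillic)):
    for code_point, name in describe(text):
        print(label, code_point, name)

# The two strings may render alike, but they never compare equal.
print(latin == cyrillic)
```

Whatever the font does with the glyphs, string comparison works on the code points, which is the sense in which the two are "different to a computer".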
        • Re:Great!!! (Score:5, Interesting)

          by Maimun (631984) on Thursday January 03, 2008 @03:38PM (#21899248)

          No, the characters only look the same to a human eye. To a computer they would look quite different:
          This is precisely why Cyrillic symbols are not used in DNS. It is possible to have two URLs, one having latin letters only, the other one latin and cyrillic, that look exactly the same in most fonts but are completely different as strings, so if they are resolved by DNS they'd resolve to distinct IP addresses. This is just perfect for phishing attacks: you can't tell whether www.mybank.com is the URL of your bank "MyBank", or it has a Cyrillic "a" and is registered by the attacker, by simply looking at it. To tell if the URL is genuine one must examine it with a hex editor or something...
        • by zsau (266209)
          Have you been to countries that use Cyrillic alphabets? I have. I frequently saw URLs written in Cyrillic, even though they were meant to be transliterated into Latin. So someone advertising "bank.kg" (I was in Kyrgyzstan) would write (fancy looking b)a(small capital h)(small capital k).(small capital k)(small capital gamma). They might have written the URL on the computer, but when it's a poster the character codes aren't there any more, it's just ink. I can understand the confusion that would come in Russ
    • Re:Great!!! (Score:5, Interesting)

      by Sigismundo (192183) on Thursday January 03, 2008 @02:27PM (#21898042)

      Not sure why the parent has been modded flamebait. It's probably the phrase "alien Latin-English characters", but it's actually an accurate description of how a domain name might appear to speakers of non-European languages.

      I wasn't aware that China had already begun experimenting with Chinese characters in domain names, so I did some Googling. Here is a link [cnnic.cn] (in English) that describes how to register a Chinese Domain Name (CDN). It makes for a pretty interesting read. It includes the predictable clause that you can't register CDNs that "harm the glory of the state." Users of CDNs are encouraged to use "Official Client-end CDN Software" to make access more convenient. I wonder exactly what this does.

      In general I think it's pretty cool to be able to have non-ASCII characters in domain names, but it seems to introduce a lot of extra complexity into DNS. Also, it seems like it could open the door for more governmental control of the internet, as TFA mentions.

      • I take it there is some good reason why a new but backwards-compatible version of DNS can't be released that uses unicode? Never mind Cyrillic, or Chinese characters, I want a domain name in Tengwar!
        • You mean, something like the '96-proposed IDN <http://en.wikipedia.org/wiki/Internationalized_domain_name [wikipedia.org]>?
        • by saforrest (184929)
          I take it there is some good reason why a new but backwards-compatible version of DNS can't be released that uses unicode? Never mind Cyrillic, or Chinese characters, I want a domain name in Tengwar!

          And you can get it [wikipedia.org] if you can get Tengwar in Unicode, with the exception of the top-level domain. Unicode characters are already supported and used [b], but no top-level domains using non-Latin Unicode characters yet exist. Russia is proposing a new top-level domain.

          Thinking about it, there's no need to reinvent t
          • by saforrest (184929)
            Unicode characters are already supported and used, but no top-level domains using non-Latin Unicode characters yet exist

            Apparently top-level domains support Unicode characters in URLs, but Slashdot chokes on them! (In links, anyway). Here are some attempts, all failing:

            bücher.de [b] (UTF-8 or ISO 8859-1)

            bücher.de [buumlcher.de] (HTML entity u-uml)

            bücher.de [b] (Unicode character 00FC as entity)
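
For readers wondering how such a name reaches DNS at all: the IDN scheme mentioned earlier in this subthread leaves DNS itself ASCII-only and converts each non-ASCII label to an `xn--` form before lookup. A small sketch using Python's built-in `idna` codec (which, as far as I know, follows the older IDNA 2003 rules), applied to the bücher.de example from the comment above:

```python
# Convert an internationalized name to the ASCII form that is actually sent to DNS.
unicode_name = "b\u00fccher.de"            # "bücher.de", with ü as U+00FC

ascii_name = unicode_name.encode("idna")   # each dot-separated label is converted separately
print(ascii_name)                          # b'xn--bcher-kva.de'

# The conversion is reversible, so software can show the Unicode form again.
print(ascii_name.decode("idna") == unicode_name)
```

The ASCII form is what registries and resolvers see; whether a browser shows the Unicode form or the `xn--` form is a display decision made on the client.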

  • by mr_mischief (456295) on Thursday January 03, 2008 @02:04PM (#21897528) Journal
    You can't really translate between 'r' and rho. It's a character set issue. It's a straight equivalency of sounds. Cyrillic is based on the Greek alphabet and the English alphabet is based on the Latin alphabet. It could be confused with Paraguay because of the character encoding, but it's not really the same letters.
    • by _|()|\| (159991)
      From the user standpoint, it's a distinction without a difference. In most fonts, Latin "py" is not readily distinguishable from Cyrillic "ru." However, I would argue that confusion is more likely with the proposed Cyrillic domain names than with the current all-ASCII system. I am sympathetic to the desire for more localization, but the ramifications of a change like this should be considered very carefully.
      • If there's no difference in the words we use, then we should stick to just one word. I propose "Oof". ;-)

        Seriously, though, I think you've struck on the right issue here. It's not a problem caused by the current system. It's a problem encountered when expanding the current system to include other character sets. Had the people designing the DNS considered this change way back when they designed the initial system and assigned the ccTLDs, it would have been nice, but it would've required an extreme case of f
    • by forkazoo (138186)

      You can't really translate between 'r' and rho. It's a character set issue. It's a straight equivalency of sounds. Cyrillic is based on the Greek alphabet and the English alphabet is based on the Latin alphabet. It could be confused with Paraguay because of the character encoding, but it's not really the same letters.

      Well, it's both wrong, and not wrong. "Translation" is often used in a very broad sense to say things like source code is translated into an executable form by a compiler. Programmers who wor

  • by savuporo (658486) on Thursday January 03, 2008 @02:04PM (#21897532)
    i think this is a specially engineered news post to bring out the lamest "in soviet russia" jokes of slashdot. bring it on!
  • by trolltalk.com (1108067) on Thursday January 03, 2008 @02:08PM (#21897598) Homepage Journal

    In Soviet Russia, DNS blocks YOU.

    ... which is the whole point of "greater control".

  • Well... (Score:2, Funny)

    by gibbdog (551209)
    In Soviet Russia, the domains name you!
  • by edwardpickman (965122) on Thursday January 03, 2008 @02:13PM (#21897692)
    and prevent foreign outsourcing of Russian web site construction they plan to launch a version of HTML in Cyrillic. Soon to be followed by C++ in Cyrillic. Microsoft decided it was a nifty idea so they plan to start a Pig Latin based coding language called "Squeal Like".
    • by Animats (122034)

      HTML in Cyrillic...

      You can write XML with Cyrillic tags. XML with tags in Mandarin Chinese shows up now and then.
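
To illustrate the point about Cyrillic tags: XML element names may use non-Latin letters, and a standard parser accepts them. A minimal sketch with Python's `xml.etree.ElementTree`; the element names and the sample text are invented for this example.

```python
import xml.etree.ElementTree as ET

# A tiny document whose element names are Russian words:
# "книга" (book) and "автор" (author). Both names are made up for this sketch.
document = "<книга><автор>Ершов</автор></книга>"

root = ET.fromstring(document)
print(root.tag)       # name of the root element
print(root[0].tag)    # name of its first child
print(root[0].text)   # text inside the child element
```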

    • Re: (Score:3, Interesting)

      by techpawn (969834)
      This is why we need "common" as a language choice! Go ahead and keep your individual languages (English, French, Goblin) but also have a "Common" language for all people. Like in Firefly everyone spoke a little English and a little Chinese to create a language of the people...

      I fear that it would create more and bloodier Wars than ever before though.
    • by megaditto (982598)
      Funny you should mention outsourcing. Last time I checked, Russia annually issues about 7,000,000 work visas for foreigners.

      Now compare that to all the bitching about 60k H1Bs...
    • Soon to be followed by C++ in Cyrillic.

      When we studied programming in high school, we used a language called "Ershov" (last name of the textbook's author), which was really Pascal translated to Russian.

      I don't think there was an actual compiler, though — nor did we have (enough) computers. Our little code-snippets were checked by the teacher by hand...

      "One laptop per child"? Right...

      In the American college, our professor was quite fond of (then brand new) Java. Among the advantages, he listed

    • by Cyberax (705495)
      Actually, the most-used programming language in Russia is the language of 1C:Enterprise (http://en.wikipedia.org/wiki/1C_Company), it has Russian keywords and system variable names - it's the only sane way, because some terms of Russian accounting do not translate well into English (and transliterated Russian is _ugly_).

      Though, Russian text in computer programs looks very weird.
  • to put a wall up across the internet.
    Also the reason I do not want changes to how the internet 'works'.
    It seems every change someone comes up with is designed to put a wall up someplace.

    • by kwerle (39371)
      How?
      It is just a nameservice. If russia decides they want a top level .manyspecialcharacters, google will buy/register the domain name google.manyspecialcharacters, just like they bought/registered google.ru. Russia will get some money, and everyone is happy - especially Russia. You can still call it google.com, or just 72.14.207.99.
      • by rs79 (71822)
        Here's why it won't work. Note that alternative top level domains have been around for a *cough*decade*cough* while. And were making progress, that is, reasonable people, ISPs (earthlink) and companies (GE, etc...) used them. Not because they were cool or had new names, that was sort of icing, but because they were faster.

        But, this didn't sit well with some folks; here's what they did.

        Enter the "transparent" proxy cache.

        In a true end to end internet you type, say, yahoo.com into the browser address bar, yo
        • by kwerle (39371)
          So it'll work fine for the folks who want it to work most: people in Russia, with Russian ISPs.

          It turns out it will also work for me. I set my DNSServer to be someone who does opennic resolving. I was immediately able to visit http://www.opennic.glue/ [opennic.glue] You're right that I would not be able to send them email without configuring a different smart mailserver.

          But I don't think it is reasonable to say that the whole thing won't work because your ISP sucks and transparent proxies you.
  • How long? (Score:5, Funny)

    by A beautiful mind (821714) on Thursday January 03, 2008 @02:17PM (#21897798)
    How long until someone registers rm.rf ?
    • Re: (Score:3, Funny)

      by sootman (158191)
      My first thought was 'tm.rf'--in Soviet Russia, The Manual, um, Reads... wait...
    • Re: (Score:3, Insightful)

      by megaditto (982598)
      Bah, I can think up some that are way cooler. Let's see here:

      rt.fm
      poop-s.coop (a real TLD by the way)
      pen.is (BIC's homepage in Iceland?)
      vagi.na
      got.root (also real)
      Eat-sh.it
      sniff.co.ck (real TLD)
      Give-a-fu.ck
      por.no
      s.cat
      free.blow.jobs
      felat.io
      sc.um

      goat.se (deserves an honorable mention I guess).
  • Just me (Score:4, Interesting)

    by Rinisari (521266) on Thursday January 03, 2008 @02:18PM (#21897814) Homepage Journal
    Is it just me, or does it seem like the article is really blowing this out of proportion? From my understanding, the Russian government just wants to add a .rf (well, . if I'm remembering Cyrillic correctly). That's it. Users with Cyrillic keyboards will be able to access those sites without a problem, and those of us with non-Cyrillic keyboards will have to either use a character map program or temporarily switch keyboard layouts (as I just did).

    Is that it, or am I missing something?
    • by Rinisari (521266)
      Well, I did have the fancy Cyrillic characters there, but apparently, Slashdot hates UTF-8.
  • by gstoddart (321705) on Thursday January 03, 2008 @02:23PM (#21897940) Homepage
    As it is I see spam which has Chinese characters embedded in what appears to be a google URL, but which I strongly suspect isn't.

    I fear the more we see unicode bytes in URLs the more it will open up people to vulnerabilities as they click on very innocent looking links.

    Hopefully the browsers can keep up with this.

    Cheers
    • by cnettel (836611)
      There have already been some browser fixes, mainly triggering cases where characters from different scripts appear next to each other. That certainly breaks some valid cases as well, but I guess it's bearable. (So you can't just switch a single o in some domain name to a Cyrillic o and get it to show almost indistinguishably... or at least that's the idea.)
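
The mixed-script idea described above can be sketched roughly as follows. This is not how any particular browser implements it; it is an approximation that guesses a character's script from the first word of its Unicode name, and the function names are hypothetical.

```python
import unicodedata

def scripts_in(label):
    """Approximate the scripts used in a label from the first word of each
    letter's Unicode name (for example LATIN, CYRILLIC, GREEK)."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(hostname):
    """True if any dot-separated label mixes letters from more than one script."""
    return any(len(scripts_in(label)) > 1 for label in hostname.split("."))

print(is_mixed_script("www.google.com"))             # every label is pure Latin
print(is_mixed_script("www.g\u043e\u043egle.com"))   # two Cyrillic o's inside a Latin label
```

A check like this does nothing against a label written entirely in one script that happens to resemble a Latin word, which is one reason it can only be part of the answer.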
      • That's something to think about, even more so if you can mix character sets in domain registrations. Don't the URLs below all seem the same?

        http://www.google.com/ [google.com]
        http://www.google.com/ [google.com]
        http://www.g/#1086;&%231086;gle.com/ [www.g]

        Cyrillic and latin alphabets have a few letters that overlap:

        a and
        c and
        e and
        H and
        k and
        m and (ok, almost, but upper-case still goes: M and )
        n and (kinda)
        o and
        p and
        T and
        x and

        I hope they take this into account when bringing other character encodings into DNS.
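
One way a registry or client could take such overlaps into account is to reduce every name to a "skeleton" in which lookalike letters are replaced by one representative, and then compare skeletons. The sketch below is only an illustration: the table is deliberately tiny, and the Cyrillic code points in it are my assumption about which letters are meant, since the Cyrillic half of the list above was lost to the page encoding.

```python
# Hypothetical, incomplete table: Cyrillic lowercase letters mapped to the
# Latin letters they resemble in many fonts.
LOOKALIKES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
}

def skeleton(name):
    """Lowercase the name and replace each known lookalike with its Latin twin."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in name.lower())

def looks_like(a, b):
    """True if two different names collapse to the same skeleton."""
    return a != b and skeleton(a) == skeleton(b)

print(looks_like("google.com", "g\u043e\u043egle.com"))  # Cyrillic o's vs Latin o's
print(looks_like("google.com", "goggle.com"))            # genuinely different names
```

A registry could refuse a new name whose skeleton matches an existing registration; a real table would need to cover far more than these seven letters.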
        • And how nice... just noticed slashdot is ISO8859-1 encoded, so my previous post won't display correctly.
          Hey, Slashdot, why not use UTF-8?? Being a (mostly) English site wouldn't be a problem, since US-ASCII and UTF-8 overlap nicely.
    • How about putting a big hammer and sickle soviet flag icon next to the URL if the url is encoded in Cyrillic. :)
  • by athloi (1075845) on Thursday January 03, 2008 @02:23PM (#21897960) Homepage Journal
    It's a smart move. Russia has already demonstrated that it wants to be a superpower again, which means that its main competition is China and the USA.

    It has to keep up with China's level of control, and not leave the internet in the hands of the USA, if it can.

    Again Putin demonstrates a smart interpretation of Machiavellian Realpolitik while no one else yet realizes the Cold War is back on.
    • by dusanv (256645) on Thursday January 03, 2008 @03:35PM (#21899208)
      Or maybe, just maybe, they only want Cyrillic characters in URLs. ASCII isn't suitable for the majority of the world so brace yourself for more of this in future.

      The article is loaded with bs like this brownish pearl:
      Kleinwachter says the speculation is that people will need a password authorised by government agencies to use the global internet.

      How the fsck did he deduce that from introduction of Cyrillic DNS?

      • Sorry, but the grand parent probably is correct here. The US dominance of the information highway "Internet" is probably not looked favorably upon by the governments mentioned.

        You don't have to be very bright to see that Cyrillic and Chinese are perfectly legitimate reasons for acquiring their own DNS systems, and that they seem prepared to use those reasons. Despite the trouble it sadly WILL create, both in the short and long run.
    • They've topped Saudi Arabia the past couple of years. Saudi has more reserves but not the incentive to greatly increase production. Both are raking it in.
  • Icons for Victory (Score:4, Interesting)

    by Doc Ruby (173196) on Thursday January 03, 2008 @02:38PM (#21898196) Homepage Journal
    I'd like the URLs in my GUIs to be displayed in their frame with an icon indicating their character set, and colored if in a character set different from my GUI default. If I had that, I'd like to see "native" glyphs without fear that they're decoys. Even though such a system would no longer force most content publishers to deliver content in my own privileged native character set.
  • internet walls (Score:3, Insightful)

    by pembo13 (770295) on Thursday January 03, 2008 @02:44PM (#21898280) Homepage

    Hindering Paraguayan hackers may matter less to the Russian government than establishing greater control over a walled-off Internet.

    I don't really have a problem with governments filtering the internet of their own citizens -- let their citizens deal with that. What I don't like is when a government wants to control/monitor the internet usage of other citizens.

  • Trouble ahead? (Score:3, Interesting)

    by Duncan Blackthorne (1095849) on Thursday January 03, 2008 @02:50PM (#21898390)
    I may not be looking at the whole picture here, but isn't this sort of decision going to have a tower-of-babel-like effect? Are search engines going to be able to index sites using the alternative character sets? Isn't there at least some risk of two different sites at least appearing to have identical URLs? Or is this really an attempt by countries like Russia and China to selectively cut their populations off from the public internet while not in actuality doing so? Don't get me wrong, I'm not saying that American English should be imposed on the rest of the world (I'm not that guy!), but the system in place was founded on such and I see this really mucking up the works..
  • by Quiet_Desperation (858215) on Thursday January 03, 2008 @02:58PM (#21898564)
    I'm registering my next domain in Klingon.

  • by DaleGlass (1068434) on Thursday January 03, 2008 @04:08PM (#21899716) Homepage
    If the domain name contains characters not from the system's character set, highlight them (with another color say), and warn the user.

    It's not a new problem either, "slashdot", and "sIashdot" will look the same in many fonts.
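
The highlighting suggestion could look something like the sketch below, which treats plain ASCII as "the system's character set" purely as an assumption and marks everything else with its code point. The function name and the bracket notation are invented for this example.

```python
def highlight_foreign(hostname, is_native=str.isascii):
    """Replace every character that fails is_native with a visible [U+XXXX] marker."""
    marked = []
    for ch in hostname:
        marked.append(ch if is_native(ch) else f"[U+{ord(ch):04X}]")
    return "".join(marked)

print(highlight_foreign("www.myb\u0430nk.com"))   # the Cyrillic a becomes [U+0430]
print(highlight_foreign("sIashdot.org"))          # all ASCII, so nothing is marked
```

As the comment above points out, this does nothing for lookalikes inside one character set: a capital I standing in for a lowercase l passes untouched.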
