A World of Languages on the Internet

By Friendless in Internet
Tue Dec 12, 2000 at 04:35:27 AM EST
I've been thinking about the Internet and all the non-English speakers on it. What web sites do they go to, and what languages do they read the web in? I didn't know of any such statistics (except for some which the producer wanted about $1500 for), so I did my own. The results were unexpected.

First, my research methodology. I went to Nua Internet How Many Online? to find out what countries the people on-line were from. The top 12 were USA (148 million), Japan (27 million), UK (19.5 million), Germany (18 million), China (16.9 million), South Korea (15.3 million), Canada (13.2 million), Italy (11.6 million), Russia (9.2 million), France (9 million), Brazil (8.6 million), Australia (7.8 million). Already I was surprised - I didn't expect quite so many from Asian countries with difficult character sets.

Next, I went to Ethnologue to see what languages were spoken in those countries. As this is an amazingly complex question, and the Ethnologue data is not easily downloadable into a spreadsheet, I had to make some gross generalisations and approximations, e.g. all UnitedStatesOfAmericans read the Internet in English. I was also distressingly ignorant about many countries. For example, Ethnologue lists the primary languages of Italy as being Italian (55%), Lombard (15%), Neapolitan (12%), Sicilian (8%), and so on. I don't know whether these are only spoken languages, or have widely used written forms as well. For all I know, Sicilians are taught to read and write Italian at school, and most written communication in Italy is in Italian. Nevertheless, I soldiered on.

I did at least some extra research, and discovered that although there are 7 main languages used in China (including Mandarin, Cantonese, Wu, Xiang, Jinyu and Min Nan), the written form was only invented this (20th) century, and is based on Mandarin with Wu influence. So, I can presume that Chinese internet readers read the web in Mandarin. Similarly, there are many Arabic dialects, but the written form is pretty much the same as in the Koran. So I presumed Arabic internet users all use the same form of Arabic. Other countries which were particularly difficult to categorise were South Africa (22% Zulu, 18% Xhosa, 15% Afrikaans), where I still suspect most internet usage is amongst the English and Afrikaans speakers; India (no more than 10% anything); and countries with small internet populations like Nigeria, Morocco, United Arab Emirates, Iran, Thailand and the Philippines, where I had never even heard of many of the languages.

Finally, I combined the data on internet population and spoken languages, to form the combined approximate numbers of internet users by language. No doubt the numbers are wrong, but I at least hope the order is right. The numbers of readers of various languages on the internet are (in thousands): English (184053), Japanese (27000), Mandarin (22954), German (20600), Korean (15300), French (13176), Russian (9524), Portuguese (9300), Spanish (8893), Dutch (7428), Italian (6452), Swedish (4185), Polish (2804), Cantonese (2733), Finnish (2425), Danish (2300), Norwegian (2200), Lombard (1860), Turkish (1818), Swiss German (1536), Neapolitan (1392), Greek (1300), Arabic (1285). After that, the numbers become even more dubious. I find the prominence of Korean most surprising. I expected a lot of European languages, but the Asian languages are right up there.
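The combination step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the spreadsheet actually used: the function name and data layout are made up for the example, the Italian user count and language shares are the figures quoted in the article, and treating every Japanese user as a reader of Japanese is an assumed simplification of the same kind the article makes for the USA. Because the sketch covers only two countries, its totals will not match the full list above wherever a language is spoken in more than one country.

```python
from collections import defaultdict

# Internet users per country, in thousands (figures quoted in the article).
USERS_BY_COUNTRY = {
    "Italy": 11600,
    "Japan": 27000,
}

# Fraction of each country's population assigned to each language.
# Italy: Ethnologue shares quoted in the article (they do not sum to 1).
# Japan: assumed 100% Japanese, a deliberate gross simplification.
LANGUAGE_SHARES = {
    "Italy": {"Italian": 0.55, "Lombard": 0.15, "Neapolitan": 0.12, "Sicilian": 0.08},
    "Japan": {"Japanese": 1.0},
}


def users_by_language(users_by_country, language_shares):
    """Return (language, users in thousands) pairs, largest first."""
    totals = defaultdict(float)
    for country, users in users_by_country.items():
        # Countries with no language data simply contribute nothing.
        for language, share in language_shares.get(country, {}).items():
            totals[language] += users * share
    ranked = ((language, round(count)) for language, count in totals.items())
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for language, thousands in users_by_language(USERS_BY_COUNTRY, LANGUAGE_SHARES):
        print(f"{language}: {thousands}")
```

The weak point is exactly the one raised in the comments below: the shares describe what people speak, not what they read on the web, so the ranking is more trustworthy than any individual number.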

So what does this mean to you? Well, not much, if you are a reader of the Internet; but it's probably helpful if you are writing for an internet audience. The use that particularly interests me is internationalisation of open-source software. If I am to maximise the circulation of my software, of course I will write it in English, but then I will try to get it translated into as many of these languages as possible. With the prominence of Asian languages in my audience, that's an interesting problem.

I hope this article has made you think about the Internet as a global phenomenon rather than as an American thing. Given the huge number of Chinese readers on the Internet already, it could be the case that in a very few years, the predominant language on the Internet is Mandarin. That's something to think about, and potentially something to make a career of. Please post comments telling me of any errors in my analysis.


To point to the use of non-English languages.. (3.20 / 5) (#3)
by mystic on Mon Dec 11, 2000 at 11:21:08 PM EST

being used in the computer world.
  • NJStar - NJStar Communicator is designed to enable you to view, input and convert Chinese, Japanese and Korean (CJK) characters on normal English or Western windows. It also works equally well on Chinese, Japanese and Korean Windows.
  • Indian Linux Project - The goal of this project is to create a Linux distribution that supports Indian Languages from a GUI/Application level as well as Kernel level.

Interesting idea, but... (3.50 / 4) (#6)
by pb on Tue Dec 12, 2000 at 01:07:56 AM EST

I think you'd have better luck with a search engine.

If you know (a) how many pages there are on the internet, (b) what languages they're in, and (c) how many hits they get, then maybe you could hope to tackle this one. Still, that would be a monstrous amount of data to parse.

Even if you could just see how many hits a major site gets for its local presence in each country, that would help. Or what languages people search for on Altavista, say.

But simply making assumptions on what people read based on what the main or national languages are in a country won't help you. I'm sure that a lot of people in Europe surf the web in English, just because that's a lot of what's out there, and a lot of people know English even if it isn't their native language. And I'm sure a lot of people surf the web in both their native language *and* another language, if they know it...

So basically I could suggest some directions you could take this in if you're really curious, but I don't think you can draw accurate conclusions from the data you're using now.
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall
And you're going to have to learn 'em all... (2.66 / 3) (#7)
by Sunir on Tue Dec 12, 2000 at 01:10:06 AM EST

Quoted more or less from my diary entry for 2:25pm December 5, 2000...

I kind of want to relearn French, so I'm picking a really bad news outlet as my wedge back into the language. After all, news generally writes their copy for grade 9 or younger (usually grade 6), and if it's the CBC, it's going to be almost guttural, baby French.


  • The state of the art of natural language parsing (bad, bordering on hopeless, according to my artificial intelligence prof)
  • My recent experience being "tutored" by Babelfish,
  • The big Spanish language scare recently,
  • The older French language scare (France is petrified),
  • The even older Québec situation and Bill 101, and
  • The reality that (business) people are incredibly more worldly by necessity,

I still have to stick to my belief that we're all just going to have to learn more languages. In the next fifty years, I don't think it'll be surprising to find that most people (children?) in North America know more than two languages, Chinese (of some variety) being one of them.

I'd like to see the Fish handle Chinese and Japanese. A lot of content is just unavailable to me. Even then, it will still be bad enough to force people to learn other languages.

Besides, any good yuppie socialist will have to be able to order from a variety of restaurants in the correct language.

"Look! You're free! Go, and be free!" and everyone hated it for that. --r

Italian (3.25 / 4) (#8)
by molo on Tue Dec 12, 2000 at 01:13:54 AM EST

You made a good guess about Italian and its dialects. Sicilian, Neapolitan, etc. are mainly spoken dialects. There are different (and often conflicting) vocabularies, but when one travels outside of their native region or speaks with someone from another region, the dialect is dropped. The dialects are widely varying, with individual towns and villages often having their own twist to pronunciations or vocabulary.

Furthermore, if a southerner speaks in his dialect to a northerner (or vice versa) they will most likely not understand each other. Common Italian is taught in the schools for this purpose. Official documents and communications do not use dialects. Of the literate Italians, I believe that they all use the same form of written language.

Whenever you walk by a computer and see someone using pico, be kind. Pause for a second and remind yourself that: "There, but for the grace of God, go I." -- Harley Hahn

A slightly different view (2.33 / 3) (#10)
by Wah on Tue Dec 12, 2000 at 02:23:15 AM EST

and a much less detailed analysis.

i get a bit of traffic from that link to my homepage right up there. i run a little analyzer on it. This is pretty much what I've seen.

Most of this traffic has come from discussion forums like this one. For those of you that don't want to click, the breakdown by domains identified as not U.S. are: Italy (.it) 3952 visits 6.6% , Germany (.de) 1335 2.2% , France (.fr) 997 1.7% , United Kingdom (.uk) 930 1.6%, then the Netherlands, Australia, Japan, and Canada in quick succession.
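For anyone wanting to produce a similar breakdown, here is a minimal sketch of counting visits by top-level domain. It is not the analyzer mentioned above: the function name and the sample hostnames are hypothetical, and it assumes the visitor addresses have already been reverse-resolved to hostnames. Addresses that never resolved (bare IPs) are lumped together, and generic domains such as .com or .net say nothing about country.

```python
from collections import Counter


def visits_by_tld(hostnames):
    """Return (tld, visits, percent of all visits) tuples, busiest first."""
    counts = Counter()
    for host in hostnames:
        host = host.strip().lower().rstrip(".")
        last_label = host.rsplit(".", 1)[-1]
        # A numeric or empty last label means the address never resolved.
        if not last_label or last_label.isdigit():
            counts["unresolved"] += 1
        else:
            counts[last_label] += 1
    total = sum(counts.values())
    return [(tld, n, round(100 * n / total, 1)) for tld, n in counts.most_common()]


if __name__ == "__main__":
    # Hypothetical sample data, one entry per visit.
    sample = ["dialup1.example.it", "proxy.example.it", "host.example.de", "192.0.2.7"]
    for tld, visits, percent in visits_by_tld(sample):
        print(f".{tld}: {visits} visits ({percent}%)")
```

A country-code domain only shows where the visitor's provider registered its hostnames, so this measures something even rougher than country of residence, let alone language.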

All in all, people from over 100 different country domains have visited my site. This is a small personal site, hosted from my apartment. Quick test, name 100 countries off the top of your head. Keeping this global perspective in mind is pretty much impossible, but it's fun to try.
Fail to Obey?

100 countries (3.00 / 1) (#19)
by Moebius on Tue Dec 12, 2000 at 10:39:44 AM EST

Lessee... (no cheating) USA South Africa Egypt Chad Sudan Moracco Zaire Cote D'Ivore Botswana Chile Brazil Argentina Uruquay Venezeula Bolivia Columbia Peru French Guiana Mexico Belize Nicaragua Costa Rico Cuba Dominican Republic Trinidad Dominica France Nauru Spain Portugal Liechestein (sp??) Monaco Belgium The Netherlands Denmark Sweden Norway Romania Yugoslavia Greece Albania The Czech Republic Russia Ukraine Turkmenistan Estonia Latvia Mongolia China India Pakistan Iraq Iran Saudi Arabia Israel Jordon Lebenon Kuwait East Timor Yemen United Arab Emirates Japan North Korea South Korea Cambodia Austrailia Vietnam Laos Indonesia Germany Austria Switzerland Italy UK Ireland Panama Paraguay Poland Libya Nigeria Rwanda Ethiopia Madagasgar (sp?) Greenland New Zealand Uzbekistan Cyprus Tibet Nepal Thailand Taiwan Botswana The Phillipines Malaysia Singapore Slovakia Mozambique Zimbabwe Croatia Montenegro Phew, that's hard. I'm sure I left out some embarrassingly easy ones. Sorry to any k5er whose major industrialized nation is omitted. Feel free to skewer me on the dubious ones.

[ Parent ]
*ouch* sorry Linus (2.00 / 1) (#20)
by Moebius on Tue Dec 12, 2000 at 10:44:13 AM EST

Can't believe I forgot Finland... Guess that means I have to delete my account now. Goodbye.

[ Parent ]
major ones (3.00 / 1) (#22)
by mikpos on Tue Dec 12, 2000 at 04:11:23 PM EST

Yes, this is a silly post, but you asked for it :)

Some major ones (besides Finland) you missed are Iceland (which is especially odd since you included Greenland) and Jamaica (actually you seemed to miss out on a lot of the Caribbean). I don't know if those are "major", though.

I'm a bit fuzzy about the UK; for some reason, I thought (though I'm probably wrong) that the UK was actually made up of three separate countries: Scotland, Northern Ireland and England and Wales. I have no idea where the concept of Great Britain fits in there either :\. It's a mess whatever it is. :)

[ Parent ]

UK (2.00 / 1) (#23)
by Moebius on Tue Dec 12, 2000 at 05:22:39 PM EST

Yeah, I've always been fuzzy about the UK as well. I just kind of went by the country code TLDs... "England" "Great Britain" "United Kingdom" - ah well, one of these days I'll figure it out...

[ Parent ]
Great Britain - the story (none / 0) (#37)
by Foul_Irony on Wed Dec 13, 2000 at 05:30:30 PM EST

I am amazed by the lack of knowledge on this site of things outside the US, but this is not the point of this posting!

The UK is England and Wales (Wales being politically part of England since 1282; it now has its own political assembly), Scotland, Northern Ireland, Isle of Man, and Jersey.

Great Britain is the island, Little Britain being the name of the Isle of Eire around 1400 or so.

And in case anyone is wondering, the Commonwealth are all those countries that were part of the Empire of the 1900's - although Australia left and South Africa were kicked out.

I hope this helps you realise that there is a world the other side of the East Coast!

[ Parent ]
clarification (none / 0) (#39)
by mikpos on Wed Dec 13, 2000 at 06:29:20 PM EST

Okay so Great Britain (like Little Britain) is just an island, e.g. it's just a geographical area, not a political one? And the United Kingdom is similar? It is just a collection of a bunch of separate nations? e.g. here in Canada we have our own Head of Government, but the Queen is our Head of State; is that the way it is for countries of the UK, e.g. Scotland has its own Head of Government?

On a completely different tangent, how come the Commonwealth countries don't get elite royal family members assigned to them like some areas in the UK do? Wales gets its own prince; York gets a duke. Where's Canada's duke dammit?! Surely there must be enough royals to go around. I'd even settle for one of those York daughters or something. I'm sure they could easily take over the duties of our Governor-General. I'll have to start up a petition I guess. I want a duke/duchess, dammit, or an earl/countess at the least.

[ Parent ]

um .. not a royalist, but here's what I know .. (none / 0) (#44)
by Foul_Irony on Thu Dec 14, 2000 at 10:40:38 AM EST

The royals have multiple roles, although they are cutting down a little as costs are high.

Most of the top royals have more than one title, e.g. Prince Charles is the Prince of Wales and holds the Duchy of Cornwall, but generally they don't take up titles outside the UK. Normally the various royals are given titles once they turn 18 or so.

You can buy yourself a lordship of the manor if you want, which will give you a title, which you can use, but is not recognised by the monarchy. The going rate is about $1500 for the lordship of a small little town in the back of beyond.

As for the UK: the UK is a political collective. It has the Queen as head of state, and it has a prime minister who governs the various countries and areas of the UK. Scotland has a First Minister, which is as close as you'll get to a prime minister - he gets to play with tax but not tanks!

Northern Ireland has an assembly, as does Wales; both are allowed to alter minor laws and allocation of funds, but they have no real power. The Isle of Man and Jersey are basically countries in their own right, but do not have military powers.

[ Parent ]
question (none / 0) (#45)
by kubalaa on Thu Dec 14, 2000 at 05:23:15 PM EST

I'm incredibly uninformed but I also can't imagine what it means to distinguish between countries with a single shared government. Do the various countries of the UK have their own parliaments? If they're still controlled by the PM, what's the point of calling them "countries"; why not states or provinces or whatnot?

[ Parent ]
It's from the old world (none / 0) (#46)
by Foul_Irony on Thu Dec 14, 2000 at 05:49:16 PM EST

The UK is an old country and as such has lots of tradition. To all intents and purposes it's been around in its present form since the Roman Empire. One of the things that happens with history is that politics comes and goes but the people and their culture are always a part of the land. Scots have always lived in Scotland, the Welsh have always lived in Wales, and the English .. well, they are actually the French who could swim about 1000 years ago and invaded!!!

Scotland is, to the layman, a country in its own right, but because of its common connections with the rest of Great Britain, it would be impractical to introduce further measures to make it a fully independent country, such as borders, armed forces etc.

Wales on the other hand has the will to be a country but doesn't really have the infrastructure to work successfully.

The point of my post is that it's only politics that makes states and provinces, whereas it takes history to make countries. e.g. The Romans defined the borders of Scotland and Wales with walls, but the people who lived on the borders knew if they were Scottish, Welsh or Saxons and they will always be that, never mind whatever the politicians tell them. (Northern Ireland is about this issue, but please don't even think about asking me about that subject, it is much too delicate!)

I guess the nearest way of defining it in an American way would be the way people have feelings for the city they live in, and even though you have certain laws in that area, Washington tells you how it is in a world context.

I hope that helps you understand European Politics a bit better!!

[ Parent ]
Make that 98 or so (4.00 / 1) (#24)
by Friendless on Tue Dec 12, 2000 at 07:09:23 PM EST

Greenland is a part of Denmark, so is definitely not a country. East Timor is planning to be a country, but is at the moment administered by the United Nations. Tibet claims to be independent, but has been occupied by China since the 60s. Taiwan claims to be independent, and acts that way, but China won't admit it. I'm not sure what the current state of Montenegro is, I thought it was still attached to Serbia as Yugoslavia. You didn't mention Canada at all, as far as I can see, which is only embarrassing if you consider that it is right next to the U.S. and is the second largest in the world :-). The correct spellings are Liechtenstein and Madagascar. Overall, a very good effort! (BTW, by most standards, you would be allowed to break the UK into England, Scotland and Wales, at least.)

[ Parent ]
Canada (none / 0) (#32)
by Moebius on Wed Dec 13, 2000 at 06:33:23 AM EST

Wow, I did miss Canada... I thought I had it early on (doh). Um, sorry Canadians. Oh well, I don't see anyone else trying :P

[ Parent ]
MLP (4.00 / 3) (#11)
by vastor on Tue Dec 12, 2000 at 02:36:48 AM EST

Language Resources has links to some sites that discuss these very issues.

Languages of the World discusses languages in general, though once again its figures are based on primary language rather than taking into account secondary languages.

Going by my personal experience, German is probably the most prominent non-English language I've come across on the internet, except when looking for drivers etc., and then I've often become stuck at Taiwanese/Japanese sites.

Maybe in a hundred years a language might become predominant, but certainly for the next ten or twenty I think the internet will probably be largely language-divided. Mandarin, English and Spanish would make for a good three suspects for the future - being fluent in a couple of languages isn't -that- impossible (and is normal in one or two countries), so I think we might well find that two or three will end up predominant on the internet.

English certainly has starters language and benefits from the fact that English speakers probably have a higher per capita income across the board than any other major language. So it's probably going to be fairly safe in its position for a while to come and the others will just creep up around it.

spoken and written chinese are different languages (3.66 / 3) (#12)
by TuxNugget on Tue Dec 12, 2000 at 03:21:09 AM EST

Mandarin, Cantonese, and Japanese(!) share the same Chinese characters, but the spoken languages are distinct. Thus, you have a situation where two people might not be able to communicate verbally, but can communicate in writing.

Unlike western languages, these characters are not phonetic. They are much more like hieroglyphs, in that a character is a word.

Japanese kanji script, I believe, involves phonetic symbols. That is the exception.

Disclaimer: I am a white guy.

Egyptian hieroglyphs are partly phonetic (none / 0) (#14)
by Paul Johnson on Tue Dec 12, 2000 at 05:44:17 AM EST

Actually hieroglyphs are partly a phonetic alphabet and partly an ideographic one. See here for details.

Paul, aka [Streetplan, Eagle, Chick, Lion].
You are lost in a twisty maze of little standards, all different.
[ Parent ]

yes, but (none / 0) (#15)
by boxed on Tue Dec 12, 2000 at 06:49:40 AM EST

They were phonetic only to a certain degree and they were also only phonetic when they were created. After several hundreds of years the phonetic hieroglyphs no longer coincided with the spoken language.

[ Parent ]
Not exactly correct... (none / 0) (#42)
by Ogantai on Thu Dec 14, 2000 at 03:56:20 AM EST

Japanese and Chinese may use the same characters, but all but the most simple have subtly different meanings. Japanese also has 2 sets of phonetic symbols, one of them for approximating the pronunciation of foreign characters, and another for denoting grammar (i.e. subject and object markers, possession, location, etc., much like Korean has (the markers I mean, Korean always uses a phonetic alphabet (and I do mean alphabet, the Japanese phonetic sets are technically a syllabary, not an alphabet))).

There are also 2 forms of written Chinese: Traditional and Simplified. Mandarin (pu tong hwa) is spoken in mainland China and uses simplified characters. On Taiwan, Mandarin is also spoken, but traditional Chinese is used (there's also a language called Taiwanese, but that's different). In Hong Kong, they speak a different language called Cantonese (yueh), which is mutually unintelligible in its spoken form with the 6 other Chinese languages, but is written in the same manner, using traditional characters.

Disclaimer: I'm also a white guy, but I speak Vietnamese and Korean

[ Parent ]
One globe, one language (3.57 / 7) (#13)
by Beorn on Tue Dec 12, 2000 at 04:53:30 AM EST

In my honest opinion, non-english readers should be ignored on the internet, even if you're writing for a specific national audience, (ie the local newspaper). If it's worth publishing on the web at all, it's worth publishing in a language many people know, and right now that language is english.

I have no particular wish to preserve my native language norwegian for posterity, and mainland european governments should be ashamed for trying to encourage nationally limited content and art. It's a tragedy that only scandinavians will ever read Henrik Ibsen or watch Ingmar Bergman untranslated, and the sooner we switch to a global language, the better.

- Beorn

[ Threepwood '01 ]

And what exactly is English? (3.00 / 2) (#16)
by Foul_Irony on Tue Dec 12, 2000 at 08:55:37 AM EST

Comments like this annoy me!!

I am amazed at how bad other nationals' understanding of English actually is.

I am a resident of the UK and learnt English as a third language behind the languages of my parents (Welsh and Danish).

It seems to me that people who can understand an American Blockbuster film or can sing the lyrics of a Madonna song think they understand the language of English. Americans can't even spell in English!!

English as a language for communication should be treated as just a way of saying hello and pass generalist pieces of information - and not as a replacement for native languages.

This is best demonstrated with the use of Irony, Sarcasm and dry wit in the UK that just isn't understood outside the country. English is our native language and as so cannot be understood by people outside the country - Remember that English spoken in other countries is based on English as it was introduced in those countries - up to 200 years ago.

As for using English for local area subjects - possibly the most arrogant attitude I can imagine. Let me ask this question: Do you really think that a nation should publish its documents in a language foreign to itself, just so others can read the documents? Why? If you are that bothered to find out the contents, learn the language, or use technology and use a bable (or perhaps a babble!) fish!

The other major point about language is that each language and dialect has terms and words that describe things in a form that is more delicate and precise than other forms of the language would allow. For instance, the word "wee". I can think of two meanings to this word, can you?

Anyway, this might not be the best area to discuss language - but I look forward to reading the replies!

[ Parent ]
Irrelevant (5.00 / 2) (#18)
by Beorn on Tue Dec 12, 2000 at 10:23:10 AM EST

Let me ask this question: Do you really think that a nation should publish its documents in a language foreign to itself, just so others can read the documents?

If all intended readers know the global language, yes. A norwegian interested in american politics has hundreds of newspapers and magazines available. An american interested in norwegian politics has nothing. For the benefit of everyone else, nothing sensible or beautiful should ever be written in norwegian.

For instance, the word "wee". I can think of two meanings to this word, can you?

How is this relevant? I know what wee means, but I don't see your point. Language evolves on its own, words are invented, imported and forgotten. Whether we use oxford english, global english or chinese is irrelevant, as long as everyone understands it. TV and the net is making this happen.

(And british humour not understood? It's world-famous!)

- Beorn

[ Threepwood '01 ]
[ Parent ]

one world language?? (2.00 / 1) (#21)
by Foul_Irony on Tue Dec 12, 2000 at 12:41:47 PM EST

I feel the need to reply, even though I really don't think I will manage to get my point across ... although, this possibly demonstrates my point perfectly.

What if a Frenchman wanted to know about Norwegian Politics? Should the frenchman have to learn English in order to learn about Norwegian politics? Why not just learn Norwegian and cut out the middle man?

I chose Wee because it was an interesting little word, meaning
3:to urinate,
4:a sound made by a pig
as well as being a TLA for a few different organisations.

My point was that the meaning of the word can very easily be misunderstood, depending on its context, and the safest way of ensuring total understanding would be to teach everyone to the same level - or stick to our own languages and let technology take the strain.

As for British comedy, the likes of Benny Hill, Fawlty Towers and Monty Python, although funny, do not contain the humour I was discussing in my earlier reply.

Anyway, I hope you at least understand my objection - Using any one language as a world language just won't work, and using Norwegian as your basis for its benefits is just cheating!

[ Parent ]
Impossible? (none / 0) (#28)
by Beorn on Wed Dec 13, 2000 at 03:08:53 AM EST

What if a Frenchman wanted to know about Norwegian Politics? Should the frenchman have to learn English in order to learn about Norwegian politics? Why not just learn Norwegian and cut out the middle man?

I shouldn't have to point out that more and more french speak english, and that if not only norwegians but germans, belgians and russians switched to english, you would only have to learn one language to access the entire output of four different nations.

My point was that the meaning of the word can very easily be misunderstood, depending on its context, and the safest way of ensuring total understanding would be to teach everyone to the same level - or stick to our own languages and let technology take the strain.

So? The choice of english as a global language has already been made, whether you think it's a good idea or not. What's left is for other european countries to take the consequences of this. It's a simple question of data format compatibility.

Anyway, I hope you at least understand my objection - Using any one language as a world language just won't work, and using Norwegian as your basis for its benefits is just cheating!

I understand your point now even less than before. You're doubting the practicality of using english as a world language, with the entire world around you proving you wrong. And you haven't addressed my argument that it is good for everyone to understand everything that is written.

- Beorn

[ Threepwood '01 ]
[ Parent ]

This is proving my point better than ever..! (none / 0) (#36)
by Foul_Irony on Wed Dec 13, 2000 at 04:47:16 PM EST

Having read your reply, it seems obvious to me that I have proved my point! Due to your inability to read between the lines, because you are only using English to communicate facts rather than feelings, you cannot see it.

As far as I understand .. English is the 2nd most popular language on Earth, behind Chinese (whatever flavour!). Statistics will show you that more people don't speak English than do; and most of those who think they do can't do more than order a taxi and ask for directions to the local hippo! Surveys have shown that most non-English speakers' understanding of English is worse than that of a 6 year old English child. Is this the answer to the world communication problem?

As for the internet being a written form of English: we have the problem of context, spelling and punctuation, as well as all the limitations text has in describing certain issues.

I think my main gripe is not the fact that we are discussing the choice of a universal language, but the fact that you are dismissing every other language just because you don't see it very often on the pages you visit. In 1995 I wrote a Welsh internet page in a town whose inhabitants had spoken Welsh for at least 1500 years. I don't particularly want to throw 1500 years of culture away, just so a monoglot can read my jottings.

Here is a techie analogy for you: From tomorrow, if you want to use the internet, you must only use a Macintosh, but, even if you don't want to use the internet, you have to use a Mac, just in case somebody else you know, or might meet one day, wants to use your computer to use the internet. Obviously, I don't expect you to be pedantic enough to point out the flaws in my analogy, but it's a simple description of how I feel about the subject.

[ Parent ]
translators (3.50 / 2) (#35)
by mikpos on Wed Dec 13, 2000 at 03:11:46 PM EST

Would it not be much simpler to use a translation to get at what you want? Mind you this has the obvious disadvantage that if you want something translated, either (a) you have to be rich (enough to hire a translator); (b) it must be popular enough that someone else wanted it translated; or (c) you have to learn the language yourself to translate it :).

Probably the vast majority of cases would fall under (b). This isn't a huge problem as I see it, though. If I take film for an example, foreign ("foreign" being non-North American, as I'm in Canada) films are somewhat hard to come by, especially in the theatres, but this is just because of the economics of the theatre industry, not because of any language barrier. There are few European or Asian (or South American or African or Antarctican for that matter) films shown in the theatres here, but there are also few British films shown here as well. In terms of the costs involved for a Canadian such as myself to see a Norwegian film, the cost of adding sub-titles is just a drop in the bucket, so I don't think a Norwegian film being shot in English would add anything substantial to its availability.

You seem to be dismissing outright, though, the benefits of having things done in a language other than English. Perhaps this is because European languages, and Norwegian (well, actually Danish, I guess, but close enough) especially, are so closely related to English. I know some French and Swedish and know that for those two languages anyway (which I'm extrapolating to most European languages), the differences between those and English are basically vocabulary. There are a few grammatical quirks here and there, but nothing substantial. I think you'd be extremely hard-pressed to find something in French or Swedish that couldn't be translated directly to English (or vice versa). There are a few words that would come from Gaelic or similar, but most of English (and especially its structure) has been taken from other European conquering nations.

The fact of the matter is, there are a large number of languages (mostly those which have tried to stay outside of European influence) that are structurally quite different from English. To have to use English in their daily lives would be quite awkward, I'd think. Things meant for internal use (such as art) would similarly be awkward. If you're talking about explicitly international documents (such as those made by the ISO, which I might add has already standardised on English and French), then a global language (or languages) makes sense, but I don't know that there are any real advantages to convincing people to make regional documents in a global language.

[ Parent ]

Translation (5.00 / 1) (#38)
by Beorn on Wed Dec 13, 2000 at 05:36:45 PM EST

Would it not be much simpler to use a translation to get at what you want?

I think anyone who has read a great book both in the original language and the translation will agree that the original is usually better. How do you translate a poem without rewriting it, and who can rewrite a brilliant poem? It's very dependent on the translator.

When it comes to movies, it's even more difficult to translate because you're limited by the space for subtitles. The quality of subtitles is, in my experience, appalling. Long, colourful sentences are condensed into short, dry summaries, and often completely wrong summaries at that. So wouldn't it be better if Norwegian films (assuming they were any good) were originally written in English?

And of course, in music there is no such thing as translations, so language is everything. How many would ever have heard of Abba if they had sung in Swedish?

You seem to be dismissing outright, though, the benefits of having things done in a language other than English.

I don't know anything about language structure. I agree there are things that can only be expressed in certain languages, that the basic syntax is related to how one thinks. But I'm not really trying to force everyone to learn a completely different language, I'm encouraging those countries that already know English to use it more. Whether Asians, Arabs and Africans want to follow or select another candidate is up to them. What's important is that governments don't interfere with the natural and healthy movement towards fewer languages.

I think you'd be extremely hard-pressed to find something in French or Swedish that couldn't be translated directly to English (or vice versa).

I disagree. Language is a very powerful tool of communication, and a direct translation can never fully capture all the meanings, nuances, and conscious and subconscious associations of a sentence.

One example: The English phrase 'to make sense' has no good Norwegian equivalent. 'That makes sense' is a short, elegant, to-the-point statement. 'Det høres fornuftig ut' (that sounds reasonable) is longer, and sounds (to me) awkward and formal. There are plenty of other examples. I write very differently in Norwegian, I use different expressions and different ways to present my thoughts. I'm not even sure I could translate myself without losing something.

but I don't know that there are any real advantages to convincing people to make regional documents in a global language.

Well, there are advantages for everyone else. I would, once in a while, like to read an ordinary French small-town newspaper, unfiltered through Norwegian correspondents. I can't. I never will. But I can read any American paper I want. So in a way, the European defense against American culture is counter-productive, because it actually reduces knowledge of other European countries, giving the US and UK more cultural influence than they deserve.

- Beorn

[ Threepwood '01 ]
[ Parent ]

hmm i see (none / 0) (#40)
by mikpos on Wed Dec 13, 2000 at 07:02:43 PM EST

I see where you're going. Ya I suppose the government should back off a bit.

Anyway, I still don't see any pressing reason for a non-anglophone paper to publish in English if it's a mainly regional paper. Writing in the native language would make communication a lot easier I think, and would solve many a headache when dealing with quotations. If you're reading or writing about local French politics, you'd probably be best off learning a bit of French (especially since written French is so easy to pick up if you already know English :P).

[ Parent ]

Reality check! (`linguistic supremacy', etc) (none / 0) (#26)
by j3z_ on Tue Dec 12, 2000 at 09:04:15 PM EST

Surely the language spoken by a given group of people forms a fundamental part of their culture and identity, and to advocate that non-English material be ignored is to disenfranchise a great many people?

This is not analogous to the trouble caused by, say, proprietary HTML tags: nobody spoke HTML before about 1994, and everyone stands to benefit from well designed technical standards.

Esperanto was constructed in this spirit.

- Jeremy

[ Parent ]

Artificial preservation (none / 0) (#31)
by Beorn on Wed Dec 13, 2000 at 06:32:39 AM EST

Surely the language spoken by a given group of people forms a fundamental part of their culture and identity, and to advocate that non-English material be ignored is to disenfranchise a great many people?

Well, what I'm saying is that now is the time to fix the problem. There are countless disadvantages to multiple languages, such as valuable information and great artists being limited to a tiny portion of humanity. So the more everyone writes in English, the better for everyone.

I realize this isn't going to happen, that English will probably remain a second language in mainland Europe for some time. I just feel it would be a tragedy if European social architects succeed in encouraging non-English art. Elitist and near-sighted cultural ministers are certainly trying very hard.

As for identity, I personally feel like a western netizen of Norwegian heritage. Norway, which only existed as a separate culture from the early 19th century to late 20th century, deserves to be remembered, but not artificially preserved.

- Beorn

[ Threepwood '01 ]
[ Parent ]

Artificial homogenisation (none / 0) (#43)
by j3z_ on Thu Dec 14, 2000 at 06:19:11 AM EST

Can you allay my concern that you are saying ``now is the time to fix the problem'', and ``the sooner we switch..., the better'', in the spirit that Hitler did in the thirties?

I recognise that this is a confronting question, and perhaps too strongly worded, but to call it `elitist', `tragic' and `near-sighted' to encourage non-English-language art is to suggest that other languages contribute nothing which can't be expressed in English, and it also contradicts other statements you have made.


Regarding other comments:...

It's a simple question of data format compatibility.

As you and others have commented, human language does not separate well into content and formatting (so to speak). The question is therefore not simple.

And of course, in music there is no such thing as translations, so language is everything. How many would ever have heard of Abba if they had sung in Swedish?

Sorry? Have any non-Germans here ever heard of Bach, Mozart, Beethoven or Wagner? Have any non-French people here ever heard of Debussy? Have any non-Italians heard of Puccini or Verdi? The [vocal] music of all those composers (as well as composers from practically every other non-English-speaking country; excuse my European bias) is performed internationally far more often in its original language than in translation!

In my honest opinion, non-english readers should be ignored on the internet
But I'm not really trying to force everyone to learn a completely different language, I'm encouraging those countries that already know English to use it more.

This would appear to promote a very one-sided kind of `global' language.

[ Parent ]

Esperanto (3.00 / 1) (#33)
by dabadab on Wed Dec 13, 2000 at 07:48:18 AM EST

Yes, Esperanto is well designed and clean.
BUT: that is mainly because nobody uses it.
Every language starts out as logical and well-structured, but people have a tendency to mix things up :) and you can bet your life that it would happen to Esperanto too if it were ever given the chance.

English is OK, since it is relatively easy to use for communication, and "everybody" (everybody who is an educated member of the western culture) speaks it.

And I don't think that Chinese would be adopted as a widely spoken "common language" because

  • it is hard to learn
  • it is hard to use electronically
  • China has a long way to go before it has a really big influence on other countries

Real life is overrated.
[ Parent ]

export LANG=en_RN (3.25 / 4) (#17)
by YellowBook on Tue Dec 12, 2000 at 10:13:36 AM EST

It's interesting that you bring up the issue of "dialects". Non-standard forms of a language (i.e., those forms not spoken by a national elite) tend to get short shrift when doing localization, which is kind of sad.

RedHat used to support the en_RN locale, which provided localization for the southern U.S. (the RN stands for "redneck"). This was fun as well as extremely useful. However, RedHat dropped support for en_RN in their 6.x series, in order to cozy up to the linguistic elite who speak U.S. Network English (i.e., to appear more professional).

But I'll miss the installer program saying "firin' up CDROM". O tempora! O mores!

Languages and the Internet (4.00 / 4) (#25)
by Chakotay on Tue Dec 12, 2000 at 07:57:13 PM EST

I'm rather surprised to see Dutch so high up there. Did you count Flemish (Belgian Dutch) and Afrikaans as Dutch too? The Dutch always seem to want to assume an underdog position when it comes to languages, adapting rather than forcing others to adapt. This is part of the reason why the French language border is moving north.

Personally I browse the web preferably in British English or Dutch, but I'm definitely not scared off by American English, German or French - though I may not speak French fluently (far from it), and have a hard time understanding spoken French, I do understand written French very well. I even have all the Asian resources installed so I can see Japanese or Chinese websites in their full glory instead of getting the as:OSIoHGE:OEWHH":CD-ROM$#Yngg{)Windows98*WN{ )#(E UIF[)( Linuxtgh'o)( W# f effect :)

My personal webpage I made in English, with the original idea to also make a Dutch and possibly a German version, but that never got off the ground. Basically almost all people who visit my site can understand English, so why would I go through the trouble of adding Dutch and German? I'm currently working on a site for a local art academy though, which is going to be in Dutch, German and English.

What also kind of surprises me is the amount of Asian people on the Internet. A good friend of mine has a Japanese girlfriend, who used to work at one of the major Japanese newspapers. There, she was about the only person able to speak English, and quite literally the only person who could use a PC. She spent hours on company time writing emails to my friend because nobody understood what she was doing anyway. To use a PC, you need to know at least one Western language, preferably English. Many Japanese cell phones have fully fledged web browsers in them though, not just a mangy WAP browser, but true web browsers capable of reading HTML. In the Western world, the Internet is a big network of personal computers - in the Eastern world, it's being integrated into hand-held devices. The great majority of Japanese have never worked with an actual desktop or even laptop PC, nor ever will. Actually, when my friend's girlfriend first came to the Netherlands she was utterly surprised to find a totally graphical operating system sitting on a 19" screen on his desktop.

Anyway, the Internet culture in East and West is totally different, and it will take a long time for them to merge, if they ever will.

Linux like wigwam. No windows, no gates, Apache inside.

Dutch (none / 0) (#27)
by Friendless on Tue Dec 12, 2000 at 10:08:57 PM EST

I did not count Afrikaans and Flemish as Dutch, as I didn't know whether the written forms were mutually intelligible. Please give more details. Thanks.

[ Parent ]

Re: Dutch (5.00 / 1) (#34)
by Chakotay on Wed Dec 13, 2000 at 08:56:14 AM EST

Flemish and Dutch are essentially the same. There are subtle differences in nuance, vocabulary and grammar though. Most of those differences involve Belgians preferring one right way of doing something while the Dutch prefer another. A very commonly known example is the expression for "definitely", which in Dutch is "vast en zeker" and in Flemish "zeker en vast". Also, in Flemish some words are used that are archaic in Dutch. In Dutch there are various pronouns for "you (singular)", namely "je", "jij", "u" and "U". Flemish also still has the archaic forms "ge" and "gij", which have a nuance somewhere in between "u" and, respectively, "je" and "jij". Belgians also prefer to place verbs somewhat differently when multiple verbs are involved. So basically while both Flemish and Dutch are correct Dutch, there are definitely differences, and any Belgian or Dutch person will be able to distinguish them at first glance.

Afrikaans is a whole other can of beans. Hundreds of years ago the Netherlands and Belgium were one, and Flemish has only very recently (as in, in the last few decades) come to be recognised as a language on its own. Afrikaans however was different from the start. It evolved thousands of miles from the Netherlands, with no real contact with the main body of Dutch speakers, and it didn't evolve from standard Dutch as Flemish did, but from various Dutch farmers' dialects. After a few hundred years of evolution and outside influences from English and also to a small extent from local African languages, Afrikaans has become totally different. Most words have shifted meanings, words for new stuff like elevators and other electronic thingamajiggies are totally different, and most noticeably the grammar has completely changed. To a Dutch person, Afrikaans looks like completely mangled Dutch, but is still intelligible to a reasonable extent.

For example, I saw a box of African wine some time ago. It had a big arrow on the side with the text "hierdie kant bo", indicating "this side up". In Dutch, that would be "deze zijde boven". Mangling the English "this side up" in a similar way, you would get something like "that-there end tops". Another example that indicates the kind of semantic shift that you see a lot between Dutch and Afrikaans is the word for skin. In Dutch "the skin" is "de huid", but there is also a more ordinary word for skin, which is "het vel". In Afrikaans, it's "die vel". Very often ordinary words that are spoken much more often than written in Dutch have become the default for written language in Afrikaans.

Such a shift happens a lot during any form of colonisation though. Another nice example of such a shift is the French word for "horse", which is "cheval". The Latin word for horse is "equus", so at first glance you would say "cheval" doesn't stem from Latin - but it does. It stems from the Latin word "caballus", which was an ordinary word used only by farmers and soldiers, kinda like the English word "jade" and the Dutch word "knol", which indicate old, shabby or otherwise bad horses.

Coming right around back to topic, I would classify Flemish as Dutch, because the difference between Flemish and Dutch is basically comparable to the difference between the various forms of English spoken natively around the world, or actually even less, because there are no spelling changes, only some subtle syntactic and semantic shifts. Afrikaans I would classify as a different language. A native Dutch speaker and a native Afrikaans speaker will be able to communicate, but they will both have a pretty hard time understanding the other, which puts it basically at the same level of difference as between Danish, Swedish and Norwegian.

Linux like wigwam. No windows, no gates, Apache inside.

[ Parent ]

possible major mistake for Canada's language.. (2.66 / 3) (#29)
by tlv87 on Wed Dec 13, 2000 at 03:48:40 AM EST

Did you count Canada as 1/4 French, as it is? That would give French predominance over Korean ;-)

I'm from Quebec though and I rarely browse the web in French. It's simply because the best sites (yeah, like this one) happen to be in English. But I don't represent the majority.

Had to affirm my nationality ;-)

Canada (none / 0) (#41)
by Friendless on Wed Dec 13, 2000 at 08:17:08 PM EST

The Ethnologue entry for Canada was confusing. I counted it as 20% French and 50% English, and the rest got lost.
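
For what it's worth, here is roughly what that split works out to. This is only a minimal sketch: it assumes the Nua figure of 13.2 million Canadians online from the article, and the variable names are just made up for illustration.

```python
# Apply the 20% French / 50% English split to Canada's online population.
# Assumption: 13.2 million users online (the Nua figure from the article),
# held in thousands so the arithmetic stays in whole numbers.
CANADA_ONLINE_THOUSANDS = 13_200

# Percentages as counted from the Ethnologue entry; the rest is unassigned.
split_percent = {"French": 20, "English": 50}

users = {
    lang: CANADA_ONLINE_THOUSANDS * pct // 100
    for lang, pct in split_percent.items()
}
unassigned = CANADA_ONLINE_THOUSANDS - sum(users.values())

for lang, count in users.items():
    print(f"{lang}: {count / 1000:.2f} million")
print(f"Unassigned: {unassigned / 1000:.2f} million")
```

So under those assumptions about 2.64 million Canadian users were counted as French readers, 6.6 million as English readers, and nearly 4 million fell through the cracks.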

[ Parent ]

Language popularity on ODP (4.66 / 3) (#30)
by verylisa on Wed Dec 13, 2000 at 04:58:41 AM EST

The Open Directory lists over 300,000 non-English sites at dmoz.org/World. It might not tell us how many people are reading the web in a particular language, but it gives us a rough idea of how many people are making websites in a particular language, and we could extrapolate from there.

(Disclaimer: I am an ODP editor.)

The most popular languages by number of sites listed are:

  1. Deutsch [German] (70,778)
  2. Polska [Polish] (55,543) (except a lot of these are deeplinked encyclopaedia entries, so it's not as big as raw figures would suggest)
  3. Español [Spanish] (45,358)
  4. Français [French] (30,562) (there are lots of French sites sitting in unreviewed at the moment due to lack of active French editors, so French should probably be higher up the list)
  5. Svenska [Swedish] (23,382)
  6. Nederlands [Dutch] (17,206)
  7. Italiano [Italian] (16,802)
  8. Japanese (6,114)
  9. Korean (5,491)
  10. Català [Catalan] (4,937)
  11. Russian (4,143)
  12. Suomi [Finnish] (3,554)
  13. Dansk [Danish] (3,253)
  14. Norsk [Norwegian] (2,882)
  15. Euskara [Basque] (2,827)
  16. Chinese, Simplified (2,267)
  17. Indonesia (2,046)
  18. Chinese, Traditional (2,003)
  19. Esperanto (1,767)
  20. Românã [Romanian] (1,655)
  21. Czech (1,462)
  22. Türkçe [Turkish] (1,219)
  23. Português [Portuguese] (1,026)

Note that these figures depend partly on editor enthusiasm. Those languages which have editors actively searching for sites to add are better represented than those with only a few editors who just review submitted sites.
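
As a starting point for that kind of extrapolation, here is a small Python sketch that turns the counts above into shares. It only covers the 23 languages listed here, not everything under dmoz.org/World, so the percentages are shares of this list rather than of the whole non-English web, and the editor-enthusiasm caveat applies just as much. The variable names are mine.

```python
# Site counts per language, copied from the list above.
odp_sites = {
    "Deutsch": 70778, "Polska": 55543, "Español": 45358,
    "Français": 30562, "Svenska": 23382, "Nederlands": 17206,
    "Italiano": 16802, "Japanese": 6114, "Korean": 5491,
    "Català": 4937, "Russian": 4143, "Suomi": 3554,
    "Dansk": 3253, "Norsk": 2882, "Euskara": 2827,
    "Chinese, Simplified": 2267, "Indonesia": 2046,
    "Chinese, Traditional": 2003, "Esperanto": 1767,
    "Românã": 1655, "Czech": 1462, "Türkçe": 1219,
    "Português": 1026,
}

# Total over the listed languages only.
total = sum(odp_sites.values())

# Each language's fraction of that total.
shares = {lang: count / total for lang, count in odp_sites.items()}

# Print the five largest shares, biggest first.
top_five = sorted(shares.items(), key=lambda item: item[1], reverse=True)[:5]
for lang, share in top_five:
    print(f"{lang}: {share:.1%}")
print(f"Total listed sites: {total}")
```

The listed counts add up to 306,277 sites, of which German accounts for roughly 23%.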

A World of Languages on the Internet | 46 comments (41 topical, 5 editorial, 0 hidden)
