GEO question index
This page lists questions that people may have about Web internationalisation. This is an evergreen internal working document - i.e. it
contains text and ideas that are guaranteed not to be perfect, and it is subject to constant, ongoing change and modification. Its purpose is to
help the GEO Task Force members discuss, plan, and develop the material that is made available on the I18N site.
Questions have been harvested from several sources, including: brainstorming, queries sent to GEO members, mail to the
[email protected] list, etc. Some questions have been borrowed from Yves Savourel's FAQ
page.
Questions that have already been dealt with, partially or fully, somewhere on the I18N site have links beside them. In this respect, this
document provides a useful tool for locating information about a given topic on the site.
- What is the difference between localization, internationalization, and globalization?
- What are the benefits of adding international-friendly code?
- What are the first steps in migrating an existing HTML-based site to an international-friendly, multilingual-enabled XHTML site?
- What is an 'international' or a 'multilingual' web site? FAQ
- What are the trade-offs between international sites that are monolingual vs. multilingual? FAQ
- What are the next steps...?
- Can "alt" tags be changed based on the user's language selection?
- How can I make sure that my multilingual site is indexed appropriately by search engines?
- Do display capabilities of computers in foreign countries vary from those I'm used to? Do I need to worry about screen size, number of
colors, etc.?
- Which web authoring applications can be used to create multilingual XHTML?
- What design & editorial considerations are there for multilingual sites?
- Can I write HTML and XML element and attribute tag names in languages and scripts other than English? FAQ
- When I'm using UTF-8 in some user agents, why do an extra line and sometimes unwanted characters appear at the top of my web page, and how
do I remove them? FAQ
- Why does my browser collapse spaces between Latin and Arabic/Hebrew text? FAQ
- What is a charset?
- What is a character encoding?
- Which encoding does country x use?
- Can I use UTF-16 in HTML web pages?
- How do I specify the encoding of an HTML, XHTML, XML or CSS document? techniques article
- Which encoding declaration takes precedence for an HTML page, server or client?
- Where can I find the charset names? techniques
- How do I set up my server to serve the right encoding for a page? article
- How do I set character encoding in my web authoring applications? FAQ
- Can I encode XML and X/HTML in non-Unicode encodings, if the document character set is Unicode? FAQ
- How much support exists today for HTML written in Unicode? FAQ
- If I use a Unicode encoding for a page, is UTF-8 or UTF-16 best?
- How do I handle characters that are not supported by the encoding used by my page?
- What's the difference between Numeric Character References (NCRs) and character entities, and which should I use in XML and (X)HTML?
- How do I declare the character encoding of a CSS style sheet, and do I have to? FAQ
- What is the default encoding of an XML document?
- What does 'document encoding' mean? Can't I use any encoding I want? FAQ
- How do I remove the UTF-8 BOM (byte order mark) from my source? FAQ
- How can the UTF-8 BOM (byte order mark) screw up my CSS stylesheet? And what can I do about it? FAQ
- Why do I need to normalize text on the Web?
- How can I check the character encoding information sent in the HTTP header of a web document? FAQ
- How can I check that the character encoding of my document is correct using the W3C HTML Validator? FAQ
- How do I handle control codes (i.e. the 'C0' U+0000-U+001F and 'C1' U+007F-U+009F ranges) in XML, XHTML and HTML? FAQ
- How do I choose fonts for my CSS styling that will be recognised on all platforms?
- What's the best way to deal with characters from other scripts that I don't think people will have the fonts or rendering algorithms to
display properly?
- Where do I find fonts and rendering support for multilingual scripts?
- How do I deal with missing characters and glyphs? article
- Should I mark up language information for foreign loan words, or romanized text in content written in a non-Latin script?
- Do I need to use the xml language declaration for XHTML pages? And if so, do I need to specify language elsewhere?
- Is the hreflang attribute used?
- Should I use RFC1766 for HTML and XML, or the more recent RFC3066?
- Does en default to en-US, i.e. US English? What about other languages?
- Can I use Unicode Language Tags in XML?
- How do I use the lang() function in XPath?
- How do I use the lang() selector in CSS?
- I diligently mark HTML text with the appropriate lang language identifier. The text doesn't display any differently. How do I test or
evaluate that all the text is marked correctly? In what situations does it make a difference?
- How do I declare the directionality of a whole document? techniques
- When should I use the dir attribute, and when should I use the &rlm; and &lrm; entities? techniques
- If I have Arabic/Hebrew text in parentheses in a Latin document, should the parentheses be part of the Arabic/Hebrew text or the Latin
text?
- How do I create and edit bidirectional Web pages? techniques
- How do I make sure bidirectional Web pages are accessible?
- Should I use styling for indicating bidirectionality in HTML or XHTML? FAQ
- Should I use styling for indicating bidirectionality in XML? FAQ
- What should I do if punctuation characters don't appear in the right place in mixed direction text?
- To correctly format bidi text in XHTML or HTML pages, should I use Unicode control codes or markup? FAQ
- What directions are commonly localized languages written in? FAQ
- How do I implement vertical text?
- How do I implement horizontal text within vertical (tate chu yoko)?
- What techniques specific to Japanese typography are supported in CSS?
- Can I use kashida-based justification for Arabic?
- What is the best way to express emphasis in XHTML?
- How do I use styling to change text format on a language by language basis?
- How do I use styling to change the quotation marks on a language by language basis?
- What is 'ruby'? FAQ
- How do I render ruby text?
- Why should I use caution with <br/> and equivalent elements?
- Why should I use caution with text-transform in CSS and XSL?
- How do I assign non-ASCII labels to ordered lists? test
- What do I need to do to ensure that I can easily localise tables for bidi scripts?
- How does the charset attribute work on the HTML a tag, and when should I use it?
- How do I handle the new International Domain Names in my pages?
- What are the key things I should do to ensure that my graphics will work well for all countries?
- Should I be using text in graphics?
- Do I need to worry about mirroring my images when I produce BiDi and Latin pages from the same template?
- Which icons and symbols are the most universally understood?
- How do I create multilingual graphics and animations in SVG and SMIL?
- I don't want to offend people from country X. Are there colors or images I should avoid in my web pages?
- Is it a good idea to use the HTTP Accept-Language header to determine the locale of the user? FAQ
- How do I do culturally-sensitive sorting in XSL?
- How do I use the function format-number() in XSL?
- How do I use the <xsl:number/> element in XSL?
- Do I have to support other number formats? How can I do this?
- Is there an HTML tag that converts currency from one country to another?
- To what extent does my commerce web site need to handle foreign currencies?
- What is the best format to store time for reuse in different locale sensitive pages?
- What is the best way to handle non-Gregorian calendar dates in my web pages?
- How do I prepare my web pages to display varying international date formats? FAQ
- As part of a form, I have a list of terms in a drop-down box. Why are they not correctly sorted when I translate the items in the list? FAQ
- What is the best way to deal with encoding issues in forms that may use multiple languages and scripts? FAQ
- How does the charset attribute work in HTML?
- What happens if someone types an answer into a form using characters that are not supported by the encoding of the page?
- What encoding is used to send the results of my forms back to the server?
- What is the best approach for dealing with the wide variations in name and address formats in data input forms?
- What is the best way of ensuring that I correctly recognise a date or time entered by a user?
- How can I test that a form works internationally? What kinds of text data or scenarios should I test it with?
- What are the key recommendations for writing internationally acceptable English?
- Should we use abbreviations and acronyms?
- Is it OK to put translatable text in style sheets?
- Should I try to guess the language and/or country of a visitor to my page, or should I ask them to specify it?
- What is the best way to point to localised pages in alternative languages?
- Should I use flags?
- Are links to localised sites and links to contact information the same thing?
- For a multilingual site, should I only have one language selection page, or allow selections from any page? If from any page, should I link
to the main page, or to a parallel page in another language?
- What are the pitfalls of using the HTTP Accept-Language header to determine the locale of the user?
- Should I put files in separate directories by language or not?
- How should I name my files when I have alternative language versions of the same page?
- How do I implement language negotiation on an Apache Web server? FAQ
- My web site is in several languages and is frequently updated. How can I ensure all the links are pointing to the correct pages and
languages? (What is the easiest file/directory structure to use for this?)
- How do I publish Web pages in UTF-8?
- Why should I care about this?
- What would an international test plan contain for a typical web site?