Troubles with transliteration

In theory, it is a straightforward process to transliterate words (e.g., book titles) from a language that uses a non-Latin script (Russian, Serbian, Yiddish, Tajik, etc.) into Latin letters so that they can be searched and alphabetized properly in Western library catalogs. In practice, several competing systems of transliteration are in use, and they are not always consistently applied. This means that library users must know which system(s) of transliteration are being used in a particular library catalog (or union catalog, database, search engine, portal, etc.) and repeat, vary, or truncate their searches accordingly.

Researchers may already be familiar with simple examples of this problem, but consider the title of the following newspaper, published in Moscow by the Russian Orthodox Church from 1880 to 1917:

1. Московскія церковныя вѣдомости

(pre-1918 & émigré Russian orthography, as it actually appeared)

2. Московские церковные ведомости

(post-1918 Russian orthography, commonly found in footnotes)

3. Moskovskīi︠a︡ t︠s︡erkovnyi︠a︡ vi︠e︡domosti

(ALA-Library of Congress transliteration of #1)

4. Moskovskie t︠s︡erkovnye vedomosti

(ALA-Library of Congress transliteration of #2)

5. Moskovskija cerkovnyja vjedomosti

(DIN transliteration of #1--this system is commonly used in Germany)

6. Moskovskie cerkovnye vedomosti

(DIN transliteration of #2)

7. Moskovskìâ cerkovnyâ vědomosti

(ISO transliteration of #1--this system is commonly used in Eastern Europe)

8. Moskovskie cerkovnye vedomosti

(ISO transliteration of #2)

(Note: if the diacritical marks in the above examples are not displaying correctly, it may be because your web browser does not recognize the characters. For more on this problem, see below; for a more accurate view of the diacritics, try viewing this page in Firefox rather than Internet Explorer.)

From 1869 to 1879, this newspaper was published under the title Московскія епархиальныя вѣдомости. The second word of this title provides another good example of the variation that must be taken into account when searching North American and European library catalogs. (The orthographies/transliterations appear in the same order as in the previous example.)

1. епархиальныя

2. епархиальные

3. eparkhialʹnyi︠a︡

4. eparkhialʹnye

5. eparchialʹnyja

6. eparchialʹnye

7. eparhialʹnyâ

8. eparhialʹnye

In order to perform a comprehensive online search (encompassing, for example, WorldCat, Karlsruhe KVK, The European Library, and RIBK), all of these permutations must be used. Leibniz Universität Hannover has posted a table of the major transliteration systems at http://www.unics.uni-hannover.de/ntr/russisch/umschrifttabelle.html, which is useful for the post-1918 Russian alphabet, but unfortunately does not include the letters used only in pre-1918/émigré orthography.

PDF files containing transliteration systems, brief explanatory notes, and references for dozens of languages are available at http://transliteration.eki.ee/, although the compiler stresses that his work is not meant to be taken as authoritative.
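The permutations listed above can be generated mechanically once a table exists for each system. The following Python sketch is illustrative only: the tables are deliberately partial, covering just the letters that occur in the two newspaper titles discussed in this post, and the mappings are copied from the examples above rather than from the official standards. The names `COMMON`, `SYSTEMS`, `transliterate`, and `search_variants` are invented for this example.

```python
# Letters that all three systems render identically (partial list only).
COMMON = {
    "а": "a", "в": "v", "д": "d", "е": "e", "и": "i", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "ы": "y",
    "ь": "\u02b9",  # soft sign -> modifier letter prime
}

# Letters on which the systems disagree, taken from the examples in this post.
# \u0456 is the pre-1918 letter "i", \u0463 is the pre-1918 letter "yat".
# \ufe20 and \ufe21 are the two halves of the ALA-LC ligature mark.
SYSTEMS = {
    "ALA-LC": {**COMMON, "ц": "t\ufe20s\ufe21", "х": "kh",
               "я": "i\ufe20a\ufe21", "\u0456": "\u012b",
               "\u0463": "i\ufe20e\ufe21"},
    "DIN": {**COMMON, "ц": "c", "х": "ch", "я": "ja",
            "\u0456": "i", "\u0463": "je"},
    "ISO": {**COMMON, "ц": "c", "х": "h", "я": "\u00e2",
            "\u0456": "\u00ec", "\u0463": "\u011b"},
}


def transliterate(text, system):
    """Transliterate text letter by letter using one system's table."""
    table = SYSTEMS[system]
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in table:
            latin = table[lower]
            if ch != lower:  # preserve an initial capital
                latin = latin[0].upper() + latin[1:]
            out.append(latin)
        else:
            out.append(ch)  # spaces, punctuation, letters not in the table
    return "".join(out)


def search_variants(title):
    """Return one transliterated form of the title per system."""
    return {name: transliterate(title, name) for name in SYSTEMS}
```

Calling `search_variants("Московские церковные ведомости")` returns the ALA-LC form shown as #4 above together with the identical DIN and ISO forms shown as #6 and #8. A real tool would need complete tables and the context-sensitive rules that some systems define, which this sketch omits.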

In recent years, some library catalogs in Western Europe and North America have developed the ability to display non-Latin titles and other bibliographic information in the original script (usually accompanied by a transliterated version). Thus it is possible in a very limited number of cases to dispense with transliteration systems altogether, and conduct searches directly in Cyrillic, Arabic, Chinese, etc. At the present time, however, only a tiny fraction of Western library catalog records include this information, so users will continue to need to transliterate their search terms for the foreseeable future.

Two other facts should also be noted: 1) In general, the diacritical marks that accompany some transliterated letters (e.g., ĭ, ž, i︠a︡) can be ignored when searching. For example, "okti︠a︡brʹskiĭ" (the ALA-LC version of "октябрьский") may simply be entered as "oktiabrskii", and "âŝik" (the ISO version of "ящик") can be searched as "asik". 2) The interaction between various encoding systems often causes diacritical marks to display incorrectly when printed, e-mailed, or otherwise viewed by any means other than that by which they were entered (even the diacritics in this post may not display correctly on all computers, and several of them had to be converted into image files just to enable this post to be published). For example, "Kievskai︠a︡" might appear as "Kievskai?a?", "Kievskaikila", "Kievskain+áan+í", "Kievskaiï¸ a︡", "Kievskai&# 65056;a&# 65057;", or other unhelpful combinations. This is also true of languages (German, French, Czech, etc.) that are written in the Latin script but include phonetically significant diacritics, which may appear as strange symbols, Greek letters, or shapes when cut and pasted into Word documents, e-mail messages, etc.
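Both points can be illustrated with Python's standard library. The sketch below is an assumption-laden illustration, not a description of how any particular catalog normalizes its search terms: `fold_for_search` and `garbled_forms` are names invented here. The first function removes diacritics by decomposing each character and dropping the combining marks; the second reproduces three of the garbled forms quoted above by pushing the text through encodings that cannot represent the ligature marks.

```python
import unicodedata

# Prime and double prime, used for the soft and hard signs in transliteration.
# They are modifier letters rather than combining marks, so they are removed
# explicitly.
PRIMES = {"\u02b9", "\u02ba"}


def fold_for_search(text):
    """Strip diacritics and prime signs, leaving plain base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M") and ch not in PRIMES
    )


def garbled_forms(text):
    """Show what happens when text passes through mismatched encodings."""
    return {
        # Unsupported characters replaced with question marks.
        "ascii": text.encode("ascii", "replace").decode("ascii"),
        # UTF-8 bytes misread as Latin-1.
        "latin-1": text.encode("utf-8").decode("latin-1"),
        # Unsupported characters replaced with numeric character references.
        "html": text.encode("ascii", "xmlcharrefreplace").decode("ascii"),
    }
```

With these definitions, `fold_for_search` turns the ALA-LC form of "октябрьский" into "oktiabrskii", and `garbled_forms` applied to the ALA-LC form "Kievskai︠a︡" gives "Kievskai?a?" for the ASCII case and a string containing "&#65056;" and "&#65057;" for the HTML case.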
