git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@1231 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
		
			
				
	
	
		
<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.51
     from gettext.texi on 4 September 1998 -->

<TITLE>GNU gettext utilities</TITLE>
</HEAD>
<BODY>
<H1>GNU gettext tools, version 0.10</H1>
<H2>Native Language Support Library and Tools</H2>
<H2>Edition 0.10, 26 November</H2>
<ADDRESS>Ulrich Drepper</ADDRESS>
<ADDRESS>Jim Meyering</ADDRESS>
<ADDRESS>Pinard</ADDRESS>
<P>
<P><HR><P>

<P>
Copyright (C) 1995 Free Software Foundation, Inc.

</P>
<P>
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

</P>
<P>
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.

</P>
<P>
Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation approved
by the Foundation.

</P>


<H1><A NAME="SEC1" HREF="gettext_toc.html#TOC1">Introduction</A></H1>

<BLOCKQUOTE>
<P>
This manual is still in <EM>DRAFT</EM> state.  Some sections are still
empty, or almost.  We keep merging material from other sources
(essentially email folders) while the proper integration of this
material is delayed.
</BLOCKQUOTE>

<P>
In this manual, we use <EM>he</EM> when speaking of the programmer or
maintainer, <EM>she</EM> when speaking of the translator, and <EM>they</EM>
when speaking of the installers or end users of the translated program.
This is only a convenience for clarifying the documentation.  It is
absolutely not meant to imply that some roles are more appropriate
to males or females.  Besides, as you might guess, GNU <CODE>gettext</CODE>
is meant to be useful for people using computers, whatever their sex,
race, religion or nationality!

</P>
<P>
This chapter explains the goals that GNU <CODE>gettext</CODE> is meant
to serve.  Then, it explains a few broad concepts around
Native Language Support, and situates message translation with regard
to other aspects of national and cultural variance, as applicable
to programs.  It also surveys the files used to convey
translations.  It explains how the various tools interrelate in the
initial generation of these files, and later, how the maintenance
cycle usually operates.

</P>


<H2><A NAME="SEC2" HREF="gettext_toc.html#TOC2">The Purpose of GNU <CODE>gettext</CODE></A></H2>

<P>
Usually, programs are written and documented in English, and use
English at execution time for interacting with users.  This is true
not only within GNU, but also in a great deal of commercial
and free software.  Using a common language is quite handy for
communication between developers, maintainers and users from all
countries.  On the other hand, most people are less comfortable with
English than with their own native language, and would prefer
using their mother tongue for day-to-day work, as far as possible.
Many would simply <EM>love</EM> seeing their computer screen showing
a lot less English, and far more of their own spoken language.

</P>
<P>
However, to some people, this dream might appear so far-fetched that
they may believe it is not even worth spending time thinking about
it, and they have no confidence at all that the dream might ever
come true.  Many have not lost hope yet, and have organized themselves.
The GNU Translation Project is a formalization of this hope into a
workable structure, which has a good chance of bringing all of us nearer
to the achievement of a truly multi-lingual set of programs.

</P>
<P>
GNU <CODE>gettext</CODE> is an important step for the GNU Translation
Project, as it is an asset on which we may build many other steps.
This package offers programmers, translators and even users a
well-integrated set of tools and documentation.  Specifically, the GNU
<CODE>gettext</CODE> utilities are a set of tools that provides a framework
to help other GNU packages produce multi-lingual messages.  These tools
include a set of conventions about how programs should be written to
support message catalogs, a directory and file naming organization
for the message catalogs themselves, a runtime library supporting the
retrieval of translated messages, and a few stand-alone programs to
massage in various ways the sets of translatable strings, or already
translated strings.  A special GNU Emacs mode also helps interested
parties prepare these sets, or bring them up to date.

</P>
<P>
GNU <CODE>gettext</CODE> is designed to minimize the impact of
internationalization on program sources, keeping this impact as small
and hardly noticeable as possible.  Internationalization has better
chances of succeeding if it is very lightweight, or at least
appears to be so, when looking at program sources.

</P>
<P>
The GNU Translation Project also uses the GNU <CODE>gettext</CODE>
distribution as a vehicle for documenting its structure and methods,
even if this goes beyond the technicalities of GNU <CODE>gettext</CODE>
proper.  This way, translators will find in a single place, as
far as possible, all they need to know for properly doing their
translating work.  This supplementary documentation might also
help programmers, and even curious users, understand how GNU
<CODE>gettext</CODE> is related to the remainder of the GNU Translation
Project, and consequently get a glimpse of the <EM>big picture</EM>.

</P>


<H2><A NAME="SEC3" HREF="gettext_toc.html#TOC3">I18n, L10n, and Such</A></H2>

<P>
Two long words appear all the time when we discuss support of native
language in programs, and these words have a precise meaning, worth
being explained here, once and for all in this document.  The words are
<EM>internationalization</EM> and <EM>localization</EM>.  Many people,
tired of writing these long words over and over again, took the
habit of writing <STRONG>i18n</STRONG> and <STRONG>l10n</STRONG> instead, quoting the first
and last letter of each word, and replacing the run of intermediate
letters with a number merely telling how many such letters there are.
But in this manual, for the sake of clarity, we will patiently write
the names in full, each time...

</P>
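<P>
The abbreviation rule just described is mechanical enough to be written
down as a few lines of code.  The following is only an illustrative
sketch in Python; the function name <CODE>numeronym</CODE> is made up for
this example and is not part of GNU <CODE>gettext</CODE>.
</P>

```python
def numeronym(word):
    """Keep the first and last letters and count the letters in between."""
    middle = word[1:-1]
    if not middle:
        return word  # too short to abbreviate
    return f"{word[0]}{len(middle)}{word[-1]}"


print(numeronym("internationalization"))
print(numeronym("localization"))
```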
<P>
By <STRONG>internationalization</STRONG>, one refers to the operation by which a
program, or a set of programs turned into a package, is made aware of and
able to support multiple languages.  This is a generalization process,
by which the programs are untied from using only English strings or
other English-specific habits, and connected to generic ways of doing
the same, instead.  Program developers may use various techniques to
internationalize their programs, some of which have been standardized.
GNU <CODE>gettext</CODE> offers one of these standards.  See section <A HREF="gettext.html#SEC36">The Programmer's View</A>.

</P>
<P>
By <STRONG>localization</STRONG>, one means the operation by which, in a set
of programs already internationalized, one gives the program all
needed information so that it can bend itself to handle its input
and output in a fashion which is correct for some native language and
cultural habits.  This is a particularisation process, by which generic
methods already implemented in an internationalized program are used
in specific ways.  The programming environment puts several functions
at the programmer's disposal which allow this runtime configuration.
The formal description of a specific set of cultural habits for some
country, together with all associated translations targeted to the
same native language, is called the <STRONG>locale</STRONG> for this language
or country.  Users achieve localization of programs by setting
special environment variables to the proper values, prior to executing those
programs, identifying which locale should be used.

</P>
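<P>
To make the role of environment variables more concrete, here is a
minimal sketch in Python of how a program might decide which locale the
user asked for.  This chapter does not name the variables; the names and
their order below are an assumption borrowed from the convention followed
by Python's standard <CODE>gettext</CODE> module, and the helper
<CODE>pick_locale</CODE> is hypothetical.
</P>

```python
import os

# Assumed lookup order, borrowed from Python's standard gettext module.
LOCALE_VARIABLES = ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG")


def pick_locale(environ=os.environ, default="C"):
    """Return the first non-empty locale setting, or the default."""
    for name in LOCALE_VARIABLES:
        value = environ.get(name)
        if value:
            return value
    return default


# A user who only set LANG gets that locale.
print(pick_locale({"LANG": "fr_FR"}))
```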
<P>
In fact, locale message support is only one component of the cultural
data that makes up a particular locale.  There is a whole host of
routines and functions provided to aid programmers in developing
internationalized software, which allow them to access the data
stored in a particular locale.  When someone refers to a
particular locale, they are referring to the data stored
within that particular locale.  Similarly, if a programmer is referring
to "accessing the locale routines", they are referring to the
complete suite of routines that access all of the locale's information.

</P>
<P>
One uses the expression <STRONG>Native Language Support</STRONG>, or merely NLS,
for speaking of the overall activity or feature encompassing both
internationalization and localization, allowing for multi-lingual
interactions in a program.  In a nutshell, one could say that
internationalization is the operation by which further localizations
are made possible.

</P>
<P>
Also, very roughly said, when it comes to multi-lingual messages,
internationalization is usually taken care of by programmers, and
localization is usually taken care of by translators.

</P>


<H2><A NAME="SEC4" HREF="gettext_toc.html#TOC4">Aspects in Native Language Support</A></H2>

<P>
For a totally multi-lingual distribution, there are many things to
translate beyond output messages.

</P>

<UL>
<LI>

As of today, GNU <CODE>gettext</CODE> offers a complete toolset for
translating messages output by C programs.  Perl scripts and shell
scripts also need to be translated.  Even if there are some hooks
so this can be done, these hooks are not integrated as well as they
should be.

<LI>

Some programs, like <CODE>autoconf</CODE> or <CODE>bison</CODE>, are able
to produce other programs (or scripts).  Even if the generating
programs themselves are internationalized, the generated programs they
produce may need internationalization on their own, and this indirect
internationalization could be automated right from the generating
program.  In fact, quite usually, generating and generated programs
could be internationalized independently, as the effort needed is
fairly orthogonal.

<LI>

A few programs include textual tables which might need translation
themselves, independently of the strings contained in the program
itself.  For example, RFC 1345 gives an English description for each
character which GNU <CODE>recode</CODE> is able to reconstruct at execution.
Since these descriptions are extracted from the RFC by mechanical means,
translating them properly would require a prior translation of the RFC
itself.

<LI>

Almost all programs accept options, which are often worded so as to
be descriptive for English readers; one might want to consider
offering translated versions for program options as well.

<LI>

Many programs read, interpret, compile, or are somewhat driven by
input files which are texts containing keywords, identifiers, or
replies which are inherently translatable.  For example, one may want
<CODE>gcc</CODE> to allow diacriticized characters in identifiers or use
translated keywords; <SAMP>`rm -i'</SAMP> might accept something other than
<SAMP>`y'</SAMP> or <SAMP>`n'</SAMP> for replies, etc.  Even if the program will
eventually make most of its output in foreign languages, one has
to decide whether the input syntax, option values, etc., are to be
localized or not.

<LI>

The manual accompanying a package, as well as all documentation files
in the distribution, could surely be translated, too.  Translating a
manual, with the intent of later keeping up with updates, is generally a major
undertaking in itself.

</UL>

<P>
As we already stressed, translation is only one aspect of locales.
Other internationalization aspects are not currently handled by GNU
<CODE>gettext</CODE>, but perhaps may be handled in future versions.  There
are many attributes that are needed to define a country's cultural
conventions.  These attributes include, besides the country's native
language, the formatting of the date and time, the representation of
numbers, the symbols for currency, etc.  These local <STRONG>rules</STRONG> are
termed the country's locale.  The locale represents the knowledge
needed to support the country's native attributes.

</P>
<P>
There are a few major areas which may vary between countries and
hence, define what a locale must describe.  The following list helps
put multi-lingual messages into the proper context of other tasks
related to locales, and also presents some other areas which GNU
<CODE>gettext</CODE> might eventually tackle, maybe, one of these days.

</P>
<DL COMPACT>

<DT><EM>Characters and Codesets</EM>
<DD>
The codeset most commonly used throughout the USA and most English
speaking parts of the world is the ASCII codeset.  However, there are
many characters needed by various locales that are not found within
this codeset.  The 8-bit ISO 8859-1 codeset has most of the special
characters needed to handle the major European languages.  However, in
many cases, the ISO 8859-1 codeset is not adequate.  Hence each locale
will need to specify which codeset it needs to use and will need
to have the appropriate character handling routines to cope with
the codeset.

<DT><EM>Currency</EM>
<DD>
The symbols used vary from country to country, as does the position
of the symbol.  Software needs to be able to transparently
display currency figures in the native mode for each locale.

<DT><EM>Dates</EM>
<DD>
The format of dates varies between locales.  For example, Christmas day
in 1994 is written as 12/25/94 in the USA and as 25/12/94 in Australia.
Other countries might use ISO 8601 dates, etc.

Time of the day may be noted as <VAR>hh</VAR>:<VAR>mm</VAR>, <VAR>hh</VAR>.<VAR>mm</VAR>,
or otherwise.  Some locales require time to be specified in 24-hour
mode rather than as AM or PM.  Further, the nature and yearly extent
of the Daylight Saving correction vary widely between countries.

<DT><EM>Numbers</EM>
<DD>
Numbers can be represented differently in different locales.
For example, the following numbers are all written correctly for
their respective locales:


<PRE>
12,345.67       English
12.345,67       French
1,2345.67       Asia
</PRE>

Some programs could go further and use different unit systems, like
English units or Metric units, or even take into account variants
about how numbers are spelled in full.

<DT><EM>Messages</EM>
<DD>
The most obvious area is the language support within a locale.  This is
where GNU <CODE>gettext</CODE> provides a means for developers and users to
easily change the language that the software uses to communicate with
the user.

</DL>

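<P>
The date and number entries of the list above can be illustrated with a
short Python sketch.  The conventions are hard-coded here purely for
illustration; the helper <CODE>format_number</CODE> is hypothetical, and a
real program would rather obtain these rules from the locale support of
its system.
</P>

```python
from datetime import date

christmas = date(1994, 12, 25)
usa = christmas.strftime("%m/%d/%y")        # month first
australia = christmas.strftime("%d/%m/%y")  # day first


def format_number(value, thousands, decimal):
    """Group the integer part by threes using the given separators."""
    text = f"{value:,.2f}"  # English-style grouping as a starting point
    return text.translate(str.maketrans({",": thousands, ".": decimal}))


english = format_number(12345.67, ",", ".")
french = format_number(12345.67, ".", ",")
print(usa, australia, english, french)
```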
<P>
In the near future we see no chance that components of locale other
than message handling will be made available for use in other
GNU packages.  The reason for this is that most modern systems provide
a more or less reasonable support for at least some of the missing
components.  Another point is that the GNU libc and Linux will get
a new and complete implementation of the whole locale functionality
which could be adopted by systems lacking a reasonable locale support.

</P>


<H2><A NAME="SEC5" HREF="gettext_toc.html#TOC5">Files Conveying Translations</A></H2>

<P>
The letters PO in <TT>`.po'</TT> files mean Portable Object, to
distinguish them from <TT>`.mo'</TT> files, where MO stands for Machine
Object.  This paradigm, as well as the PO file format, is inspired
by the NLS standard developed by Uniforum, and implemented by Sun
in their Solaris system.

</P>
<P>
PO files are meant to be read and edited by humans, and associate each
original, translatable string of a given package with its translation
in a particular target language.  A single PO file is dedicated to
a single target language.  If a package supports many languages,
there is one such PO file per language supported, and each package
has its own set of PO files.  These PO files are best created by
the <CODE>xgettext</CODE> program, and later updated or refreshed through
the <CODE>tupdate</CODE> program.  Program <CODE>xgettext</CODE> extracts all
marked messages from a set of C files and initializes a PO file with
empty translations.  Program <CODE>tupdate</CODE> takes care of adjusting
PO files between releases of the corresponding sources, commenting out
obsolete entries, initializing new ones, and updating all source
line references.  Files ending with <TT>`.pot'</TT> are a kind of base
translation file found in distributions, in PO file format, and
<TT>`.pox'</TT> files are often temporary PO files.

</P>
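<P>
As a rough illustration of how a PO file pairs each original string with
its translation, the Python sketch below reads a tiny, freshly initialized
entry.  It is a simplification under stated assumptions: only single-line
<CODE>msgid</CODE> and <CODE>msgstr</CODE> lines are handled, escapes are
ignored, and the reference <TT>`hello.c:12'</TT> is invented for the
example.  See section <A HREF="gettext.html#SEC9">The Format of PO Files</A>, for the real rules.
</P>

```python
TEMPLATE = '''\
#: hello.c:12
msgid "Hello world!"
msgstr ""
'''


def read_entries(po_text):
    """Collect msgid/msgstr pairs from single-line PO entries."""
    entries, msgid = {}, None
    for line in po_text.splitlines():
        if line.startswith("msgid "):
            msgid = line[len("msgid "):].strip().strip('"')
        elif line.startswith("msgstr ") and msgid is not None:
            entries[msgid] = line[len("msgstr "):].strip().strip('"')
            msgid = None
    return entries


# The empty translation shows that the entry still awaits a translator.
print(read_entries(TEMPLATE))
```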
<P>
MO files are meant to be read by programs, and are binary in nature.
A few systems already offer tools for creating and handling MO files
as part of the Native Language Support coming with the system, but the
format of these MO files is often different from system to system,
and non-portable.  They do not necessarily use <TT>`.mo'</TT> for file
extensions, but since system libraries are also used for accessing
these files, it works as long as the system is self-consistent about
it.  If GNU <CODE>gettext</CODE> is able to interface with the tools already
provided with systems, it will consequently let these provided tools
take care of generating the MO files.  Or else, if such tools are not
found or do not seem usable, GNU <CODE>gettext</CODE> will use its own ways
and its own format for MO files.  Files ending with <TT>`.gmo'</TT> are
really MO files, when it is known that these files use the GNU format.

</P>


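<P>
To give a feeling for what "binary in nature" means, the following Python
sketch packs a small catalog into bytes laid out like a GNU format MO file
and reads it back with Python's standard <CODE>gettext</CODE> module.
The layout used here (a magic number, a revision, the string count, two
tables of length and offset pairs, then NUL-terminated strings) reflects
our understanding of the GNU format and leaves the optional hash table
empty; <CODE>build_mo</CODE> is a hypothetical helper, not what
<CODE>msgfmt</CODE> does internally.
</P>

```python
import gettext
import io
import struct


def build_mo(translations):
    """Pack a dict of original/translated strings into GNU MO bytes."""
    # The entry with an empty original carries the header and charset.
    catalog = {"": "Content-Type: text/plain; charset=UTF-8\n"}
    catalog.update(translations)
    keys = sorted(catalog)  # original strings are stored in sorted order
    originals = [key.encode("utf-8") for key in keys]
    translated = [catalog[key].encode("utf-8") for key in keys]

    count = len(keys)
    originals_table = 7 * 4                         # right after the header
    translations_table = originals_table + 8 * count
    offset = translations_table + 8 * count         # start of string data

    tables, data = [], b""
    for group in (originals, translated):
        for raw in group:
            tables.append(struct.pack("=II", len(raw), offset + len(data)))
            data += raw + b"\0"  # every string is NUL-terminated
    # Native byte order; readers detect the order from the magic number.
    header = struct.pack("=7I", 0x950412DE, 0, count, originals_table,
                         translations_table, 0, offset)
    return header + b"".join(tables) + data


mo_bytes = build_mo({"Hello world!": "Bonjour le monde !"})
catalog = gettext.GNUTranslations(io.BytesIO(mo_bytes))
print(catalog.gettext("Hello world!"))
```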
<H2><A NAME="SEC6" HREF="gettext_toc.html#TOC6">Overview of GNU <CODE>gettext</CODE></A></H2>

<P>
The following diagram summarizes the relation between the files
handled by GNU <CODE>gettext</CODE> and the tools acting on these files.
It is followed by somewhat detailed explanations, which you should
read while keeping an eye on the diagram.  Having a clear understanding
of these interrelations would surely help programmers, translators
and maintainers.

</P>

<PRE>
Original C Sources ---> PO mode ---> Marked C Sources ---.
                                                         |
              .---------<--- GNU gettext Library         |
.--- make <---+                                          |
|             `---------<--------------------+-----------'
|                                            |
|   .-----<--- PACKAGE.pot <--- xgettext <---'   .---<--- PO Compendium
|   |                                            |             ^
|   |                                            `---.         |
|   `---.                                            +---> PO mode ---.
|       +----> tupdate -------> LANG.pox --->--------'                |
|   .---'                                                             |
|   |                                                                 |
|   `-------------<---------------.                                   |
|                                 +--- LANG.po <--- New LANG.pox <----'
|   .--- LANG.gmo <--- msgfmt <---'
|   |
|   `---> install ---> /.../LANG/PACKAGE.mo ---.
|                                              +---> "Hello world!"
`-------> install ---> /.../bin/PROGRAM -------'
</PRE>

<P>
The indication <SAMP>`PO mode'</SAMP> appears in two places in this picture,
and you may safely read it as merely meaning "hand editing", using
any editor of your choice, really.  However, for those of you who are
lucky users of GNU Emacs, PO mode has been specifically created
for providing a cosy environment for editing or modifying PO files.
While editing a PO file, PO mode allows for the easy browsing of
auxiliary and compendium PO files, as well as following references into
the set of C program sources from which PO files have been derived.
It has a few special features, among which are the interactive marking
of program strings as translatable, and the validation of PO files
with easy repositioning to PO file lines showing errors.

</P>
<P>
As a programmer, the first step in bringing GNU <CODE>gettext</CODE>
into your package is identifying, right in the C sources, which
strings are meant to be translatable, and which are untranslatable.
This tedious job can be done a little more comfortably using PO
mode, but you can use any means you are used to for modifying your
C sources.  Some other simple, standard changes are also needed to
properly initialize the translation library.  See section <A HREF="gettext.html#SEC13">Preparing Program Sources</A>, for
more information about all this.

</P>
<P>
Once the C sources have been modified, the <CODE>xgettext</CODE> program
is used to find and extract all translatable strings, and create an
initial PO file out of all these.  This <TT>`<VAR>package</VAR>.pot'</TT> file
contains all original program strings, with sets of pointers to
exactly where in the C sources each string is used, and all translations
are set to empty.  The letter <KBD>t</KBD> in <TT>`.pot'</TT> marks that this is
a Template PO file, not yet oriented towards any particular language.
See section <A HREF="gettext.html#SEC19">Invoking the <CODE>xgettext</CODE> Program</A>, for more details about how one calls the
<CODE>xgettext</CODE> program.  If you are <EM>really</EM> lazy, you might
be interested in working a lot more right away, and preparing the
whole distribution setup (see section <A HREF="gettext.html#SEC65">The Maintainer's View</A>).  By doing so, you
spare yourself typing the <CODE>xgettext</CODE> command, as <CODE>make</CODE>
should now generate the proper things automatically for you!

</P>
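<P>
The extraction step can be pictured with the following Python sketch,
which scans C source text for strings wrapped in <CODE>_("...")</CODE> and
prints a template with source references and empty translations.  It is
a toy under stated assumptions, not how <CODE>xgettext</CODE> works: the
<CODE>_</CODE> marker is assumed, a regular expression stands in for a
real C scanner, and <TT>`hello.c'</TT> is an invented file name.
</P>

```python
import re

# A double-quoted C string literal wrapped in _( ... ), on a single line.
MARKED = re.compile(r'_\(\s*"((?:[^"\\]|\\.)*)"\s*\)')


def extract_template(filename, source):
    """Return PO template text for every marked string in the source."""
    locations = {}
    for number, line in enumerate(source.splitlines(), start=1):
        for match in MARKED.finditer(line):
            places = locations.setdefault(match.group(1), [])
            places.append(f"{filename}:{number}")
    entries = []
    for msgid, places in locations.items():
        reference = "#: " + " ".join(places)
        entries.append(f'{reference}\nmsgid "{msgid}"\nmsgstr ""\n')
    return "\n".join(entries)


C_SOURCE = '''int main(void) {
  puts(_("Hello world!"));
  puts("not marked");
  return 0;
}
'''
print(extract_template("hello.c", C_SOURCE))
```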
<P>
The first time through, there is no <TT>`<VAR>lang</VAR>.po'</TT> yet, so the
<CODE>tupdate</CODE> step may be skipped and replaced by a mere copy of
<TT>`<VAR>package</VAR>.pot'</TT> to <TT>`<VAR>lang</VAR>.pox'</TT>, where <VAR>lang</VAR>
represents the target language.

</P>
<P>
Then comes the initial translation of messages.  Translation in
itself is a whole subject, still exclusively meant for humans,
and whose complexity far exceeds the scope of this manual.
Nevertheless, a few hints are given in another chapter of this
manual (see section <A HREF="gettext.html#SEC54">The Translator's View</A>).  You will also find there indications
about how to contact translating teams, or become part of them,
for sharing your translating concerns with others who target the same
native language.

</P>
<P>
While adding the translated messages into the <TT>`<VAR>lang</VAR>.pox'</TT>
PO file, if you do not have GNU Emacs handy, you are on your own
for ensuring that you fully respect the PO file format and quoting
conventions (see section <A HREF="gettext.html#SEC9">The Format of PO Files</A>).  This is surely not an impossible task,
as this is the way many people handled PO files already for Uniforum or
Solaris.  On the other hand, using PO mode in GNU Emacs, most details
of PO file format are taken care of for you, but you have to acquire
some familiarity with PO mode itself.  Besides the main PO mode commands
(see section <A HREF="gettext.html#SEC10">Main Commands</A>), you should know how to move between entries
(see section <A HREF="gettext.html#SEC11">Entry Positioning</A>), and how to handle untranslated entries
(see section <A HREF="gettext.html#SEC24">Untranslated Entries</A>).

</P>
<P>
If some common translations have already been saved into a compendium
PO file, translators may use PO mode for initializing untranslated
entries from the compendium, and also save selected translations into
the compendium, updating it (see section <A HREF="gettext.html#SEC21">Using Translation Compendiums</A>).  Compendium files
are meant to be exchanged between members of a given translation team.

</P>
<P>
Programs, or packages of programs, are dynamic in nature: users write
bug reports and suggestions for improvement, maintainers react by
modifying programs in various ways.  The fact that a package has
already been internationalized should not make maintainers shy
of adding new strings, or modifying strings already translated.
They just do their job the best they can.  For the GNU Translation
Project to work smoothly, it is important that maintainers do not
carry translation concerns on their already loaded shoulders, and that
translators be kept as free as possible of programmatic concerns.

</P>
<P>
The only concern maintainers should have is carefully marking new
strings as translatable, when they should be, and not otherwise
worrying about them being translated, as this will come in proper time.
Consequently, when programs and their strings are adjusted in various
ways by maintainers, and for matters usually unrelated to translation,
<CODE>xgettext</CODE> constructs <TT>`<VAR>package</VAR>.pot'</TT> files which
evolve over time, so the translations carried by <TT>`<VAR>lang</VAR>.po'</TT>
slowly fall out of date.

</P>
<P>
It is important for translators (and even maintainers) to understand
that package translation is a continuous process in the lifetime of a
package, and not something which is done once and for all at the start.
After an initial burst of translation activity for a given package,
interventions are needed once in a while, because here and there,
translated entries become obsolete, and new untranslated entries
appear, needing translation.

</P>
<P>
 | 
						|
The <CODE>tupdate</CODE> program has the purpose of refreshing an already
 | 
						|
existing <TT>`<VAR>lang</VAR>.po'</TT> file, by comparing it with a newer
 | 
						|
<TT>`<VAR>package</VAR>.pot'</TT> template file, extracted by <CODE>xgettext</CODE>
 | 
						|
out of recent C sources.  The refreshing operation adjusts all
 | 
						|
references to C source locations for strings, since these strings
 | 
						|
move as programs are modified.  Also, <CODE>tupdate</CODE> comments out as
 | 
						|
obsolete, in <TT>`<VAR>lang</VAR>.pox'</TT>, those already translated entries
 | 
						|
which are no longer used in the program sources (see section <A HREF="gettext.html#SEC25">Obsolete Entries</A>).  It finally discovers new strings and inserts them in
 | 
						|
the resulting PO file as untranslated entries (see section <A HREF="gettext.html#SEC24">Untranslated Entries</A>).  See section <A HREF="gettext.html#SEC23">Invoking the <CODE>tupdate</CODE> Program</A>, for more information about what
 | 
						|
<CODE>tupdate</CODE> really does.
 | 
						|
 | 
						|
</P>
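<P>
As a rough illustration of the refreshing operation just described, here
is a minimal Python sketch of the three outcomes: translations that are
kept, entries that become obsolete, and new untranslated entries.  It is
not the actual <CODE>tupdate</CODE> algorithm; the name
<CODE>merge_catalog</CODE> and the in-memory representation (a dictionary
of old translations and a list of template strings) are assumptions made
for this example only, and source references are ignored.
</P>

<PRE>
```python
def merge_catalog(old_entries, template_msgids):
    """Refresh old translations against the msgids of a newer template.

    old_entries: dict mapping msgid to msgstr (hypothetical in-memory form).
    template_msgids: list of msgids found in the newer template.
    Returns (active, obsolete).  active follows the template order and uses
    an empty msgstr for new strings; obsolete holds the translated entries
    whose msgid no longer appears in the template.
    """
    active = [(msgid, old_entries.get(msgid, "")) for msgid in template_msgids]
    current = set(template_msgids)
    obsolete = [(msgid, msgstr) for msgid, msgstr in old_entries.items()
                if msgid not in current and msgstr != ""]
    return active, obsolete


old = {"Hello": "Bonjour", "Quit": "Quitter"}
active, obsolete = merge_catalog(old, ["Hello", "Save"])
print(active)    # "Hello" keeps its translation, "Save" is untranslated
print(obsolete)  # "Quit" is no longer used by the sources
```
</PRE>

<P>
In this sketch, an old entry which was never translated is simply dropped
rather than reported as obsolete, since only already translated entries
are worth keeping around.
</P>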
 | 
						|
<P>
 | 
						|
Whatever route or means taken, the goal is obtaining an updated
 | 
						|
<TT>`<VAR>lang</VAR>.pox'</TT> file offering translations for all strings.
 | 
						|
When this is properly achieved, this file <TT>`<VAR>lang</VAR>.pox'</TT> may
 | 
						|
take the place of the previous official <TT>`<VAR>lang</VAR>.po'</TT> file.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The time mobility, or fluidity of PO files, is an integral part of
 | 
						|
the translation game, and should be well understood, and accepted.
 | 
						|
People resisting it will have a hard time participating in the GNU
 | 
						|
Translation Project, or will give a hard time to other participants!
 | 
						|
In particular, maintainers should relax and include all available PO
 | 
						|
files in their distributions, even if these have not recently been
 | 
						|
updated, without banging or otherwise trying to exert pressure on the
 | 
						|
translator teams to get the job done.  The pressure should rather
 | 
						|
come from the community of users speaking a particular language,
 | 
						|
and maintainers should consider themselves fairly relieved of any
 | 
						|
concern about the adequacy of translation files.  On the other hand,
 | 
						|
translators should reasonably try updating the PO files they are
 | 
						|
responsible for, while the package is undergoing pretest, prior to
 | 
						|
an official distribution.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Once the PO file is complete and dependable, the <CODE>msgfmt</CODE> program
 | 
						|
is used for turning the PO file into a machine-oriented format, which
 | 
						|
may yield efficient retrieval of translations by the programs of the
 | 
						|
package, whenever needed at runtime (see section <A HREF="gettext.html#SEC31">The Format of GNU MO Files</A>).  See section <A HREF="gettext.html#SEC30">Invoking the <CODE>msgfmt</CODE> Program</A>, for more information about all modalities of execution
 | 
						|
for the <CODE>msgfmt</CODE> program.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Finally, the modified and marked C sources are compiled and linked
 | 
						|
with the GNU <CODE>gettext</CODE> library, usually through the operation of
 | 
						|
<CODE>make</CODE>, given a suitable <TT>`Makefile'</TT> exists for the project,
 | 
						|
and the resulting executable is installed somewhere users will find it.
 | 
						|
The MO files themselves should also be properly installed.  Given the
 | 
						|
appropriate environment variables are set (see section <A HREF="gettext.html#SEC35">Magic for End Users</A>), the
 | 
						|
program should localize itself automatically, whenever it executes.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The remainder of this manual has the purpose of detailing the various
 | 
						|
steps outlined in this section.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H1><A NAME="SEC7" HREF="gettext_toc.html#TOC7">PO Files and PO Mode Basics</A></H1>
 | 
						|
 | 
						|
<P>
 | 
						|
The GNU <CODE>gettext</CODE> toolset helps programmers and translators
 | 
						|
in producing, updating and using translation files, mainly those
 | 
						|
PO files which are textual, editable files.  This chapter focuses
 | 
						|
on the format of PO files, and contains a PO mode starter.  PO mode
 | 
						|
description is spread over this manual instead of being concentrated
 | 
						|
in one place; this chapter presents only the basics of PO mode.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC8" HREF="gettext_toc.html#TOC8">Completing GNU <CODE>gettext</CODE> Installation</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
Once you have received, unpacked, configured and compiled the GNU
 | 
						|
<CODE>gettext</CODE> distribution, the <SAMP>`make install'</SAMP> command puts in
 | 
						|
place the programs <CODE>xgettext</CODE>, <CODE>msgfmt</CODE>, <CODE>gettext</CODE>, and
 | 
						|
<CODE>tupdate</CODE>, as well as their available message catalogs.  For
 | 
						|
completing a comfortable installation, you might also want to make the
 | 
						|
PO mode available to your GNU Emacs users.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
To finish the installation of the PO mode, you might want to modify your
 | 
						|
file <TT>`.emacs'</TT>, once and for all, so it contains a few lines looking
 | 
						|
like:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
(setq auto-mode-alist
 | 
						|
      (cons '("\\.pox?\\'" . po-mode) auto-mode-alist))
 | 
						|
(autoload 'po-mode "po-mode")
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
Later, whenever you edit some <TT>`.po'</TT> or <TT>`.pox'</TT> file, Emacs
 | 
						|
loads <TT>`po-mode.elc'</TT> (or <TT>`po-mode.el'</TT>) as needed, and
 | 
						|
automatically activates PO mode commands for the associated buffer.
 | 
						|
The string <EM>PO</EM> appears in the mode line for any buffer for
 | 
						|
which PO mode is active.  Many PO files may be active at once in a
 | 
						|
single Emacs session.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC9" HREF="gettext_toc.html#TOC9">The Format of PO Files</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
A PO file is made up of many entries, each entry holding the relation
 | 
						|
between an original untranslated string and its corresponding
 | 
						|
translation.  All entries in a given PO file usually pertain
 | 
						|
to a single project, and all translations are expressed in a single
 | 
						|
target language.  One PO file <STRONG>entry</STRONG> has the following schematic
 | 
						|
structure:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
<VAR>white-space</VAR>
 | 
						|
#  <VAR>translator-comments</VAR>
 | 
						|
#. <VAR>automatic-comments</VAR>
 | 
						|
#: <VAR>reference</VAR>...
 | 
						|
msgid <VAR>untranslated-string</VAR>
 | 
						|
msgstr <VAR>translated-string</VAR>
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
The general structure of a PO file should be well understood by
 | 
						|
the translator.  When using PO mode, very little has to be known
 | 
						|
about the format details, as PO mode takes care of them for her.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Entries begin with some optional white space.  Usually, when generated
 | 
						|
through GNU <CODE>gettext</CODE> tools, there is exactly one blank line
 | 
						|
between entries.  Then comments follow, on lines all starting with the
 | 
						|
character <KBD>#</KBD>.  There are two kinds of comments: those which have
 | 
						|
some white space immediately following the <KBD>#</KBD>, which comments are
 | 
						|
created and maintained exclusively by the translator, and those which
 | 
						|
have some non-white character just after the <KBD>#</KBD>, which comments
 | 
						|
are created and maintained automatically by GNU <CODE>gettext</CODE> tools.
 | 
						|
All comments, of any kind, are optional.
 | 
						|
 | 
						|
</P>
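<P>
The distinction between the two kinds of comments depends only on the
character which follows the <KBD>#</KBD>.  The following Python sketch
shows one way a script might tell them apart.  The function name
<CODE>classify_comment</CODE> is hypothetical, and treating a bare
<KBD>#</KBD> with nothing after it as a translator comment is an
assumption of this example.
</P>

<PRE>
```python
def classify_comment(line):
    """Return 'translator', 'automatic', or None for one PO file line."""
    if not line.startswith("#"):
        return None
    rest = line[1:]
    # Assumption: a bare "#" is counted as a translator comment, like a
    # "#" followed by white space.
    if rest == "" or rest[0].isspace():
        return "translator"
    # Any non-white character right after "#" marks a tool-managed comment.
    return "automatic"


samples = ["# checked with the team", "#. extracted note",
           "#: src/main.c:42", 'msgid "Hello"']
for sample in samples:
    print(repr(sample), classify_comment(sample))
```
</PRE>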
 | 
						|
<P>
 | 
						|
After white space and comments, entries show two strings, giving
 | 
						|
first the untranslated string as it appears in the original program
 | 
						|
sources, and then, the translation of this string.  The original
 | 
						|
string is introduced by the keyword <CODE>msgid</CODE>, and the translation,
 | 
						|
by <CODE>msgstr</CODE>.  The two strings, untranslated and translated,
 | 
						|
are quoted in various ways in the PO file, using <KBD>"</KBD>
 | 
						|
delimiters and <KBD>\</KBD> escapes, but the translator does not really
 | 
						|
have to pay attention to the precise quoting format, as PO mode fully
 | 
						|
intends to take care of quoting for her.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The <CODE>msgid</CODE> strings, as well as automatic comments, are produced
 | 
						|
and managed by other GNU <CODE>gettext</CODE> tools, and PO mode does not
 | 
						|
provide means for the translator to alter these.  The most she can
 | 
						|
do is merely delete them, and only by deleting the whole entry.
 | 
						|
On the other hand, the <CODE>msgstr</CODE> string, as well as translator
 | 
						|
comments, are really meant for the translator, and PO mode gives her
 | 
						|
the full control she needs.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
It happens that some lines, usually whitespace or comments, follow the
 | 
						|
very last entry of a PO file.  Such lines are not part of any entry,
 | 
						|
and PO mode is unable to take action on those lines.  By using the
 | 
						|
PO mode function <KBD>M-x po-normalize</KBD>, the translator may get
 | 
						|
rid of those spurious lines.  See section <A HREF="gettext.html#SEC12">Normalizing Strings in Entries</A>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The remainder of this section may be safely skipped for those using
 | 
						|
PO mode, yet it may be interesting for everybody to have a better
 | 
						|
idea of the precise format of a PO file.  On the other hand, those
 | 
						|
not having GNU Emacs handy should carefully read on.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Each of <VAR>untranslated-string</VAR> and <VAR>translated-string</VAR> respects
 | 
						|
the C syntax for a character string, including the surrounding quotes
 | 
						|
and embedded backslashed escape sequences.  When the time comes
 | 
						|
to write multi-line strings, one should not use escaped newlines.
 | 
						|
Instead, a closing quote should follow the last character on the
 | 
						|
line to be continued, and an opening quote should resume the string
 | 
						|
at the beginning of the following PO file line.  For example:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
msgid ""
 | 
						|
"Here is an example of how one might continue a very long string\n"
 | 
						|
"for the common case the string represents multi-line output.\n"
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
In this example, the empty string is used on the first line, for
 | 
						|
allowing the better alignment of the <KBD>H</KBD> from the word <SAMP>`Here'</SAMP>
 | 
						|
over the <KBD>f</KBD> from the word <SAMP>`for'</SAMP>.  In this example, the
 | 
						|
<CODE>msgid</CODE> keyword is followed by three strings, which are meant
 | 
						|
to be concatenated.  Concatenating the empty string does not change
 | 
						|
the resulting overall string, but it is a way for us to comply with
 | 
						|
the necessity of <CODE>msgid</CODE> to be followed by a string on the same
 | 
						|
line, while keeping the multi-line presentation left-justified, as
 | 
						|
we find this to be a cleaner disposition.  The empty string could have
 | 
						|
been omitted, but only if the string starting with <SAMP>`Here'</SAMP> was
 | 
						|
promoted on the first line, right after <CODE>msgid</CODE>.<A NAME="DOCF1" HREF="gettext_foot.html#FOOT1">(1)</A> It was not really necessary
 | 
						|
either to switch between the two last quoted strings immediately after
 | 
						|
the newline <SAMP>`\n'</SAMP>; the switch could have occurred after <EM>any</EM>
 | 
						|
other character; we just did it this way because it is neater.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
One should carefully distinguish between end of lines marked as
 | 
						|
<SAMP>`\n'</SAMP> <EM>inside</EM> quotes, which are part of the represented
 | 
						|
string, and end of lines in the PO file itself, outside string quotes,
 | 
						|
which have no effect on the represented string.
 | 
						|
 | 
						|
</P>
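<P>
To make the difference concrete, here is a small Python sketch which
rebuilds the represented string from the quoted pieces following a
keyword.  It is only an illustration: the names <CODE>decode_segment</CODE>
and <CODE>join_po_string</CODE> are hypothetical, and only four escape
sequences are handled, whereas real PO strings follow the full C syntax.
</P>

<PRE>
```python
# Escapes understood by this sketch: newline, tab, backslash, double quote.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def decode_segment(segment):
    """Decode one double-quoted piece into the characters it represents."""
    text = segment.strip()
    if len(text) in (0, 1) or not (text.startswith('"') and text.endswith('"')):
        raise ValueError("not a quoted string: " + segment)
    body = text[1:-1]
    out = []
    i = 0
    while i != len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 == len(body):
                raise ValueError("dangling backslash in: " + segment)
            nxt = body[i + 1]
            if nxt not in ESCAPES:
                raise ValueError("escape not handled by this sketch: " + nxt)
            out.append(ESCAPES[nxt])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def join_po_string(segments):
    """Concatenate the adjacent quoted pieces that follow a keyword."""
    return "".join(decode_segment(s) for s in segments)


segments = [
    '""',
    '"Here is an example of how one might continue a very long string\\n"',
    '"for the common case the string represents multi-line output.\\n"',
]
print(repr(join_po_string(segments)))
```
</PRE>

<P>
The result contains exactly two newlines, those written as
<SAMP>`\n'</SAMP> inside the quotes; the line breaks between the three
pieces contribute nothing.
</P>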
 | 
						|
<P>
 | 
						|
Outside strings, white lines and comments may be used freely.
 | 
						|
Comments start at the beginning of a line with <SAMP>`#'</SAMP> and extend
 | 
						|
until the end of the PO file line.  Comments written by translators
 | 
						|
should have the initial <SAMP>`#'</SAMP> immediately followed by some white
 | 
						|
space.  If the <SAMP>`#'</SAMP> is not immediately followed by white space,
 | 
						|
this comment is most likely generated and managed by specialized GNU
 | 
						|
tools, and might disappear or be replaced unexpectedly when the PO
 | 
						|
file is given to <CODE>tupdate</CODE>.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC10" HREF="gettext_toc.html#TOC10">Main Commands</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
When Emacs finds a PO file in a window, PO mode is activated
 | 
						|
for that window.  This makes the window read-only and establishes a
 | 
						|
po-mode-map, which is a genuine Emacs mode, in the sense that it is
 | 
						|
not derived from text mode in any way.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The main PO commands are those which do not fit in the other categories in
 | 
						|
subsequent sections; they allow for quitting PO mode or managing windows
 | 
						|
in special ways.
 | 
						|
 | 
						|
</P>
 | 
						|
<DL COMPACT>
 | 
						|
 | 
						|
<DT><KBD>u</KBD>
 | 
						|
<DD>
 | 
						|
Undo last modification to the PO file.
 | 
						|
 | 
						|
<DT><KBD>q</KBD>
 | 
						|
<DD>
 | 
						|
Quit processing and save the PO file.
 | 
						|
 | 
						|
<DT><KBD>o</KBD>
 | 
						|
<DD>
 | 
						|
Temporarily leave the PO file window.
 | 
						|
 | 
						|
<DT><KBD>h</KBD>
 | 
						|
<DD>
 | 
						|
Show help about PO mode.
 | 
						|
 | 
						|
<DT><KBD>=</KBD>
 | 
						|
<DD>
 | 
						|
Give some PO file statistics.
 | 
						|
 | 
						|
<DT><KBD>v</KBD>
 | 
						|
<DD>
 | 
						|
Batch validate the format of the whole PO file.
 | 
						|
 | 
						|
</DL>
 | 
						|
 | 
						|
<P>
 | 
						|
The command <KBD>u</KBD> (<CODE>po-undo</CODE>) interfaces to the GNU Emacs
 | 
						|
<EM>undo</EM> facility.  See section `Undoing Changes' in <CITE>The Emacs Editor</CITE>.  Each time <KBD>u</KBD> is typed, modifications the translator
 | 
						|
did to the PO file are undone a little more.  For the purpose of
 | 
						|
undoing, each PO mode command is atomic.  This is especially true for
 | 
						|
the <KBD><KBD>RET</KBD></KBD> command: the whole edit made by a single
 | 
						|
use of this command is undone at once, even if the edit itself
 | 
						|
implied several actions.  However, while in the editing window, one
 | 
						|
can undo the editing work in finer steps.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The command <KBD>q</KBD> (<CODE>po-quit</CODE>) is used when the translator is
 | 
						|
done with the PO file.  If the file has been modified, it is saved
 | 
						|
on disk first.  However, prior to all this, the command checks if
 | 
						|
some untranslated message remains in the PO file and, if yes, the
 | 
						|
translator is asked if she really wants to leave working with this
 | 
						|
PO file.  This is the preferred way of getting rid of an Emacs PO
 | 
						|
file buffer.  Merely killing it through the usual command <KBD>C-x
 | 
						|
k</KBD> (<CODE>kill-buffer</CODE>), say, has the unpleasant effect of leaving a PO
 | 
						|
internal work buffer behind.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The command <KBD>o</KBD> (<CODE>po-other-window</CODE>) is another, softer
 | 
						|
way, to leave PO mode, temporarily.  It just moves the cursor to
 | 
						|
some other Emacs window, and pops one if necessary.  For example, if
 | 
						|
the translator just got PO mode to show some source context in some
 | 
						|
other window, she might discover some apparent bug in the program source
 | 
						|
that needs correction.  This command allows the translator to change
 | 
						|
roles, become a programmer, and have the cursor right in the window
containing the program she wants to modify.
 | 
						|
By later getting the cursor back in the PO file window, or by
 | 
						|
asking Emacs to edit this file once again, PO mode is then recovered.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The command <KBD>h</KBD> (<CODE>po-help</CODE>) displays a summary of all
 | 
						|
available PO mode commands.  The translator should then type any
 | 
						|
character to resume normal PO mode operations.  The command <KBD>?</KBD>
 | 
						|
has the same effect as <KBD>h</KBD>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The command <KBD>=</KBD> (<CODE>po-statistics</CODE>) computes the total number
 | 
						|
of entries in the PO file, the ordinal of the current entry
 | 
						|
(counted from 1), the number of untranslated entries, the number of
 | 
						|
obsolete entries, and displays all these numbers.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The command <KBD>v</KBD> (<CODE>po-validate</CODE>) launches <CODE>msgfmt</CODE> in
 | 
						|
verbose mode over the current PO file.  This command first offers
 | 
						|
to save the current PO file on disk.  The <CODE>msgfmt</CODE> tool, from
 | 
						|
GNU <CODE>gettext</CODE>, has the purpose of creating an MO file out of a
 | 
						|
PO file, and PO mode uses the features of this program for checking
 | 
						|
the overall format of a PO file, as well as all individual entries.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The program <CODE>msgfmt</CODE> runs asynchronously with Emacs, so
 | 
						|
the translator regains control immediately while her PO file
 | 
						|
is being studied.  Error output is collected in the GNU Emacs
 | 
						|
<SAMP>`*compilation*'</SAMP> buffer, displayed in another window.  The regular
 | 
						|
GNU Emacs command <KBD>C-x`</KBD> (<CODE>next-error</CODE>), as well as other
 | 
						|
usual compile commands, allow the translator to reposition quickly to
 | 
						|
the offending parts of the PO file.  Once the cursor is on the line in
 | 
						|
error, the translator may decide on any PO mode action which would
 | 
						|
help correct the error.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC11" HREF="gettext_toc.html#TOC11">Entry Positioning</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
The cursor in a PO file window is almost always part of
 | 
						|
an entry.  The only exceptions are the special case when the cursor
 | 
						|
is after the last entry in the file, or when the PO file is
 | 
						|
empty.  The entry where the cursor is found to be is said to be the
 | 
						|
current entry.  Many PO mode commands operate on the current entry,
 | 
						|
so moving the cursor does more than allowing the translator to browse
 | 
						|
the PO file; this also selects the entry on which commands operate.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Some PO mode commands alter the position of the cursor in a specialized
 | 
						|
way.  A few of those special-purpose positioning commands are described here;
 | 
						|
the others are described in following sections.
 | 
						|
 | 
						|
</P>
 | 
						|
<DL COMPACT>
 | 
						|
 | 
						|
<DT><KBD>.</KBD>
 | 
						|
<DD>
 | 
						|
Redisplay the current entry.
 | 
						|
 | 
						|
<DT><KBD>n</KBD>
 | 
						|
<DD>
 | 
						|
<DT><KBD>SPC</KBD>
 | 
						|
<DD>
 | 
						|
Select the entry after the current one.
 | 
						|
 | 
						|
<DT><KBD>p</KBD>
 | 
						|
<DD>
 | 
						|
<DT><KBD>DEL</KBD>
 | 
						|
<DD>
 | 
						|
Select the entry before the current one.
 | 
						|
 | 
						|
<DT><KBD>&lt;</KBD>
 | 
						|
<DD>
 | 
						|
Select the first entry in the PO file.
 | 
						|
 | 
						|
<DT><KBD>&gt;</KBD>
 | 
						|
<DD>
 | 
						|
Select the last entry in the PO file.
 | 
						|
 | 
						|
<DT><KBD>m</KBD>
 | 
						|
<DD>
 | 
						|
Record the location of the current entry for later use.
 | 
						|
 | 
						|
<DT><KBD>l</KBD>
 | 
						|
<DD>
 | 
						|
Return to a previously saved entry location.
 | 
						|
 | 
						|
<DT><KBD>x</KBD>
 | 
						|
<DD>
 | 
						|
Exchange the current entry location with the previously saved one.
 | 
						|
 | 
						|
</DL>
 | 
						|
 | 
						|
<P>
 | 
						|
Any GNU Emacs command able to reposition the cursor may be used
 | 
						|
to select the current entry in PO mode, including commands which
 | 
						|
move by characters, lines, paragraphs, screens or pages, and search
 | 
						|
commands.  However, there is a kind of standard way to display the
 | 
						|
current entry in PO mode, which usual GNU Emacs commands moving
 | 
						|
the cursor do not especially try to enforce.  The command <KBD>.</KBD>
 | 
						|
(<CODE>po-current-entry</CODE>) has the sole purpose of redisplaying the
 | 
						|
current entry properly, after the current entry has been changed by
 | 
						|
means external to PO mode, or the Emacs screen otherwise altered.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
It is yet to be decided whether PO mode would help the translator, or otherwise
 | 
						|
irritate her, by forcing a more fixed window disposition while she
 | 
						|
is doing her work.  We originally had quite precise ideas about
 | 
						|
how windows should behave, but on the other hand, anyone used to
 | 
						|
GNU Emacs is often happy to keep full control.  Maybe a fixed window
 | 
						|
disposition might be offered as a PO mode option that the translator
 | 
						|
might activate or deactivate at will, so it could be offered on an
 | 
						|
experimental basis.  If nobody feels a real need for using it, or
 | 
						|
a compulsion for writing it, we might as well drop this whole idea.
 | 
						|
The incentive for doing it should come from translators rather than
 | 
						|
programmers, as opinions from an experienced translator are surely
 | 
						|
worth more to me than opinions from programmers <EM>thinking</EM> about
 | 
						|
how <EM>others</EM> should do translation.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The commands <KBD>n</KBD> (<CODE>po-next-entry</CODE>) and <KBD>p</KBD>
 | 
						|
(<CODE>po-previous-entry</CODE>) move the cursor to the entry following,
 | 
						|
or preceding, the current one.  If <KBD>n</KBD> is given while the
 | 
						|
cursor is on the last entry of the PO file, or if <KBD>p</KBD>
 | 
						|
is given while the cursor is on the first entry, no move is done.
 | 
						|
<KBD><KBD>SPC</KBD></KBD> and <KBD><KBD>DEL</KBD></KBD> are alternate keys for <KBD>n</KBD> and
 | 
						|
<KBD>p</KBD>, respectively.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The commands <KBD>&lt;</KBD> (<CODE>po-first-entry</CODE>) and <KBD>&gt;</KBD>
 | 
						|
(<CODE>po-last-entry</CODE>) move the cursor to the first entry, or last
 | 
						|
entry, of the PO file.  When the cursor is located past the last
 | 
						|
entry in a PO file, most PO mode commands will return an error saying
 | 
						|
<SAMP>`After last entry'</SAMP>.  However, the commands <KBD>&lt;</KBD> and <KBD>&gt;</KBD>
 | 
						|
have the special property of being able to work even when the cursor
 | 
						|
is not within any PO file entry, and you may use them for nicely
 | 
						|
correcting this situation.  But even these commands will fail on a
 | 
						|
truly empty PO file.  There are development plans for PO mode for it
 | 
						|
to interactively fill an empty PO file from sources.  See section <A HREF="gettext.html#SEC16">Marking Translatable Strings</A>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The translator may decide, before working at the translation of
 | 
						|
a particular entry, that she needs to browse the remainder of the
 | 
						|
PO file, maybe for finding the terminology or phraseology used
 | 
						|
in related entries.  She can of course use the standard Emacs idioms
 | 
						|
for saving the current cursor location in some register, and use that
 | 
						|
register for getting back, or else use the location ring.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
PO mode offers another approach, by which cursor locations may be saved
 | 
						|
onto a special stack.  The command <KBD>m</KBD> (<CODE>po-push-location</CODE>)
 | 
						|
merely adds the location of current entry to the stack, pushing
 | 
						|
the already saved locations under the new one.  The command
 | 
						|
<KBD>l</KBD> (<CODE>po-pop-location</CODE>) consumes the top stack element and
 | 
						|
repositions the cursor to the entry associated with that top element.
 | 
						|
This position is then lost, for the next <KBD>l</KBD> will move the cursor
 | 
						|
to the previously saved location, and so on until no locations remain
 | 
						|
on the stack.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
If the translator wants the position to be kept on the location stack,
 | 
						|
maybe for taking a mere look at the entry associated with the top
 | 
						|
element, then go elsewhere with the intent of getting back later, she
 | 
						|
ought to use <KBD>m</KBD> immediately after <KBD>l</KBD>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The command <KBD>x</KBD> (<CODE>po-exchange-location</CODE>) simultaneously
 | 
						|
repositions the cursor to the entry associated with the top element of
 | 
						|
the stack of saved locations, and replaces that top element with the
 | 
						|
location of the current entry before the move.  Consequently, repeating
 | 
						|
the <KBD>x</KBD> command alternates between two entries.
 | 
						|
For achieving this, the translator will position the cursor on the
 | 
						|
first entry, use <KBD>m</KBD>, then position to the second entry, and
 | 
						|
merely use <KBD>x</KBD> for making the switch.
 | 
						|
 | 
						|
</P>
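<P>
The behavior of <KBD>m</KBD>, <KBD>l</KBD> and <KBD>x</KBD> can be
modelled by an ordinary stack.  The Python sketch below is only a toy
model, not the PO mode implementation: the class name
<CODE>LocationStack</CODE> is hypothetical, entries are represented by
plain integers, and raising an error on an empty stack is an assumption
of this example.
</P>

<PRE>
```python
class LocationStack:
    """Toy model of the saved-location stack used by m, l and x."""

    def __init__(self):
        self._saved = []

    def push(self, current):
        """Like m: remember the current entry on top of the stack."""
        self._saved.append(current)

    def pop(self):
        """Like l: return the most recently saved entry and forget it."""
        if not self._saved:
            raise IndexError("no saved location")
        return self._saved.pop()

    def exchange(self, current):
        """Like x: go to the top saved entry, saving the current one instead."""
        if not self._saved:
            raise IndexError("no saved location")
        target = self._saved[-1]
        self._saved[-1] = current
        return target


stack = LocationStack()
stack.push(3)                    # m while on entry 3
cursor = 10                      # the translator moves to entry 10
cursor = stack.exchange(cursor)  # x: now on entry 3, entry 10 is saved
cursor = stack.exchange(cursor)  # x again: back on entry 10, entry 3 is saved
print(cursor)
```
</PRE>

<P>
Since <KBD>x</KBD> never changes the depth of the stack, repeating it any
number of times leaves the locations saved underneath untouched.
</P>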
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC12" HREF="gettext_toc.html#TOC12">Normalizing Strings in Entries</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
There are many different ways for encoding a particular string into a
 | 
						|
PO file entry, because there are so many different ways to split and
 | 
						|
quote multi-line strings, and even, to represent special characters
 | 
						|
by backslashed escape sequences.  Some features of PO mode rely on
 | 
						|
the ability for PO mode to scan an already existing PO file for a
 | 
						|
particular string encoded into the <CODE>msgid</CODE> field of some entry.
 | 
						|
Even if PO mode has internally all the built-in machinery for
 | 
						|
implementing this recognition easily, doing it fast is technically
 | 
						|
difficult.  For facilitating a solution to this efficiency problem,
 | 
						|
we decided on a canonical representation for strings.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
A conventional representation of strings in a PO file is currently
 | 
						|
under discussion, and PO mode experiments with a canonical representation.
 | 
						|
Having both <CODE>xgettext</CODE> and PO mode converging towards a uniform
 | 
						|
way of representing equivalent strings would be useful, as the internal
 | 
						|
normalization needed by PO mode could be automatically satisfied
 | 
						|
when using <CODE>xgettext</CODE> from GNU <CODE>gettext</CODE>.  An explicit
 | 
						|
PO mode normalization should then be only necessary for PO files
 | 
						|
imported from elsewhere, or for when the convention itself evolves.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
So, for achieving normalization of at least the strings of a given
 | 
						|
PO file needing a canonical representation, the following PO mode
 | 
						|
command is available:
 | 
						|
 | 
						|
</P>
 | 
						|
<DL COMPACT>
 | 
						|
 | 
						|
<DT><KBD>M-x po-normalize</KBD>
 | 
						|
<DD>
 | 
						|
Tidy the whole PO file by making entries more uniform.
 | 
						|
 | 
						|
</DL>
 | 
						|
 | 
						|
<P>
 | 
						|
The special command <KBD>M-x po-normalize</KBD>, which has no associated
 | 
						|
keys, revises all entries, ensuring that strings of both original
 | 
						|
and translated entries use uniform internal quoting in the PO file.
 | 
						|
It also removes any crumb after the last entry.  This command may be
 | 
						|
useful for PO files freshly imported from elsewhere, or if we ever
 | 
						|
improve on the canonical quoting format we use.  This canonical format
 | 
						|
is not only meant for getting cleaner PO files, but also for greatly
 | 
						|
speeding up <CODE>msgid</CODE> string lookup for some other PO mode commands.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
<KBD>M-x po-normalize</KBD> presently makes three passes over the entries.
 | 
						|
The first implements heuristics for converting PO files for GNU
 | 
						|
<CODE>gettext</CODE> 0.6 and earlier, in which <CODE>msgid</CODE> and <CODE>msgstr</CODE>
 | 
						|
fields were using K&amp;R style C string syntax for multi-line strings.
 | 
						|
These heuristics may fail for comments not related to obsolete
 | 
						|
entries and ending with a backslash; they also depend on subsequent
 | 
						|
passes for finalizing the proper commenting of continued lines for
 | 
						|
obsolete entries.  This first pass might disappear once all old PO
 | 
						|
files have been adjusted.  The second and third passes normalize
 | 
						|
all <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings respectively.  They also
 | 
						|
clean out those trailing backslashes used by XView's <CODE>msgfmt</CODE>
 | 
						|
for continued lines.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Having such an explicit normalizing command allows for importing PO
 | 
						|
files from other sources, but also eases the evolution of the current
 | 
						|
convention, evolution driven mostly by aesthetic concerns, as of now.
 | 
						|
It is easy to make suggested adjustments at a later time, as the
 | 
						|
normalizing command and, eventually, other GNU <CODE>gettext</CODE> tools
 | 
						|
should greatly automate conformance.  A description of the canonical
 | 
						|
string format is given below, for the particular benefit of those not
 | 
						|
having GNU Emacs handy, and who would nevertheless want to handcraft
 | 
						|
their PO files in nice ways.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Right now, in PO mode, strings are single line or multi-line.  A string
 | 
						|
goes multi-line if and only if it has <EM>embedded</EM> newlines, that
 | 
						|
is, if it matches <SAMP>`[^\n]\n+[^\n]'</SAMP>.  So, we would have:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
msgstr "\n\nHello, world!\n\n\n"
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
but, replacing the space by a newline, this becomes:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
msgstr ""
 | 
						|
"\n"
 | 
						|
"\n"
 | 
						|
"Hello,\n"
 | 
						|
"world!\n"
 | 
						|
"\n"
 | 
						|
"\n"
 | 
						|
</PRE>
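<P>
The rule above can be sketched in C for readers who want to apply it
outside Emacs.  This is only an illustration of the stated pattern,
not code taken from PO mode; the function name
<CODE>has_embedded_newline</CODE> is an assumption made for this example.
</P>

```c
#include <stddef.h>

/* Return 1 if S contains an embedded newline, that is, a run of one or
   more newlines with a non-newline character on both sides; else 0.  */
int
has_embedded_newline (const char *s)
{
  size_t i = 0;

  /* Skip leading newlines: no non-newline character precedes them.  */
  while (s[i] == '\n')
    i++;

  for (; s[i] != '\0'; i++)
    if (s[i] == '\n')
      {
        /* Skip the whole run of newlines.  */
        while (s[i] == '\n')
          i++;
        /* The run is embedded only if the string continues after it.  */
        return s[i] != '\0';
      }

  return 0;
}
```

<P>
Under this sketch, the first example string stays on a single line
and the second one goes multi-line.
</P>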
 | 
						|
 | 
						|
<P>
 | 
						|
We are deliberately using an exaggerated example here, to make the
point clearer.  Usually, multi-line strings do not look that bad.
 | 
						|
It is probable that we will implement the following suggestion.
 | 
						|
We might lump together all initial newlines into the empty string,
 | 
						|
and also all newlines introducing empty lines (that is, in a run of
<VAR>n</VAR> > 1 consecutive newlines, the last <VAR>n</VAR>-1 newlines would go
together in a separate string), so making the previous example appear:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
msgstr "\n\n"
 | 
						|
"Hello,\n"
 | 
						|
"world!\n"
 | 
						|
"\n\n"
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
A few small points about string normalization are still undecided;
they will be documented in this manual once these questions are settled.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H1><A NAME="SEC13" HREF="gettext_toc.html#TOC13">Preparing Program Sources</A></H1>
 | 
						|
 | 
						|
<P>
 | 
						|
For the programmer, changes to the C source code fall into three
 | 
						|
categories.  First, you have to make the localization functions
 | 
						|
known to all modules needing message translation.  Second, you should
 | 
						|
properly trigger the operation of GNU <CODE>gettext</CODE> when the program
 | 
						|
initializes, usually from the <CODE>main</CODE> function.  Last, you should
 | 
						|
identify and especially mark all constant strings in your program
 | 
						|
needing translation.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Presuming that your set of programs, or package, has been adjusted
 | 
						|
so all needed GNU <CODE>gettext</CODE> files are available, and your
 | 
						|
<TT>`Makefile'</TT> files are adjusted (see section <A HREF="gettext.html#SEC65">The Maintainer's View</A>), each C module
 | 
						|
having translated C strings should contain the line:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
#include <libintl.h>
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
The remaining changes to your C sources are discussed in the further
 | 
						|
sections of this chapter.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC14" HREF="gettext_toc.html#TOC14">Triggering <CODE>gettext</CODE> Operations</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
The initialization of locale data should be done with more or less
 | 
						|
the same code in every program, as demonstrated below:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
int
 | 
						|
main (argc, argv)
 | 
						|
     int argc;
 | 
						|
     char **argv;
 | 
						|
{
 | 
						|
  ...
 | 
						|
  setlocale (LC_ALL, "");
 | 
						|
  bindtextdomain (PACKAGE, LOCALEDIR);
 | 
						|
  textdomain (PACKAGE);
 | 
						|
  ...
 | 
						|
}
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
<VAR>PACKAGE</VAR> and <VAR>LOCALEDIR</VAR> should be provided either by
 | 
						|
<TT>`config.h'</TT> or by the Makefile.  For now consult the <CODE>gettext</CODE>
 | 
						|
sources for more information.
 | 
						|
 | 
						|
</P>
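<P>
As a rough sketch of how the two macros might be supplied, the
fragment below gives fallback definitions and wraps the three calls in
one function.  The helper name <CODE>init_i18n</CODE>, the package name
<SAMP>`hello'</SAMP> and the directory <TT>`/usr/local/share/locale'</TT> are
assumptions made for this example only; a real package would take its
values from <TT>`config.h'</TT> or from the Makefile.
</P>

```c
#include <libintl.h>
#include <locale.h>

/* Fallbacks for illustration only; a real package would get these
   from config.h or from -D options in the Makefile.  */
#ifndef PACKAGE
# define PACKAGE "hello"
#endif
#ifndef LOCALEDIR
# define LOCALEDIR "/usr/local/share/locale"
#endif

/* Hypothetical helper: select the user's locale and the message
   catalog of this package.  Returns the name of the current text
   domain, or NULL if textdomain failed.  */
const char *
init_i18n (void)
{
  setlocale (LC_ALL, "");
  bindtextdomain (PACKAGE, LOCALEDIR);
  return textdomain (PACKAGE);
}
```

<P>
Calling such a helper once, early in <CODE>main</CODE>, keeps the
initialization identical across the programs of a package.
</P>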
 | 
						|
<P>
 | 
						|
The use of <CODE>LC_ALL</CODE> might not be appropriate for you.
 | 
						|
<CODE>LC_ALL</CODE> includes all locale categories and especially
 | 
						|
<CODE>LC_CTYPE</CODE>.  This latter category determines the character
classes reported by the <CODE>isalnum</CODE> etc. functions from
<TT>`ctype.h'</TT>, which could be wrong, especially for programs that
process some kind of input language.  For example, this would mean that
source code using the ç (c-cedilla character) is runnable in
France but not in the U.S.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
So it is sometimes necessary to replace the <CODE>LC_ALL</CODE> line in the
 | 
						|
code above by a sequence of <CODE>setlocale</CODE> lines
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
{
 | 
						|
  ...
 | 
						|
  setlocale (LC_TIME, "");
 | 
						|
  setlocale (LC_MESSAGES, "");
 | 
						|
  ...
 | 
						|
}
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
or to switch temporarily to another setting for the category in question, and back again.
 | 
						|
 | 
						|
</P>
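<P>
Switching a category away and back could look roughly like the
following.  This is a minimal sketch, assuming that the program wants
the <SAMP>`C'</SAMP> locale rules while it classifies characters of its input
language; the helper name <CODE>isalnum_in_c_locale</CODE> is hypothetical.
</P>

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: classify C with isalnum as the "C" locale
   would, whatever LC_CTYPE setting is currently in effect.  */
int
isalnum_in_c_locale (int c)
{
  const char *current = setlocale (LC_CTYPE, NULL);
  char *saved = NULL;
  int result;

  /* Copy the name: a later setlocale call may overwrite the buffer.  */
  if (current != NULL)
    {
      saved = malloc (strlen (current) + 1);
      if (saved != NULL)
        strcpy (saved, current);
    }

  setlocale (LC_CTYPE, "C");
  result = isalnum ((unsigned char) c) != 0;

  /* Switch back to the previous LC_CTYPE setting, if it was saved.  */
  if (saved != NULL)
    {
      setlocale (LC_CTYPE, saved);
      free (saved);
    }

  return result;
}
```

<P>
Saving and restoring the locale on every call is costly; a program
that classifies many characters would rather switch once around the
whole scanning phase.
</P>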
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC15" HREF="gettext_toc.html#TOC15">How Marks Appear in Sources</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
The C sources should mark all strings requiring translation.  Marking
 | 
						|
is done in such a way that each translatable string appears to be
 | 
						|
the sole argument of some function or preprocessor macro.  There are
 | 
						|
only a few such possible functions or macros meant for translation,
 | 
						|
and their names are said to be marking keywords.  The marking is
 | 
						|
attached to strings themselves, rather than to what we do with them.
 | 
						|
This approach has more uses.  A blatant example is an error message
 | 
						|
produced by formatting.  The format string needs translation, as
 | 
						|
well as some strings inserted through some <SAMP>`%s'</SAMP> specification
 | 
						|
in the format, while the result from <CODE>sprintf</CODE> may have so many
 | 
						|
different instances that it is impractical to list them all in some
 | 
						|
<SAMP>`error_string_out()'</SAMP> routine, say.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
This marking operation has two goals.  The first goal of marking
 | 
						|
is for triggering the retrieval of the translation, at run time.
 | 
						|
The keywords are possibly resolved into a routine able to dynamically
return the proper translation, as far as possible or wanted, for the
argument string.  Most localizable strings are found in executable
positions, that is, assigned to variables or given as parameters to
 | 
						|
functions.  But this is not universal usage, and some translatable
 | 
						|
strings appear in structured initializations.  See section <A HREF="gettext.html#SEC17">Special Cases of Translatable Strings</A>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The second goal of the marking operation is to help <CODE>xgettext</CODE>
properly extract all translatable strings when it scans a set
 | 
						|
of program sources and produces PO file templates.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The canonical keyword for marking translatable strings is
 | 
						|
<SAMP>`gettext'</SAMP>; it gave its name to the whole GNU <CODE>gettext</CODE>
 | 
						|
package.  For packages making only light use of the <SAMP>`gettext'</SAMP>
 | 
						|
keyword, macro or function, it is easily used <EM>as is</EM>.  However,
 | 
						|
for packages using the <CODE>gettext</CODE> interface more heavily, it
 | 
						|
is usually more convenient to give the main keyword a shorter, less
 | 
						|
obtrusive name.  Indeed, the keyword might appear on a lot of strings
 | 
						|
all over the package, and programmers usually neither want nor need
their program sources to remind them loudly, all the time, that they
 | 
						|
are internationalized.  Further, a long keyword has the disadvantage
 | 
						|
of using more horizontal space, forcing more indentation work on
 | 
						|
sources for those trying to keep them within 79 or 80 columns.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Many GNU packages use <SAMP>`_'</SAMP> (a simple underline) as a keyword,
 | 
						|
and write <SAMP>`_("Translatable string")'</SAMP> instead of <SAMP>`gettext
 | 
						|
("Translatable string")'</SAMP>.  Further, the usual GNU coding rule
 | 
						|
requiring a space between the keyword and the opening
 | 
						|
parenthesis is relaxed, in practice, for this particular usage.
 | 
						|
So, the textual overhead per translatable string is reduced to
 | 
						|
only three characters: the underline and the two parentheses.
 | 
						|
However, even if GNU <CODE>gettext</CODE> uses this convention internally,
 | 
						|
it does not offer it officially.  The real, genuine keyword is truly
 | 
						|
<SAMP>`gettext'</SAMP> indeed.  It is fairly easy for those wanting to use
 | 
						|
<SAMP>`_'</SAMP> instead of <SAMP>`gettext'</SAMP> to declare:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
#include <libintl.h>
 | 
						|
#define _(String) gettext (String)
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
instead of merely using <SAMP>`#include <libintl.h>'</SAMP>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Later on, the maintenance is relatively easy.  If, as a programmer,
 | 
						|
you add or modify a string, you will have to ask yourself if the
 | 
						|
new or altered string requires translation, and include it within
 | 
						|
<SAMP>`_()'</SAMP> if you think it should be translated.  <SAMP>`"%s: %d"'</SAMP> is
 | 
						|
an example of a string <EM>not</EM> requiring translation!
 | 
						|
 | 
						|
</P>
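<P>
To make the distinction concrete, here is a small sketch that marks a
natural-language message with <SAMP>`_()'</SAMP> and leaves a pure format
string unmarked.  The function names <CODE>status_text</CODE> and
<CODE>format_pair</CODE>, and the message texts, are invented for this
example.
</P>

```c
#include <libintl.h>
#include <stdio.h>

#define _(String) gettext (String)

/* Hypothetical helper: both messages are natural language, so each
   one is wrapped in _().  */
const char *
status_text (int ok)
{
  return ok ? _("Operation completed") : _("Operation failed");
}

/* Hypothetical helper: "%s: %d" contains no words to translate, so
   the format string is left unmarked.  */
int
format_pair (char *buf, size_t size, const char *name, int value)
{
  return snprintf (buf, size, "%s: %d", name, value);
}
```

<P>
When no message catalog is found, <CODE>gettext</CODE> returns its argument
unchanged, so such code keeps working before any translation exists.
</P>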
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC16" HREF="gettext_toc.html#TOC16">Marking Translatable Strings</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
In PO mode, one set of features is meant more for the programmer than
 | 
						|
for the translator, and allows him to interactively mark which strings,
 | 
						|
in a set of program sources, are translatable, and which are not.
 | 
						|
Even if it is a fairly easy job for a programmer to find and mark
 | 
						|
such strings by other means, using any editor of his choice, PO mode
 | 
						|
makes this work more comfortable.  Further, this gives translators
 | 
						|
who feel a little like programmers, or programmers who feel a little
 | 
						|
like translators, a tool letting them work at marking translatable
 | 
						|
strings in the program sources, while simultaneously producing a set of
 | 
						|
translations in some language, for the package being internationalized.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The set of program sources targeted by the PO mode commands described
here should have an Emacs tags table constructed for your project,
 | 
						|
prior to using these PO file commands.  This is easy to do.  In any
 | 
						|
shell window, change the directory to the root of your project, then
 | 
						|
execute a command resembling:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
etags src/*.[hc] lib/*.[hc]
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
presuming here you want to process all <TT>`.h'</TT> and <TT>`.c'</TT> files
 | 
						|
from the <TT>`src/'</TT> and <TT>`lib/'</TT> directories.  This command will
 | 
						|
explore all said files and create a <TT>`TAGS'</TT> file in your root
 | 
						|
directory, somewhat summarizing the contents using a special file
 | 
						|
format Emacs can understand.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
For official GNU packages which follow the GNU coding standard there is
 | 
						|
a make goal <CODE>tags</CODE> or <CODE>TAGS</CODE> which constructs the tags files in
 | 
						|
all directories and for all files containing source code.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Once your <TT>`TAGS'</TT> file is ready, the following commands assist
 | 
						|
the programmer in marking translatable strings in his set of sources.
 | 
						|
But these commands are necessarily driven from within a PO file
 | 
						|
window, and it is likely that you do not even have such a PO file yet.
 | 
						|
This is not a problem at all, as you may safely open a new, empty PO
 | 
						|
file, mainly for using these commands.  This empty PO file will slowly
 | 
						|
fill in while you mark strings as translatable in your program sources.
 | 
						|
 | 
						|
</P>
 | 
						|
<DL COMPACT>
 | 
						|
 | 
						|
<DT><KBD>,</KBD>
 | 
						|
<DD>
 | 
						|
Search through program sources for a string which looks like a
 | 
						|
candidate for translation.
 | 
						|
 | 
						|
<DT><KBD>M-,</KBD>
 | 
						|
<DD>
 | 
						|
Mark the last string found with <SAMP>`_()'</SAMP>.
 | 
						|
 | 
						|
<DT><KBD>M-.</KBD>
 | 
						|
<DD>
 | 
						|
Mark the last string found with a keyword taken from a set of possible
 | 
						|
keywords.  This command with a prefix allows some management of these
 | 
						|
keywords.
 | 
						|
 | 
						|
</DL>
 | 
						|
 | 
						|
<P>
 | 
						|
The <KBD>,</KBD> (<CODE>po-tags-search</CODE>) command searches for the next
 | 
						|
occurrence of a string which looks like a possible candidate for
 | 
						|
translation, and displays the program source in another Emacs window,
 | 
						|
positioned in such a way that the string is near the top of this other
 | 
						|
window.  If the string is too big to fit whole in this window, it is
 | 
						|
rather positioned so only its end is shown.  In any case, the cursor
 | 
						|
is left in the PO file window.  If the shown string would be better
 | 
						|
presented differently in different native languages, you may mark it
 | 
						|
using <KBD>M-,</KBD> or <KBD>M-.</KBD>.  Otherwise, you might rather ignore it
 | 
						|
and skip to the next string by merely repeating the <KBD>,</KBD> command.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
A string is a good candidate for translation if it contains a sequence
 | 
						|
of three or more letters.  A string containing at most two letters in
 | 
						|
a row will be considered as a candidate if it has more letters than
 | 
						|
non-letters.  The command disregards strings containing no letters,
 | 
						|
or isolated letters only.  It also disregards strings within comments,
 | 
						|
or strings already marked with some keyword PO mode knows (see below).
 | 
						|
 | 
						|
</P>
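<P>
The letter-counting part of this heuristic can be approximated in C as
follows.  This is a sketch of the rule as stated above, not the Emacs
Lisp code of PO mode; it ignores the comment and keyword checks, and
the function name <CODE>looks_translatable</CODE> is hypothetical.
</P>

```c
#include <ctype.h>

/* Hypothetical helper: return 1 if S looks like a candidate for
   translation according to the letter-counting rule, else 0.  */
int
looks_translatable (const char *s)
{
  int letters = 0;   /* total number of letters */
  int others = 0;    /* total number of non-letters */
  int run = 0;       /* length of the current run of letters */
  int longest = 0;   /* length of the longest run of letters */

  for (; *s != '\0'; s++)
    {
      if (isalpha ((unsigned char) *s))
        {
          letters++;
          if (++run > longest)
            longest = run;
        }
      else
        {
          others++;
          run = 0;
        }
    }

  /* Three or more letters in a row: a good candidate.  */
  if (longest >= 3)
    return 1;
  /* At most two in a row: candidate only if letters dominate.  */
  if (longest == 2)
    return letters > others;
  /* No letters, or isolated letters only: disregarded.  */
  return 0;
}
```

<P>
With this sketch, <SAMP>`Hello, world!'</SAMP> is a candidate while
<SAMP>`%s: %d'</SAMP> is not, since its only letters are isolated.
</P>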
 | 
						|
<P>
 | 
						|
If you have never told Emacs about some <TT>`TAGS'</TT> file to use, the
 | 
						|
command will request that you specify one from the minibuffer, the
 | 
						|
first time you use the command.  You may later change your <TT>`TAGS'</TT>
 | 
						|
file by using the regular Emacs command <KBD>M-x visit-tags-table</KBD>,
 | 
						|
which will ask you to name the precise <TT>`TAGS'</TT> file you want
 | 
						|
to use.  See section `Tag Tables' in <CITE>The Emacs Editor</CITE>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Each time you use the <KBD>,</KBD> command, the search resumes where it was
 | 
						|
left over by the previous search, and goes through all program sources,
 | 
						|
obeying the <TT>`TAGS'</TT> file, until all sources have been processed.
 | 
						|
However, by giving a prefix argument to the command (<KBD>C-u ,</KBD>),
you may request that the search be restarted all over again
 | 
						|
from the first program source; but in this case, strings that you
 | 
						|
recently marked as translatable will be automatically skipped.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Using this <KBD>,</KBD> command does not prevent the use of other regular
 | 
						|
Emacs tags commands.  For example, regular <CODE>tags-search</CODE> or
 | 
						|
<CODE>tags-query-replace</CODE> commands may be used without disrupting the
 | 
						|
independent <KBD>,</KBD> search sequence.  However, as implemented, the
 | 
						|
<EM>initial</EM> <KBD>,</KBD> command (or the <KBD>,</KBD> command used with a
prefix) might also reinitialize the regular Emacs tags searching to the
first tags file; this reinitialization might be considered spurious.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The <KBD>M-,</KBD> (<CODE>po-mark-translatable</CODE>) command will mark the
 | 
						|
recently found string with the <SAMP>`_'</SAMP> keyword.  The <KBD>M-.</KBD>
 | 
						|
(<CODE>po-select-mark-and-mark</CODE>) command will request that you type
 | 
						|
one keyword from the minibuffer and use that keyword for marking
 | 
						|
the string.  Both commands will automatically create a new PO file
 | 
						|
untranslated entry for the string being marked, and make it the
 | 
						|
current entry (making it easy for you to immediately proceed to its
 | 
						|
translation, if you feel like doing it right away).  It is possible
 | 
						|
that the modifications made to the program source by <KBD>M-,</KBD> or
 | 
						|
<KBD>M-.</KBD> render some source line longer than 80 columns, forcing you
 | 
						|
to break and re-indent this line differently.  You may use the <KBD>o</KBD>
 | 
						|
command from PO mode, or any other window changing command from
 | 
						|
GNU Emacs, to break out into the program source window, and do any
 | 
						|
needed adjustments.  You will have to use some regular Emacs command
 | 
						|
to return the cursor to the PO file window, if you then want to use
 | 
						|
<KBD>,</KBD> for the next string, say.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The <KBD>M-.</KBD> command has a few built-in speedups, so you do not
 | 
						|
have to explicitly type all keywords all the time.  The first such
 | 
						|
speedup is that you are presented with a <EM>preferred</EM> keyword,
 | 
						|
which you may accept by merely typing <KBD>RET</KBD> at the prompt.
 | 
						|
The second speedup is that you may type any non-ambiguous prefix of the
 | 
						|
keyword you really mean, and the command will complete it automatically
 | 
						|
for you.  This also means that PO mode has to <EM>know</EM> all
 | 
						|
your possible keywords, and that it will not accept mistyped keywords.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
If you reply <KBD>?</KBD> to the keyword request, the command gives a
 | 
						|
list of all known keywords, from which you may choose.  When the
 | 
						|
command is prefixed by an argument (<KBD>C-u M-.</KBD>), it inhibits
 | 
						|
updating any program source or PO file buffer, and does some simple
 | 
						|
keyword management instead.  In this case, the command asks for a
 | 
						|
keyword, written in full, which becomes a new allowed keyword for
 | 
						|
later <KBD>M-.</KBD> commands.  Moreover, this new keyword automatically
 | 
						|
becomes the <EM>preferred</EM> keyword for later commands.  By typing
 | 
						|
an already known keyword in response to <KBD>C-u M-.</KBD>, one merely
 | 
						|
changes the <EM>preferred</EM> keyword and does nothing more.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
All keywords known for <KBD>M-.</KBD> are recognized by the <KBD>,</KBD> command
 | 
						|
when scanning for strings, and strings already marked by any of those
 | 
						|
known keywords are automatically skipped.  If many PO files are opened
 | 
						|
simultaneously, each one has its own independent set of known keywords.
 | 
						|
There is no provision in PO mode, currently, for deleting a known
 | 
						|
keyword; you have to quit the file (maybe using <KBD>q</KBD>) and reopen
 | 
						|
it afresh.  When a PO file is newly brought up in an Emacs window, only
 | 
						|
<SAMP>`gettext'</SAMP> and <SAMP>`_'</SAMP> are known as keywords, and <SAMP>`gettext'</SAMP>
 | 
						|
is preferred for the <KBD>M-.</KBD> command.  In fact, it is not useful to
prefer <SAMP>`_'</SAMP>, as this one is already built into the <KBD>M-,</KBD> command.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC17" HREF="gettext_toc.html#TOC17">Special Cases of Translatable Strings</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
The attentive reader might now point out that it is not always possible
 | 
						|
to mark translatable strings with <CODE>gettext</CODE> or something like it.
 | 
						|
Consider the following case:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
{
 | 
						|
  static const char *messages[] = {
 | 
						|
    "some very meaningful message",
 | 
						|
    "and another one"
 | 
						|
  };
 | 
						|
  const char *string;
 | 
						|
  ...
 | 
						|
  string
 | 
						|
    = index > 1 ? "a default message" : messages[index];
 | 
						|
 | 
						|
  fputs (string, stdout);
 | 
						|
  ...
 | 
						|
}
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
While it is no problem to mark the string <CODE>"a default message"</CODE> it
 | 
						|
is not possible to mark the string initializers for <CODE>messages</CODE>.
 | 
						|
What is to be done?  We have to fulfill two tasks.  First we have to mark the
 | 
						|
strings so that the <CODE>xgettext</CODE> program (see section <A HREF="gettext.html#SEC19">Invoking the <CODE>xgettext</CODE> Program</A>)
 | 
						|
can find them, and second we have to translate the strings at runtime
 | 
						|
before printing them.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The first task can be fulfilled by creating a new keyword, which names a
 | 
						|
no-op.  For the second we have to mark all access points to a string
 | 
						|
from the array.  So one solution can look like this:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
#define gettext_noop(String) (String)
 | 
						|
 | 
						|
{
 | 
						|
  static const char *messages[] = {
 | 
						|
    gettext_noop ("some very meaningful message"),
 | 
						|
    gettext_noop ("and another one")
 | 
						|
  };
 | 
						|
  const char *string;
 | 
						|
  ...
 | 
						|
  string
 | 
						|
    = index > 1 ? gettext ("a default message") : gettext (messages[index]);
 | 
						|
 | 
						|
  fputs (string, stdout);
 | 
						|
  ...
 | 
						|
}
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
Please convince yourself that the string which is written by
 | 
						|
<CODE>fputs</CODE> is translated in any case.  How to make <CODE>xgettext</CODE> recognize
 | 
						|
the additional keyword <CODE>gettext_noop</CODE> is explained in section <A HREF="gettext.html#SEC19">Invoking the <CODE>xgettext</CODE> Program</A>.
 | 
						|
 | 
						|
</P>
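<P>
For readers who want to compile the fragment above, it could be
wrapped in a function as in the following sketch.  The function name
<CODE>message_for</CODE> is an assumption made for this example, and the
caller is expected to pass a non-negative <CODE>index</CODE>.
</P>

```c
#include <libintl.h>

/* A no-op keyword: it only marks the string for extraction.  */
#define gettext_noop(String) (String)

/* Hypothetical wrapper around the fragment: return the translated
   message selected by INDEX, which must not be negative.  */
const char *
message_for (int index)
{
  static const char *messages[] = {
    gettext_noop ("some very meaningful message"),
    gettext_noop ("and another one")
  };

  /* Same test as in the fragment: any index above 1 selects the
     default message.  Both branches translate at run time.  */
  return index > 1 ? gettext ("a default message")
                   : gettext (messages[index]);
}
```

<P>
Every path through the function ends in a call to <CODE>gettext</CODE>,
which is the property the reader is asked to check.
</P>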
 | 
						|
<P>
 | 
						|
The above is of course not the only solution.  You could also come up
 | 
						|
with the following one:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
#define gettext_noop(String) (String)
 | 
						|
 | 
						|
{
 | 
						|
  static const char *messages[] = {
 | 
						|
    gettext_noop ("some very meaningful message"),
 | 
						|
    gettext_noop ("and another one")
 | 
						|
  };
 | 
						|
  const char *string;
 | 
						|
  ...
 | 
						|
  string
 | 
						|
    = index > 1 ? gettext_noop ("a default message") : messages[index];
 | 
						|
 | 
						|
  fputs (gettext (string), stdout);
 | 
						|
  ...
 | 
						|
}
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
But this has some drawbacks.  First the programmer has to take care that
 | 
						|
he uses <CODE>gettext_noop</CODE> for the string <CODE>"a default message"</CODE>.
 | 
						|
A use of <CODE>gettext</CODE> could have in rare cases unpredictable results.
 | 
						|
The second reason is found in the internals of the GNU <CODE>gettext</CODE>
 | 
						|
Library which will make this solution less efficient.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
One advantage is that you need not perform control flow analysis to make
sure the output is really translated in any case.  But this analysis is
generally not very difficult.  If it should be difficult in some
situation, you can use this second method there.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H1><A NAME="SEC18" HREF="gettext_toc.html#TOC18">Making the Initial PO File</A></H1>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC19" HREF="gettext_toc.html#TOC19">Invoking the <CODE>xgettext</CODE> Program</A></H2>
 | 
						|
 | 
						|
 | 
						|
<PRE>
 | 
						|
xgettext [<VAR>option</VAR>] <VAR>inputfile</VAR> ...
 | 
						|
</PRE>
 | 
						|
 | 
						|
<DL COMPACT>
 | 
						|
 | 
						|
<DT><SAMP>`-a'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--extract-all'</SAMP>
 | 
						|
<DD>
 | 
						|
Extract all strings.
 | 
						|
 | 
						|
<DT><SAMP>`-c [<VAR>tag</VAR>]'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--add-comments[=<VAR>tag</VAR>]'</SAMP>
 | 
						|
<DD>
 | 
						|
Place comment block with <VAR>tag</VAR> (or those preceding keyword lines)
 | 
						|
in output file.
 | 
						|
 | 
						|
<DT><SAMP>`-C'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--c++'</SAMP>
 | 
						|
<DD>
 | 
						|
Recognize C++ style comments.
 | 
						|
 | 
						|
<DT><SAMP>`-d <VAR>name</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--default-domain=<VAR>name</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
Use <TT>`<VAR>name</VAR>.po'</TT> for output (instead of <TT>`messages.po'</TT>).
 | 
						|
 | 
						|
<DT><SAMP>`-D <VAR>directory</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--directory=<VAR>directory</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
Change to <VAR>directory</VAR> before beginning to search and scan source
 | 
						|
files.  The resulting <TT>`.po'</TT> file will be written relative to the
 | 
						|
original directory, though.
 | 
						|
 | 
						|
<DT><SAMP>`-f <VAR>file</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--files-from=<VAR>file</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
Read the names of the input files from <VAR>file</VAR> instead of getting
 | 
						|
them from the command line.
 | 
						|
 | 
						|
<DT><SAMP>`-h'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--help'</SAMP>
 | 
						|
<DD>
 | 
						|
Display this help and exit.
 | 
						|
 | 
						|
<DT><SAMP>`-I <VAR>list</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--input-path=<VAR>list</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
List of directories searched for input files.
 | 
						|
 | 
						|
<DT><SAMP>`-j'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--join-existing'</SAMP>
 | 
						|
<DD>
 | 
						|
Join messages with existing file.
 | 
						|
 | 
						|
<DT><SAMP>`-k <VAR>word</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--keyword[=<VAR>word</VAR>]'</SAMP>
 | 
						|
<DD>
 | 
						|
Additional keyword to be looked for (without <VAR>word</VAR> means not to
 | 
						|
use default keywords).
 | 
						|
 | 
						|
The default keywords, which are always looked for if not explicitly
 | 
						|
disabled, are <CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE> and
 | 
						|
<CODE>gettext_noop</CODE>.
 | 
						|
 | 
						|
<DT><SAMP>`-m [<VAR>string</VAR>]'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--msgstr-prefix[=<VAR>string</VAR>]'</SAMP>
 | 
						|
<DD>
 | 
						|
Use <VAR>string</VAR> or "" as prefix for msgstr entries.
 | 
						|
 | 
						|
<DT><SAMP>`-M [<VAR>string</VAR>]'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--msgstr-suffix[=<VAR>string</VAR>]'</SAMP>
 | 
						|
<DD>
 | 
						|
Use <VAR>string</VAR> or "" as suffix for msgstr entries.
 | 
						|
 | 
						|
<DT><SAMP>`--no-location'</SAMP>
 | 
						|
<DD>
 | 
						|
Do not write <SAMP>`#: <VAR>filename</VAR>:<VAR>line</VAR>'</SAMP> lines.
 | 
						|
 | 
						|
<DT><SAMP>`-n'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--add-location'</SAMP>
 | 
						|
<DD>
 | 
						|
Generate <SAMP>`#: <VAR>filename</VAR>:<VAR>line</VAR>'</SAMP> lines (default).
 | 
						|
 | 
						|
<DT><SAMP>`--omit-header'</SAMP>
 | 
						|
<DD>
 | 
						|
Don't write header with <SAMP>`msgid ""'</SAMP> entry.
 | 
						|
 | 
						|
This is useful for testing purposes because it eliminates a source
 | 
						|
of variance for generated <CODE>.gmo</CODE> files.  We can ship some of
 | 
						|
these files in the GNU <CODE>gettext</CODE> package, and the result of
 | 
						|
regenerating them through <CODE>msgfmt</CODE> should yield the same values.
 | 
						|
 | 
						|
<DT><SAMP>`-p <VAR>dir</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--output-dir=<VAR>dir</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
Output files will be placed in directory <VAR>dir</VAR>.
 | 
						|
 | 
						|
<DT><SAMP>`-s'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--sort-output'</SAMP>
 | 
						|
<DD>
 | 
						|
Generate sorted output and remove duplicates.
 | 
						|
 | 
						|
<DT><SAMP>`--strict'</SAMP>
 | 
						|
<DD>
 | 
						|
Write out strict Uniforum conforming PO file.
 | 
						|
 | 
						|
<DT><SAMP>`-v'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--version'</SAMP>
 | 
						|
<DD>
 | 
						|
Output version information and exit.
 | 
						|
 | 
						|
<DT><SAMP>`-x <VAR>file</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--exclude-file=<VAR>file</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
Entries from <VAR>file</VAR> are not extracted.
 | 
						|
 | 
						|
</DL>
 | 
						|
 | 
						|
<P>
 | 
						|
Search path for supplementary PO files is:
 | 
						|
<TT>`/usr/local/share/nls/src/'</TT>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
If <VAR>inputfile</VAR> is <SAMP>`-'</SAMP>, standard input is read.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
This implementation of <CODE>xgettext</CODE> is able to process a few awkward
 | 
						|
cases, like strings in preprocessor macros, ANSI concatenation of
 | 
						|
adjacent strings, and escaped end of lines for continued strings.
 | 
						|
 | 
						|
</P>
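<P>
The three awkward cases might look like this in a source file.  The
function names and message texts are invented for this example, and
the way <CODE>xgettext</CODE> records each string is not shown here; the
sketch only illustrates the C constructs being referred to.
</P>

```c
#include <libintl.h>

/* A translatable string hidden inside a preprocessor macro.  */
#define GREETING gettext ("Hello, world!")

const char *
greeting (void)
{
  return GREETING;
}

/* ANSI concatenation: adjacent literals form a single string.  */
const char *
usage_text (void)
{
  return gettext ("Usage: prog [OPTION]...\n"
                  "Try `prog --help' for more information.\n");
}

/* An escaped end of line continues the string on the next line.  */
const char *
continued_text (void)
{
  return gettext ("first half, \
second half");
}
```

<P>
In each case the compiler sees one complete string literal as the
argument of <CODE>gettext</CODE>.
</P>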
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC20" HREF="gettext_toc.html#TOC20">C Sources Context</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
PO mode is particularly powerful when used with PO files
 | 
						|
created through GNU <CODE>gettext</CODE> utilities, as those utilities
 | 
						|
insert special comments in the PO files they generate.
 | 
						|
Some of these special comments relate the PO file entry to
 | 
						|
exactly where the untranslated string appears in the program sources.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
When the translator gets to an untranslated entry, she is fairly
 | 
						|
often faced with an original string which is not as informative as
 | 
						|
it normally should be, being succinct, cryptic, or otherwise ambiguous.
 | 
						|
Before choosing how to translate the string, she needs to understand
 | 
						|
better what the string really means and how tight the translation has
 | 
						|
to be.  Most of the time, when problems arise, the only way left to make
 | 
						|
her judgment is looking at the true program sources from where this
 | 
						|
string originated, searching for surrounding comments the programmer
 | 
						|
might have put in there, and looking around for helping clues of
 | 
						|
<EM>any</EM> kind.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
Surely, when looking at program sources, the translator will receive
more help if she is a fluent programmer.  However, even if she is
not versed in programming and feels a little lost in C code, the
translator should not be shy about taking a look once in a while.
It is most probable that she will still be able to find some of the
hints she needs.  She will quickly learn not to feel uncomfortable
in program code, paying more attention to the programmer's comments,
variable and function names (if the programmer chose them well), and
overall organization, than to the programming itself.

</P>
<P>
The following commands are meant to help the translator get
program source context for a PO file entry.

</P>
<DL COMPACT>

<DT><KBD>c</KBD>
<DD>
Resume the display of a program source context, or cycle through them.

<DT><KBD>M-c</KBD>
<DD>
Display a program source context selected by menu.

<DT><KBD>d</KBD>
<DD>
Add a directory to the search path for source files.

<DT><KBD>M-d</KBD>
<DD>
Delete a directory from the search path for source files.

</DL>
<P>
The commands <KBD>c</KBD> (<CODE>po-cycle-reference</CODE>) and <KBD>M-c</KBD>
(<CODE>po-select-reference</CODE>) both open another window displaying
some program source file, already positioned in such a way that
it shows an actual use of the current string to translate.  By doing
so, the command gives program source context for the string.  But if
the entry has no source context references, or if none of the references
can be resolved along the search path for program sources, then the
command reports this as an error.

</P>
<P>
Even though <KBD>c</KBD> (or <KBD>M-c</KBD>) opens a new window, the cursor stays
in the PO file window.  If the translator really wants to
get into the program source window, she ought to do it explicitly,
maybe by using the command <KBD>o</KBD>.

</P>
<P>
When <KBD>c</KBD> is typed for the first time, or for a PO file entry which
is different from the last one used for getting source context, then the
command reacts by giving the first context available for this entry,
if any.  If some context has already been recently displayed for the
current PO file entry, and the translator wandered off to do other
things, typing <KBD>c</KBD> again will merely resume, in another window,
the context last displayed.  In particular, if the translator moved
the cursor away from the context in the source file, the command will
bring the cursor back to the context.  When <KBD>c</KBD> is used many times
in a row, with no other intervening commands, PO mode will cycle to
the next available contexts for this particular entry, getting back
to the first context once the last has been shown.

</P>
<P>
The command <KBD>M-c</KBD> behaves differently.  Instead of cycling through
references, it lets the translator choose a particular reference among
many, and displays that reference.  It is best used with completion:
if the translator types <KBD>TAB</KBD> immediately after <KBD>M-c</KBD>, in
response to the question, she will be offered a menu of all possible
references, as a reminder of which answers are acceptable.
This command is useful only when there are really many contexts
available for a single string to translate.

</P>
<P>
Program source files are usually found relative to where the PO
file stands.  As a special provision, when this fails, the file is
also looked for relative to the directory immediately above it.
Those two cases take proper care of most PO files.  However, it might
happen that a PO file has been moved, or is edited in a different
place than its normal location.  When this happens, the translator
should tell PO mode in which directory the genuine PO file normally
sits.  Many such directories may be specified, and all together, they
constitute what is called the <STRONG>search path</STRONG> for program sources.
The command <KBD>d</KBD> (<CODE>po-add-path</CODE>) is used to interactively
enter a new directory at the front of the search path, and the command
<KBD>M-d</KBD> (<CODE>po-delete-path</CODE>) is used to select, with completion,
one of the directories she no longer wants on the search path.

</P>
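<P>
The lookup order described above can be sketched roughly as follows.
This is an illustration in Python, not PO mode's actual Emacs Lisp code.
The name <CODE>resolve_source</CODE> and its parameters are hypothetical,
and trying the search path directories only after the two default
locations is an assumption, since the text does not state the order.
</P>

```python
import os


def resolve_source(reference, po_file, search_path=()):
    """Return the first existing path for `reference`, or None.

    Hypothetical helper: tries the PO file's own directory, then the
    directory immediately above it, then (as an assumption) each
    directory of the search path, front first.
    """
    po_dir = os.path.dirname(os.path.abspath(po_file))
    bases = [po_dir, os.path.dirname(po_dir), *search_path]
    for base in bases:
        candidate = os.path.join(base, reference)
        if os.path.isfile(candidate):
            return candidate
    return None  # every reference unresolved: PO mode reports an error here
```

<P>
With a PO file in <CODE>po/</CODE> and sources in a sibling <CODE>src/</CODE>
directory, a reference such as <CODE>src/main.c</CODE> would be found through
the second base, the directory above the PO file.
</P>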
<H2><A NAME="SEC21" HREF="gettext_toc.html#TOC21">Using Translation Compendiums</A></H2>

<P>
Compendiums are yet to be implemented.

</P>
<P>
An upcoming PO mode feature will let the translator maintain a
compendium of translations already made.  A <STRONG>compendium</STRONG>
is a special PO file containing a set of translations recurring in
many different packages.  The translator will be given commands for
adding entries to her compendium, and later initializing untranslated
entries, or updating already translated entries, from translations
kept in the compendium.  For this to work, however, the compendium
would have to be normalized.  See section <A HREF="gettext.html#SEC12">Normalizing Strings in Entries</A>.

</P>
<H1><A NAME="SEC22" HREF="gettext_toc.html#TOC22">Updating Existing PO Files</A></H1>


<H2><A NAME="SEC23" HREF="gettext_toc.html#TOC23">Invoking the <CODE>tupdate</CODE> Program</A></H2>

<PRE>
tupdate --help
tupdate --version
tupdate <VAR>new</VAR> <VAR>old</VAR>
</PRE>
<P>
File <VAR>new</VAR> is the most recently created PO file (generally by
<CODE>xgettext</CODE>).  It need not contain any translations.  File
<VAR>old</VAR> is the PO file holding the old translations, which will
be carried over to the newly created file as long as they still match.

</P>
<P>
When English messages change in the programs, this is reflected in
the PO file as extracted by <CODE>xgettext</CODE>.  In large messages, that
can be hard to detect, and will obviously result in an incomplete
translation.  One of the virtues of <CODE>tupdate</CODE> is that it detects
such changes, saving the previous translation into a PO file comment,
thus marking the entry as obsolete, and giving the modified string with
an empty translation, that is, marking the entry as untranslated.

</P>
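<P>
The merge just described can be modelled, very roughly, by the Python
sketch below.  It is not the implementation of <CODE>tupdate</CODE>: the
function name <CODE>merge_catalogs</CODE> is hypothetical, catalogs are
reduced to plain strings, and exact matching on the original string is
an assumption made for the sake of the illustration.
</P>

```python
def merge_catalogs(new_msgids, old_translations):
    """Merge old translations into a freshly extracted catalog.

    Returns (active, obsolete).  `active` maps every msgid of the new
    catalog to its carried-over translation, or to "" when no old entry
    matches (an untranslated entry).  `obsolete` keeps the old entries
    whose msgid no longer appears in the new catalog.
    """
    active = {msgid: old_translations.get(msgid, "") for msgid in new_msgids}
    obsolete = {
        msgid: msgstr
        for msgid, msgstr in old_translations.items()
        if msgid not in active
    }
    return active, obsolete
```

<P>
In this model, a message whose English text was edited shows up twice:
once in <CODE>active</CODE> with an empty translation, and once in
<CODE>obsolete</CODE> under its previous wording.
</P>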
<H2><A NAME="SEC24" HREF="gettext_toc.html#TOC24">Untranslated Entries</A></H2>

<P>
When <CODE>xgettext</CODE> originally creates a PO file, unless told
otherwise, it initializes the <CODE>msgid</CODE> field with the untranslated
string, and leaves the <CODE>msgstr</CODE> string empty.  Such entries,
having an empty translation, are said to be <STRONG>untranslated</STRONG> entries.
Later, when the programmer slightly modifies some string right in
the program, this change is reflected in the PO file
by the appearance of a new untranslated entry for the modified string.

</P>
<P>
The usual commands moving from entry to entry consider untranslated
entries on the same level as active entries.  Untranslated entries
are easily recognizable by the fact that they end with <SAMP>`msgstr ""'</SAMP>.

</P>
<P>
The work of the translator might be (quite naively) seen as the process
of seeking out an untranslated entry, editing a translation for
it, and repeating these actions until no untranslated entries remain.
Some commands are more specifically related to untranslated entry
processing.

</P>
<DL COMPACT>

<DT><KBD>e</KBD>
<DD>
Find the next untranslated entry.

<DT><KBD>M-e</KBD>
<DD>
Find the previous untranslated entry.

<DT><KBD>k</KBD>
<DD>
Turn the current entry into an untranslated one.

</DL>
<P>
The commands <KBD>e</KBD> (<CODE>po-next-empty-entry</CODE>) and <KBD>M-e</KBD>
(<CODE>po-previous-empty</CODE>) move forwards or backwards, searching for an
untranslated entry.  If none is found, the search is extended and wraps
around in the PO file buffer.

</P>
<P>
An entry can be turned back into an untranslated entry by
merely emptying its translation, using the command <KBD>k</KBD>
(<CODE>po-kill-msgstr</CODE>).  See section <A HREF="gettext.html#SEC26">Modifying Translations</A>.

</P>
<P>
Also, when the time comes to quit working on a PO file buffer
with the <KBD>q</KBD> command, the translator is asked for confirmation
if any untranslated string still exists.

</P>
<H2><A NAME="SEC25" HREF="gettext_toc.html#TOC25">Obsolete Entries</A></H2>

<P>
By <STRONG>obsolete</STRONG> PO file entries, we mean those entries which are
commented out, usually by <CODE>tupdate</CODE> when it finds that the
translation is no longer needed by the package being localized.

</P>
<P>
The usual commands moving from entry to entry consider obsolete
entries on the same level as active entries.  Obsolete entries are
easily recognizable by the fact that all their lines start with
<KBD>#</KBD>, even those lines containing <CODE>msgid</CODE> or <CODE>msgstr</CODE>.

</P>
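<P>
The two recognition rules, an entry ending with <SAMP>`msgstr ""'</SAMP>
being untranslated and an entry whose lines all start with <KBD>#</KBD>
being obsolete, can be illustrated with a small Python sketch.  The
function <CODE>classify_entry</CODE> is a hypothetical helper, not part
of PO mode, and it assumes that it is handed the text of exactly one
complete entry.
</P>

```python
def classify_entry(entry_text):
    """Classify one PO entry as 'obsolete', 'untranslated', or 'active'."""
    lines = [line for line in entry_text.splitlines() if line.strip()]
    # Obsolete: every line is commented out, msgid and msgstr included.
    if lines and all(line.startswith("#") for line in lines):
        return "obsolete"
    # Untranslated: the entry ends with an empty msgstr.
    if lines and lines[-1].strip() == 'msgstr ""':
        return "untranslated"
    return "active"
```

<P>
Note that an entry whose translation starts with <SAMP>`msgstr ""'</SAMP>
and continues on following quoted lines does not <EM>end</EM> with the
empty string, so this sketch treats it as active.
</P>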
<P>
Commands exist for emptying the translation or reinitializing it
to the original untranslated string.  Commands interfacing with the
kill ring may force some previously saved text into the translation.
The user may interactively edit the translation.  All these commands
may apply to obsolete entries, carefully leaving the entry obsolete
after the fact.

</P>
<P>
Moreover, some commands are more specifically related to obsolete
entry processing.

</P>
<DL COMPACT>

<DT><KBD>M-n</KBD>
<DD>
<DT><KBD>M-<KBD>SPC</KBD></KBD>
<DD>
Find the next obsolete entry.

<DT><KBD>M-p</KBD>
<DD>
<DT><KBD>M-<KBD>DEL</KBD></KBD>
<DD>
Find the previous obsolete entry.

<DT><KBD>z</KBD>
<DD>
Make an active entry obsolete, or zap out an obsolete entry.

</DL>
<P>
The commands <KBD>M-n</KBD> (<CODE>po-next-obsolete-entry</CODE>) and <KBD>M-p</KBD>
(<CODE>po-previous-obsolete-entry</CODE>) move forwards or backwards,
searching for an obsolete entry.  If none is found, the search is
extended and wraps around in the PO file buffer.  The commands
<KBD>M-<KBD>SPC</KBD></KBD> and <KBD>M-<KBD>DEL</KBD></KBD> are synonyms for <KBD>M-n</KBD>
and <KBD>M-p</KBD>, respectively.

</P>
<P>
PO mode does not provide a way to uncomment an obsolete entry
and make it active, because this would reintroduce an original
untranslated string which does not correspond to any marked string
in the program sources.  This goes with the philosophy of never
introducing useless <CODE>msgid</CODE> values.

</P>
<P>
However, it is possible to comment out an active entry, thus making
it obsolete.  GNU <CODE>gettext</CODE> utilities will later react to the
disappearance of a translation by using the untranslated string.
The command <KBD>z</KBD> (<CODE>po-fade-out-entry</CODE>) pushes the current entry
a little further towards annihilation.  If the entry is active, then
the entry is merely commented out.  If the entry is already obsolete,
then it is completely deleted from the PO file.  It is easy to recycle
the translation so deleted into some other PO file entry, usually
one which is untranslated.  See section <A HREF="gettext.html#SEC26">Modifying Translations</A>.

</P>
<P>
Here is a quite interesting problem to solve for later development of
PO mode, for those nights you are not sleepy.  The idea would be that
PO mode might become bright enough, one of these days, to make good
guesses at retrieving the most probable candidate, among all obsolete
entries, for initializing the translation of a newly appeared string.
I think it might be quite a hard problem to do this algorithmically, as
we have to develop good and efficient measures of string similarity.
Right now, PO mode leaves the decision entirely to the translator
when the time comes to find the adequate obsolete translation; it
merely tries to provide handy tools for helping her to do so.

</P>
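<P>
To make the idea of a similarity measure concrete, here is one possible
sketch in Python.  It is not something PO mode does.  It uses the
<CODE>SequenceMatcher</CODE> ratio from Python's standard
<CODE>difflib</CODE> module as a stand-in measure, and the function name
<CODE>rank_obsolete_candidates</CODE> is hypothetical.  Whether such a
ratio is good or efficient enough for real catalogs is exactly the open
question raised above.
</P>

```python
from difflib import SequenceMatcher


def rank_obsolete_candidates(new_msgid, obsolete_entries, limit=3):
    """Rank obsolete (msgid, msgstr) pairs by similarity to new_msgid.

    Returns up to `limit` tuples (score, old_msgid, old_msgstr), best
    first, where score is a ratio between 0.0 and 1.0.
    """
    scored = [
        (SequenceMatcher(None, new_msgid, old_msgid).ratio(), old_msgid, old_msgstr)
        for old_msgid, old_msgstr in obsolete_entries
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:limit]
```

<P>
The best candidate would only be a suggestion: the translator would still
have to review and adjust the recovered translation.
</P>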
<H2><A NAME="SEC26" HREF="gettext_toc.html#TOC26">Modifying Translations</A></H2>

<P>
PO mode prevents direct editing of the PO file by the usual
means Emacs gives for altering a buffer's contents.  By doing so,
it aims to help the translator avoid little clerical errors
about the overall file format, or the proper quoting of strings,
as those errors would be easily made.  Other kinds of errors are
still possible, but some may be caught and diagnosed by the batch
validation process, which the translator may always trigger by the
<KBD>v</KBD> command.  For all other errors, the translator has to rely on
her own judgment, and also on the linguistic reports submitted to her
by the users of the translated package who share the same mother tongue.

</P>
<P>
When the time comes to create a translation, or to correct an error diagnosed
mechanically or reported by a user, the translator has to resort to
using the following commands for modifying the translations.

</P>
<DL COMPACT>

<DT><KBD>RET</KBD>
<DD>
Interactively edit the translation.

<DT><KBD>TAB</KBD>
<DD>
Reinitialize the translation with the original, untranslated string.

<DT><KBD>k</KBD>
<DD>
Save the translation on the kill ring, and delete it.

<DT><KBD>w</KBD>
<DD>
Save the translation on the kill ring, without deleting it.

<DT><KBD>y</KBD>
<DD>
Replace the translation, taking the new one from the kill ring.

</DL>
<P>
The command <KBD>RET</KBD> (<CODE>po-edit-msgstr</CODE>) opens a new Emacs
window containing a copy of the translation taken from the current
PO file entry, ready for editing, fully modifiable
and with the complete extent of GNU Emacs modifying commands.
The string is presented to the translator stripped of all quoting
marks, and she will modify the <EM>unquoted</EM> string in this
window to her heart's content.  Once done, the regular Emacs command
<KBD>M-C-c</KBD> (<CODE>exit-recursive-edit</CODE>) may be used to return the
edited translation into the PO file, replacing the original
translation.  The keys <KBD>C-c C-c</KBD> are bound so they have the
same effect as <KBD>M-C-c</KBD>.

</P>
<P>
If the translator becomes unsatisfied with her translation to the
extent she prefers keeping the translation which existed prior to
the <KBD>RET</KBD> command, she may use the regular Emacs command <KBD>C-]</KBD>
(<CODE>abort-recursive-edit</CODE>) to merely discard the edit, while
preserving the original translation.  Another way would be for her
to exit normally with <KBD>C-c C-c</KBD>, then type <CODE>u</CODE> once for
undoing the whole effect of the last edit.

</P>
<P>
While editing her translation, the translator should take care
not to insert unwanted <KBD><KBD>RET</KBD></KBD> (carriage return) characters at
the end of the translated string if those are not meant to be there,
or to remove such characters when they are required.  Since these
characters are not visible in the editing buffer, they are easy to
introduce by mistake.  To help her, <KBD><KBD>RET</KBD></KBD> automatically puts
the character <KBD>&#60;</KBD> at the end of the string being edited, but this
<KBD>&#60;</KBD> is not really part of the string.  On exiting the editing
window with <KBD>C-c C-c</KBD>, PO mode automatically removes such
<KBD>&#60;</KBD> and all whitespace added after it.  If the translator adds
characters after the terminating <KBD>&#60;</KBD>, it loses its delimiting
property and becomes fully part of the string.  If she removes
the delimiting <KBD>&#60;</KBD>, then the edited string is taken <EM>as is</EM>,
with all trailing newlines, even if invisible.  Also, if the
translated string ought to end itself with a genuine <KBD>&#60;</KBD>, then the
delimiting <KBD>&#60;</KBD> may not be removed; so the string should appear,
in the editing window, as ending with two <KBD>&#60;</KBD> in a row.

</P>
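<P>
The rules for the delimiting <KBD>&#60;</KBD> can be summarized by the
following Python sketch.  The function <CODE>finish_edit</CODE> is a
hypothetical model of what happens when the editing window is closed; it
is written only from the description above and is not PO mode's own code.
</P>

```python
def finish_edit(buffer_text):
    """Return the translation stored when the edit window is closed."""
    # Whitespace typed after the delimiter does not count.
    stripped = buffer_text.rstrip()
    if stripped.endswith("<"):
        # The last '<' is the delimiter: drop it, keep everything before it,
        # including any newline that immediately precedes it.
        return stripped[:-1]
    # No delimiter at the end: the text is taken as is, with all its
    # trailing newlines, even if they were invisible on screen.
    return buffer_text
```

<P>
Under this model, a buffer ending with two <KBD>&#60;</KBD> in a row yields a
translation ending with a single genuine <KBD>&#60;</KBD>, and a
<KBD>&#60;</KBD> followed by other characters is kept as ordinary text.
</P>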
<P>
When a translation (or a comment) is being edited, the translator
may move the cursor back into the PO file buffer and freely
move to other entries, browsing at will.  The edited entry will
be recovered as soon as the edit ceases, because only this entry
is being modified.  If, with an edit still open, the
translator wanders in the PO file buffer, she cannot modify
any other entry.  If she tries to, PO mode will react by suggesting
that she abort the current edit, or else by inviting her to finish
the current edit prior to any other modification.

</P>
<P>
The command <KBD>TAB</KBD> (<CODE>po-msgid-to-msgstr</CODE>) initializes, or
reinitializes, the translation with the original string.  This command
is normally used when the translator wants to redo a fresh translation
of the original string, disregarding any previous work.

</P>
<P>
In fact, whether it is best to start a translation with an empty
string, or rather with a copy of the original string, is a matter of
taste or habit.  Sometimes, the source language and the
target language are so different that it is simply best to start writing
on an empty page.  At other times, the source and target languages
are so close that it would be a waste to retype a number of words
already written in the original string.  A translator may also
like having the original string right under her eyes, as she will
progressively overwrite the original text with the translation, even
if this requires some extra editing work to get rid of the original.

</P>
<P>
The command <KBD>k</KBD> (<CODE>po-kill-msgstr</CODE>) merely empties the
translation string, thus turning the entry into an untranslated
one.  But while doing so, its previous contents are set aside in
a special place, known as the kill ring.  The command <KBD>w</KBD>
(<CODE>po-kill-ring-save-msgstr</CODE>) also has the effect of taking a
copy of the translation onto the kill ring, but it otherwise leaves
the entry alone, and does <EM>not</EM> remove the translation from the
entry.  Both commands use exactly the Emacs kill ring, which is shared
between buffers, and which is well known already to GNU Emacs lovers.

</P>
<P>
The translator may use <KBD>k</KBD> or <KBD>w</KBD> many times in the course
of her work, as the kill ring may hold several saved translations.
From the kill ring, strings may later be reinserted in various
Emacs buffers.  In particular, the kill ring may be used for moving
translation strings between different entries of a single PO file
buffer, or if the translator is handling many such buffers at once,
even between PO files.

</P>
<P>
To facilitate exchanges with buffers which are not in PO mode, the
translation string put on the kill ring by the <KBD>k</KBD> command is fully
unquoted before being saved: external quotes are removed, multi-line
strings are concatenated, and backslash escape sequences are turned
into their corresponding characters.  In the special case of obsolete
entries, the translation is also uncommented prior to saving.

</P>
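<P>
As an illustration of this unquoting, the Python sketch below applies the
same steps to the lines of one <CODE>msgstr</CODE> field.  The name
<CODE>unquote_msgstr</CODE> is hypothetical, only a few common escape
sequences are handled, and the code is a simplified model rather than a
copy of what PO mode does.
</P>

```python
import re

# Assumption: only these escapes are decoded; any other backslashed
# character is kept without its backslash.
_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", '"': '"', "\\": "\\"}


def unquote_msgstr(quoted_lines):
    """Turn the quoted lines of a msgstr field into plain text."""
    parts = []
    for line in quoted_lines:
        line = line.strip()
        if line.startswith("#"):        # obsolete entry: uncomment first
            line = line[1:].strip()
        if line.startswith("msgstr"):   # drop the keyword on the first line
            line = line[len("msgstr"):].strip()
        body = line[1:-1]               # remove the external quotes
        parts.append(
            re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(1)), body)
        )
    return "".join(parts)               # multi-line strings are concatenated
```

<P>
Yanking a string back into an entry performs the reverse operations, so
the translator never has to type the quotes or backslashes herself.
</P>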
<P>
The command <KBD>y</KBD> (<CODE>po-yank-msgstr</CODE>) completely replaces the
translation of the current entry by a string taken from the kill ring.
Following GNU Emacs terminology, we then say that the replacement
string is <STRONG>yanked</STRONG> into the PO file buffer.
See section `Yanking' in <CITE>The Emacs Editor</CITE>.
The first time <KBD>y</KBD> is used, the translation receives the value of
the most recent addition to the kill ring.  If <KBD>y</KBD> is typed once
again, immediately, without intervening keystrokes, the translation
just inserted is taken away and replaced by the second most recent
addition to the kill ring.  By repeating <KBD>y</KBD> many times in a row,
the translator may travel along the kill ring for saved strings,
until she finds the string she really wanted.

</P>
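<P>
The way repeated <KBD>y</KBD> commands travel along the kill ring can be
pictured with a toy model in Python.  The class <CODE>KillRing</CODE> below
is purely illustrative: the real kill ring belongs to Emacs, and wrapping
around to the newest string after the oldest one is an assumption of this
sketch.
</P>

```python
class KillRing:
    """Toy kill ring: newest entry first, repeated yanks walk to older ones."""

    def __init__(self):
        self.entries = []
        self.position = 0

    def kill(self, text):
        """Save text as the most recent addition to the ring."""
        self.entries.insert(0, text)
        self.position = 0

    def yank(self, repeated=False):
        """Return the newest entry, or the next older one when repeated."""
        if not self.entries:
            raise LookupError("kill ring is empty")
        if repeated:
            # An immediately repeated yank replaces the previous choice.
            self.position = (self.position + 1) % len(self.entries)
        else:
            self.position = 0
        return self.entries[self.position]
```

<P>
A first <CODE>yank()</CODE> gives the most recent addition, and each
<CODE>yank(repeated=True)</CODE> after it gives the next older string.
</P>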
<P>
When a string is yanked into a PO file entry, it is fully and
automatically requoted to comply with the format PO files should
have.  Further, if the entry is obsolete, PO mode then appropriately
pushes the inserted string inside comments.  Once again, translators
should not burden themselves with quoting considerations beyond, of
course, what the translated string itself requires with respect to
the program using it.

</P>
<P>
Note that <KBD>k</KBD> and <KBD>w</KBD> are not the only commands pushing strings
on the kill ring, as almost any PO mode command replacing translation
strings (or the translator comments) automatically saves the old string
on the kill ring.  The main exceptions to this general rule are the
yanking commands themselves.

</P>
<P>
To better illustrate the operation of killing and yanking, let's
use an actual example, taken from a common situation.  When the
programmer slightly modifies some string right in the program, his
change is later reflected in the PO file by the appearance
of a new untranslated entry for the modified string, and the fact
that the entry translating the original or unmodified string becomes
obsolete.  In many cases, the translator might spare herself some work
by retrieving the unmodified translation from the obsolete entry,
then initializing the untranslated entry <CODE>msgstr</CODE> field with
this retrieved translation.  Once this is done, the obsolete entry is
not wanted anymore, and may be safely deleted.

</P>
<P>
When the translator finds an untranslated entry and suspects that a
slight variant of the translation exists, she immediately uses <KBD>m</KBD>
to mark the current entry location, then starts chasing obsolete
entries with <KBD>M-SPC</KBD>, hoping to find some translation corresponding
to the unmodified string.  Once found, she uses the <KBD>z</KBD> command
for deleting the obsolete entry, knowing that <KBD>z</KBD> also <EM>kills</EM>
the translation, that is, pushes the translation on the kill ring.
Then, <KBD>l</KBD> returns to the initial untranslated entry, <KBD>y</KBD>
then <EM>yanks</EM> the saved translation right into the <CODE>msgstr</CODE>
field.  The translator is then free to use <KBD><KBD>RET</KBD></KBD> for fine
tuning the translation contents, and maybe to later use <KBD>e</KBD>,
then <KBD>m</KBD> again, for going on with the next untranslated string.

</P>
<P>
When some sequence of keys has to be typed over and over again, the
translator may find it convenient to become more acquainted with the GNU
Emacs capability of learning these sequences and playing them back on
request.  See section `Keyboard Macros' in <CITE>The Emacs Editor</CITE>.

</P>
<H2><A NAME="SEC27" HREF="gettext_toc.html#TOC27">Modifying Comments</A></H2>

<P>
Any translation work done seriously will raise many linguistic
difficulties, for which decisions have to be made, and the choices
further documented.  These documents may be saved within the
PO file in the form of translator comments, which the translator
is free to create, delete, or modify at will.  These comments may
be useful to her when she returns to this PO file after a while.
Memory forgets!

</P>
<P>
These commands are somewhat similar to those modifying translations,
so the general indications given for those apply here.  See section <A HREF="gettext.html#SEC26">Modifying Translations</A>.

</P>
<DL COMPACT>

<DT><KBD>M-RET</KBD>
<DD>
Interactively edit the translator comments.

<DT><KBD>M-k</KBD>
<DD>
Save the translator comments on the kill ring, and delete them.

<DT><KBD>M-w</KBD>
<DD>
Save the translator comments on the kill ring, without deleting them.

<DT><KBD>M-y</KBD>
<DD>
Replace the translator comments, taking the new ones from the kill ring.

</DL>
<P>
Those commands parallel PO mode commands for modifying the translation
strings, and behave in much the same way, except that they handle
the part of PO file comments meant for translator usage, rather
than the translation strings.  So, the descriptions given below are
rather succinct, because the full details have already been given.
See section <A HREF="gettext.html#SEC26">Modifying Translations</A>.

</P>
<P>
The command <KBD>M-RET</KBD> (<CODE>po-edit-comment</CODE>) opens a new Emacs
window containing a copy of the translator comments of the current
PO file entry.  If there are no such comments, PO mode
understands that the translator wants to add a comment to the entry,
and she is presented an empty screen.  Comment marks (<KBD>#</KBD>) and
the space following them are automatically removed before editing,
and reinstated after.  For translator comments pertaining to obsolete
entries, the uncommenting and recommenting operations are done twice.
The command <KBD>#</KBD> also has the same effect as <KBD>M-RET</KBD>, and might
be easier to type.  Once in the editing window, the keys <KBD>C-c C-c</KBD>
allow the translator to signal that she is finished with editing
the comment.

</P>
<P>
The command <KBD>M-k</KBD> (<CODE>po-kill-comment</CODE>) gets rid of all
translator comments, while saving those comments on the kill ring.
The command <KBD>M-w</KBD> (<CODE>po-kill-ring-save-comment</CODE>) takes
a copy of the translator comments on the kill ring, but leaves
them undisturbed in the current entry.  The command <KBD>M-y</KBD>
(<CODE>po-yank-comment</CODE>) completely replaces the translator comments
by a string taken from the front of the kill ring.  When this command
is immediately repeated, the comments just inserted are withdrawn,
and replaced by other strings taken along the kill ring.

</P>
<P>
On the kill ring, all strings have the same nature.  There is no
distinction between <EM>translation</EM> strings and <EM>translator
comments</EM> strings.  So, for example, let's presume the translator
has just finished editing a translation, and wants to create a new
translator comment documenting why the previous translation was
not good, just to remember what the problem was.  Foreseeing that she
will do that in her documentation, the translator will want to quote
the previous translation in her translator comments.  For doing so, she
may initialize the translator comments with the previous translation,
still at the head of the kill ring.  Because editing already pushed the
previous translation on the kill ring, she just has to type <KBD>M-y</KBD>
prior to <KBD>#</KBD>, and the previous translation will be right there,
all ready for being introduced by some explanatory text.

</P>
<P>
 | 
						|
On the other hand, presume there are some translator comments already
 | 
						|
and that the translator wants to add to those comments, instead
 | 
						|
of wholly replacing them.  Then, she should edit the comment right
 | 
						|
away with <KBD>#</KBD>.  Once inside the editing window, she can use the
 | 
						|
regular GNU Emacs commands <KBD>C-y</KBD> (<CODE>yank</CODE>) and <KBD>M-y</KBD>
 | 
						|
(<CODE>yank-pop</CODE>) for getting the previous translation where she likes.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC28" HREF="gettext_toc.html#TOC28">Consulting Auxiliary PO Files</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
An upcoming feature of PO mode should help the knowledgeable translator
to take advantage of translations already achieved in other languages
she just happens to know, by providing these other-language translations
 | 
						|
as additional context for her own work.  Each PO file existing for
 | 
						|
the same package the translator is working on, but targeted to a
 | 
						|
different mother tongue language, is called an <STRONG>auxiliary</STRONG> PO file.
 | 
						|
Commands will exist for declaring and handling auxiliary PO files,
 | 
						|
and also for showing contexts for the entry under work.  For this to
 | 
						|
work fully, all auxiliary PO files will have to be normalized.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H1><A NAME="SEC29" HREF="gettext_toc.html#TOC29">Producing Binary MO Files</A></H1>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC30" HREF="gettext_toc.html#TOC30">Invoking the <CODE>msgfmt</CODE> Program</A></H2>
 | 
						|
 | 
						|
 | 
						|
<PRE>
 | 
						|
Usage: msgfmt [<VAR>option</VAR>] <VAR>filename</VAR>.po ...
 | 
						|
</PRE>
 | 
						|
 | 
						|
<DL COMPACT>
 | 
						|
 | 
						|
<DT><SAMP>`-a <VAR>number</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--alignment=<VAR>number</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
Align strings to <VAR>number</VAR> bytes (default: 1).
 | 
						|
 | 
						|
<DT><SAMP>`-h'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--help'</SAMP>
 | 
						|
<DD>
 | 
						|
Display this help and exit.
 | 
						|
 | 
						|
<DT><SAMP>`-I <VAR>list</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--input-path=<VAR>list</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
List of directories searched for input files.
 | 
						|
 | 
						|
<DT><SAMP>`--no-hash'</SAMP>
 | 
						|
<DD>
 | 
						|
Binary file will not include the hash table.
 | 
						|
 | 
						|
<DT><SAMP>`-o <VAR>file</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--output-file=<VAR>file</VAR>'</SAMP>
 | 
						|
<DD>
 | 
						|
Specify output file name as <VAR>file</VAR>.
 | 
						|
 | 
						|
<DT><SAMP>`-v'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--verbose'</SAMP>
 | 
						|
<DD>
 | 
						|
Detect and diagnose input file anomalies which might represent
 | 
						|
translation errors.  The <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings are
 | 
						|
studied and compared.  It is considered abnormal that one string
 | 
						|
starts or ends with a newline while the other does not.  Also, both
 | 
						|
strings should have the same number of <SAMP>`%'</SAMP> format specifiers,
 | 
						|
with matching types.  For example, the check will diagnose using
 | 
						|
<SAMP>`%.*s'</SAMP> against <SAMP>`%s'</SAMP>, or <SAMP>`%d'</SAMP> against <SAMP>`%s'</SAMP>, or
 | 
						|
<SAMP>`%d'</SAMP> against <SAMP>`%x'</SAMP>.  It can even handle positional parameters.
 | 
						|
 | 
						|
<DT><SAMP>`-V'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--version'</SAMP>
 | 
						|
<DD>
 | 
						|
Output version information and exit.
 | 
						|
 | 
						|
</DL>
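<P>
The newline comparison described for <SAMP>`--verbose'</SAMP> can be sketched
in C as follows.  This is only an illustration under stated assumptions, not
the code <CODE>msgfmt</CODE> itself uses: the function name
<CODE>newline_mismatch</CODE> is hypothetical, and a real checker would
presumably skip untranslated entries before comparing.
</P>

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not taken from msgfmt.  Returns true when exactly
   one of the two strings starts with a newline, or when exactly one of
   them ends with a newline.  */
bool
newline_mismatch (const char *msgid, const char *msgstr)
{
  assert (msgid != NULL && msgstr != NULL);

  size_t id_len = strlen (msgid);
  size_t str_len = strlen (msgstr);

  bool id_starts = id_len > 0 && msgid[0] == '\n';
  bool str_starts = str_len > 0 && msgstr[0] == '\n';
  bool id_ends = id_len > 0 && msgid[id_len - 1] == '\n';
  bool str_ends = str_len > 0 && msgstr[str_len - 1] == '\n';

  return id_starts != str_starts || id_ends != str_ends;
}
```

<P>
The format specifier comparison is more involved and is deliberately left
out of this sketch.
</P>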
 | 
						|
 | 
						|
<P>
 | 
						|
If input file is <SAMP>`-'</SAMP>, standard input is read.  If output file
 | 
						|
is <SAMP>`-'</SAMP>, output is written to standard output.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The search path for <CODE>msgfmt</CODE> is <TT>`/usr/local/share/nls/src/'</TT>,
 | 
						|
by default.  It represents the path to additional directories where
 | 
						|
other PO files can be found.  This feature could be used for some
 | 
						|
PO files for standard libraries, in case we would like to spare
 | 
						|
translating their strings over and over again.  The <SAMP>`-x'</SAMP> option
 | 
						|
could then exclude these strings from the generation.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC31" HREF="gettext_toc.html#TOC31">The Format of GNU MO Files</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
The format of the generated MO files is best described by a picture,
 | 
						|
which appears below.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The first two words serve to identify the file.  The magic
 | 
						|
number will always signal GNU MO files.  The number is stored in the
 | 
						|
byte order of the generating machine, so the magic number really is
 | 
						|
two numbers: <CODE>0x950412de</CODE> and <CODE>0xde120495</CODE>.  The second
 | 
						|
word describes the current revision of the file format.  For now the
 | 
						|
revision is 0.  This might change in future versions, and ensures
 | 
						|
that the readers of MO files can distinguish new formats from old
 | 
						|
ones, so that both can be handled correctly.  The version is kept
 | 
						|
separate from the magic number, instead of using different magic
 | 
						|
numbers for different formats, mainly because <TT>`/etc/magic'</TT> is
 | 
						|
not updated often.  It might be better to have magic separated from
 | 
						|
internal format version identification.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Next follow a number of pointers to later tables in the file, allowing
for the extension of the prefix part of MO files without having to
recompile programs reading them.  This might become useful for later
inserting a few flag bits, an indication of the charset used, new
 | 
						|
tables, or other things.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Then, at offset <VAR>O</VAR> and offset <VAR>T</VAR> in the picture, two tables
 | 
						|
of string descriptors can be found.  In both tables, each string
 | 
						|
descriptor uses two 32-bit integers, one for the string length,
 | 
						|
another for the offset of the string in the MO file, counting in bytes
 | 
						|
from the start of the file.  The first table contains descriptors
 | 
						|
for the original strings, and is sorted so the original strings
 | 
						|
are in increasing lexicographical order.  The second table contains
 | 
						|
descriptors for the translated strings, and is parallel to the first
 | 
						|
table: to find the corresponding translation one has to access the
 | 
						|
array slot in the second array with the same index.
 | 
						|
 | 
						|
</P>
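<P>
As an illustration of the descriptor layout, the sketch below fetches the
string described by one slot of either table from an MO file already loaded
into memory.  It assumes a little-endian file and performs no bounds
checking; the names <CODE>mo_read_u32</CODE> and <CODE>mo_string_at</CODE>
are hypothetical.
</P>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit integer stored little-endian at OFFSET (assumption: the
   file was generated on a little-endian machine).  */
uint32_t
mo_read_u32 (const unsigned char *image, size_t offset)
{
  return (uint32_t) image[offset]
         | ((uint32_t) image[offset + 1] << 8)
         | ((uint32_t) image[offset + 2] << 16)
         | ((uint32_t) image[offset + 3] << 24);
}

/* Return a pointer to the string described by slot INDEX of the descriptor
   table starting at TABLE_OFFSET (O or T in the picture), and store its
   length, not counting the terminating NUL, in *LENGTH.  */
const char *
mo_string_at (const unsigned char *image, uint32_t table_offset,
              uint32_t index, uint32_t *length)
{
  assert (image != NULL && length != NULL);

  /* Each descriptor is two 32-bit words: length, then file offset.  */
  size_t slot = (size_t) table_offset + (size_t) index * 8;

  *length = mo_read_u32 (image, slot);
  return (const char *) image + mo_read_u32 (image, slot + 4);
}
```

<P>
Calling the function with the same index on the table at <VAR>O</VAR> and on
the table at <VAR>T</VAR> yields an original string and its translation.
</P>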
 | 
						|
<P>
 | 
						|
Having the original strings sorted enables the use of simple binary
 | 
						|
search, for when the MO file does not contain a hashing table, or
 | 
						|
for when it is not practical to use the hashing table provided in
 | 
						|
the MO file.  This also has another advantage, as the empty string
in a PO file is usually <EM>translated</EM> by GNU <CODE>gettext</CODE> into
 | 
						|
some system information attached to that particular MO file, and the
 | 
						|
empty string necessarily becomes the first in both the original and
 | 
						|
translated tables, making the system information very easy to find.
 | 
						|
 | 
						|
</P>
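<P>
The binary search over the sorted original strings could look roughly like
the following sketch.  It assumes a little-endian MO file held entirely in
memory, strings without embedded <KBD>NUL</KBD>s, and a sort order that
agrees with <CODE>strcmp</CODE>; nothing is validated.  The names
<CODE>mo_u32</CODE> and <CODE>mo_lookup</CODE> are hypothetical, and this is
not the lookup code of GNU <CODE>gettext</CODE>.
</P>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian integer at OFFSET (assumed file byte order).  */
uint32_t
mo_u32 (const unsigned char *image, size_t offset)
{
  return (uint32_t) image[offset]
         | ((uint32_t) image[offset + 1] << 8)
         | ((uint32_t) image[offset + 2] << 16)
         | ((uint32_t) image[offset + 3] << 24);
}

/* Return the translation of MSGID, or NULL when MSGID is not present.  */
const char *
mo_lookup (const unsigned char *image, const char *msgid)
{
  assert (image != NULL && msgid != NULL);

  uint32_t n = mo_u32 (image, 8);           /* number of strings, N */
  uint32_t originals = mo_u32 (image, 12);  /* offset O */
  uint32_t translations = mo_u32 (image, 16);   /* offset T */

  uint32_t low = 0, high = n;               /* search in [low, high) */
  while (low < high)
    {
      uint32_t mid = low + (high - low) / 2;
      /* Second word of the descriptor is the offset of the string.  */
      const char *original = (const char *) image
                             + mo_u32 (image, originals + (size_t) mid * 8 + 4);
      int cmp = strcmp (msgid, original);

      if (cmp == 0)   /* same index in the parallel translation table */
        return (const char *) image
               + mo_u32 (image, translations + (size_t) mid * 8 + 4);
      if (cmp < 0)
        high = mid;
      else
        low = mid + 1;
    }
  return NULL;
}
```

<P>
With such a function, <CODE>mo_lookup (image, "")</CODE> would return the
system information attached to the file, since the empty string sorts first.
</P>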
 | 
						|
<P>
 | 
						|
The size <VAR>S</VAR> of the hash table can be zero.  In this case, the
 | 
						|
hash table itself is not contained in the MO file.  Some people might
 | 
						|
prefer this because a precomputed hashing table takes disk space, and
 | 
						|
does not win <EM>that</EM> much speed.  The hash table contains indices
 | 
						|
to the sorted array of strings in the MO file.  Conflict resolution is
 | 
						|
done by double hashing.  The precise hashing algorithm used is fairly
 | 
						|
dependent on GNU <CODE>gettext</CODE> code, and is not documented here.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
As for the strings themselves, they follow the hash table, and each
 | 
						|
is terminated with a <KBD>NUL</KBD>, and this <KBD>NUL</KBD> is not counted in
 | 
						|
the length which appears in the string descriptor.  The <CODE>msgfmt</CODE>
 | 
						|
program has an option selecting the alignment for MO file strings.
 | 
						|
With this option, each string is separately aligned so it starts at
 | 
						|
an offset which is a multiple of the alignment value.  On some RISC
 | 
						|
machines, a correct alignment will speed things up.
 | 
						|
 | 
						|
</P>
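<P>
The effect of the alignment option on a string's starting offset can be
sketched as a rounding-up operation.  The function name
<CODE>mo_align_offset</CODE> is hypothetical; the sketch only shows the
arithmetic, not how <CODE>msgfmt</CODE> lays out the file.
</P>

```c
#include <assert.h>
#include <stddef.h>

/* Round OFFSET up to the next multiple of ALIGNMENT.  An alignment of 1,
   the documented default, leaves every offset unchanged.  */
size_t
mo_align_offset (size_t offset, size_t alignment)
{
  assert (alignment > 0);

  size_t remainder = offset % alignment;
  return remainder == 0 ? offset : offset + (alignment - remainder);
}
```

<P>
For example, with an alignment of 4 a string that would otherwise start at
offset 13 starts at offset 16 instead.
</P>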
 | 
						|
<P>
 | 
						|
Nothing prevents an MO file from having embedded <KBD>NUL</KBD>s in strings.
 | 
						|
However, the program interface currently used already presumes
 | 
						|
that strings are <KBD>NUL</KBD> terminated, so embedded <KBD>NUL</KBD>s are
 | 
						|
somewhat useless.  But the MO file format is general enough that other
interfaces would be possible later if, for example, we ever want to
implement wide characters right in MO files, where <KBD>NUL</KBD> bytes may
accidentally appear.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
This particular issue has been strongly debated in the GNU
 | 
						|
<CODE>gettext</CODE> development forum, and it is to be expected that the MO file
 | 
						|
format will evolve or change over time.  It is even possible that many
 | 
						|
formats may later be supported concurrently.  But surely, we have to
 | 
						|
start somewhere, and the MO file format described here is a good start.
 | 
						|
Nothing is cast in concrete, and the format may later evolve fairly
 | 
						|
easily, so we should feel comfortable with the current approach.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
        byte
 | 
						|
             +------------------------------------------+
 | 
						|
          0  | magic number = 0x950412de                |
 | 
						|
             |                                          |
 | 
						|
          4  | file format revision = 0                 |
 | 
						|
             |                                          |
 | 
						|
          8  | number of strings                        |  == N
 | 
						|
             |                                          |
 | 
						|
         12  | offset of table with original strings    |  == O
 | 
						|
             |                                          |
 | 
						|
         16  | offset of table with translation strings |  == T
 | 
						|
             |                                          |
 | 
						|
         20  | size of hashing table                    |  == S
 | 
						|
             |                                          |
 | 
						|
         24  | offset of hashing table                  |  == H
 | 
						|
             |                                          |
 | 
						|
             .                                          .
 | 
						|
             .    (possibly more entries later)         .
 | 
						|
             .                                          .
 | 
						|
             |                                          |
 | 
						|
          O  | length & offset 0th string  ----------------.
 | 
						|
      O + 8  | length & offset 1st string  ------------------.
 | 
						|
              ...                                    ...   | |
 | 
						|
O + ((N-1)*8)| length & offset (N-1)th string           |  | |
 | 
						|
             |                                          |  | |
 | 
						|
          T  | length & offset 0th translation  ---------------.
 | 
						|
      T + 8  | length & offset 1st translation  -----------------.
 | 
						|
              ...                                    ...   | | | |
 | 
						|
T + ((N-1)*8)| length & offset (N-1)th translation      |  | | | |
 | 
						|
             |                                          |  | | | |
 | 
						|
          H  | start hash table                         |  | | | |
 | 
						|
              ...                                    ...   | | | |
 | 
						|
  H + S * 4  | end hash table                           |  | | | |
 | 
						|
             |                                          |  | | | |
 | 
						|
             | NUL terminated 0th string  <----------------' | | |
 | 
						|
             |                                          |    | | |
 | 
						|
             | NUL terminated 1st string  <------------------' | |
 | 
						|
             |                                          |      | |
 | 
						|
              ...                                    ...       | |
 | 
						|
             |                                          |      | |
 | 
						|
             | NUL terminated 0th translation  <---------------' |
 | 
						|
             |                                          |        |
 | 
						|
             | NUL terminated 1st translation  <-----------------'
 | 
						|
             |                                          |
 | 
						|
              ...                                    ...
 | 
						|
             |                                          |
 | 
						|
             +------------------------------------------+
 | 
						|
</PRE>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H1><A NAME="SEC32" HREF="gettext_toc.html#TOC32">The User's View</A></H1>
 | 
						|
 | 
						|
<P>
 | 
						|
When GNU <CODE>gettext</CODE> has truly reached its goal, average users
 | 
						|
should feel some kind of astonished pleasure, seeing the effect of
 | 
						|
that strange kind of magic that just makes their own native language
 | 
						|
appear everywhere on their screens.  As for naive users, they would
 | 
						|
ideally have no special pleasure about it, merely taking their own
 | 
						|
language for <EM>granted</EM>, and becoming rather unhappy otherwise.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
So, let's try to describe here how we would like the magic to operate,
 | 
						|
as we want the users' view to be the simplest, among all ways one
 | 
						|
could look at GNU <CODE>gettext</CODE>.  All other software engineers:
 | 
						|
programmers, translators, maintainers, should work together in such a
 | 
						|
way that the magic becomes possible.  This is a long and progressive
 | 
						|
undertaking, and information is available about the progress of the
 | 
						|
GNU Translation Project.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
When a package is distributed, there are two kinds of users:
 | 
						|
<STRONG>installers</STRONG> who fetch the distribution, unpack it, configure
 | 
						|
it, compile it and install it for themselves or others to use; and
 | 
						|
<STRONG>end users</STRONG> who call programs of the package, once these have
 | 
						|
been installed at their site.  GNU <CODE>gettext</CODE> is offering magic
 | 
						|
for both installers and end users.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC33" HREF="gettext_toc.html#TOC33">The Current <TT>`NLS'</TT> Matrix for GNU</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
Languages are not equally supported in all GNU packages.  To know
 | 
						|
if some GNU package uses GNU <CODE>gettext</CODE>, one may check
 | 
						|
the distribution for the <TT>`NLS'</TT> information file, for some
 | 
						|
<TT>`<VAR>ll</VAR>.po'</TT> files, often kept together into some <TT>`po/'</TT>
 | 
						|
directory, or for an <TT>`intl/'</TT> directory.  Internationalized
 | 
						|
packages usually have many <TT>`<VAR>ll</VAR>.po'</TT> files, where <VAR>ll</VAR>
represents the language.  See section <A HREF="gettext.html#SEC35">Magic for End Users</A> for a complete description
 | 
						|
of the format for <VAR>ll</VAR>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
More generally, a matrix is available for showing the current state
 | 
						|
of GNU internationalization, listing which packages are prepared
 | 
						|
for multi-lingual messages, and which languages are supported by each.
 | 
						|
Because this information changes often, this matrix is not kept within
 | 
						|
this GNU <CODE>gettext</CODE> manual.  This information is often found in
 | 
						|
file <TT>`NLS'</TT> from various GNU distributions, but is also as old
 | 
						|
as the distribution itself.  A recent copy of this <TT>`NLS'</TT> file,
 | 
						|
containing up-to-date information, should generally be found on most
 | 
						|
GNU archive sites.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC34" HREF="gettext_toc.html#TOC34">Magic for Installers</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
By default, packages fully using GNU <CODE>gettext</CODE>, internally,
 | 
						|
are installed in such a way that they allow translation of
 | 
						|
messages.  At <EM>configuration</EM> time, those packages should
 | 
						|
automatically detect whether the underlying host system provides usable
 | 
						|
<CODE>catgets</CODE> or <CODE>gettext</CODE> functions.  If neither is present,
 | 
						|
the GNU <CODE>gettext</CODE> library should be automatically prepared
 | 
						|
and used.  Installers may use special options at configuration
 | 
						|
time for changing this behavior.  The command <SAMP>`./configure
 | 
						|
--with-gnu-gettext'</SAMP> bypasses system <CODE>catgets</CODE> or <CODE>gettext</CODE> to
 | 
						|
use GNU <CODE>gettext</CODE> instead, while <SAMP>`./configure --disable-nls'</SAMP>
 | 
						|
produces programs totally unable to translate messages.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Internationalized packages usually have many <TT>`<VAR>ll</VAR>.po'</TT>
 | 
						|
files.  Unless
 | 
						|
translations are disabled, all those available are installed together
 | 
						|
with the package.  However, the environment variable <CODE>LINGUAS</CODE>
 | 
						|
may be set, prior to configuration, to limit the installed set.
 | 
						|
<CODE>LINGUAS</CODE> should then contain a space separated list of two-letter
 | 
						|
codes, stating which languages are allowed.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC35" HREF="gettext_toc.html#TOC35">Magic for End Users</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
We consider here those packages using GNU <CODE>gettext</CODE> internally,
 | 
						|
and for which the installers did not disable translation at
 | 
						|
<EM>configure</EM> time.  Then, users only have to set the <CODE>LANG</CODE>
 | 
						|
environment variable to the appropriate <SAMP>`<VAR>ll</VAR>'</SAMP> prior to
 | 
						|
using the programs in the package.  See section <A HREF="gettext.html#SEC33">The Current <TT>`NLS'</TT> Matrix for GNU</A>.  For example,
 | 
						|
let's presume a German site.  At the shell prompt, users merely have to
 | 
						|
execute <SAMP>`setenv LANG de'</SAMP> (in <CODE>csh</CODE>) or <SAMP>`export
 | 
						|
LANG; LANG=de'</SAMP> (in <CODE>sh</CODE>).  They could even do this from their
 | 
						|
<TT>`.login'</TT> or <TT>`.profile'</TT> file.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H1><A NAME="SEC36" HREF="gettext_toc.html#TOC36">The Programmer's View</A></H1>
 | 
						|
 | 
						|
<P>
 | 
						|
One aim of the current message catalog implementation provided by
 | 
						|
GNU <CODE>gettext</CODE> was to use the system's message catalog handling, if the
 | 
						|
installer wishes to do so.  So we perhaps should first take a look at
 | 
						|
the solutions we know about.  The people in the POSIX committee did not
manage to agree on one of the semi-official standards which we'll
describe below.  In fact they couldn't agree on anything, so they
decided only to include an example of an interface.  The major Unix vendors
are split in the usage of the two most important specifications: X/Open's
<CODE>catgets</CODE> vs. Uniforum's <CODE>gettext</CODE> interface.  We'll describe them both and
later explain our solution to this dilemma.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC37" HREF="gettext_toc.html#TOC37">About <CODE>catgets</CODE></A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
The <CODE>catgets</CODE> implementation is defined in the X/Open Portability
 | 
						|
Guide, Volume 3, XSI Supplementary Definitions, Chapter 5.  But the
 | 
						|
process of creating this standard seemed to be too slow for some of
 | 
						|
the Unix vendors so they created their implementations on preliminary
 | 
						|
versions of the standard.  Of course this leads again to problems while
 | 
						|
writing platform independent programs: even the usage of <CODE>catgets</CODE>
 | 
						|
does not guarantee a unique interface.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Another, personal comment on this is that only a bunch of committee members
 | 
						|
could have made this interface.  They never really tried to program
 | 
						|
using this interface.  It is a fast, memory-saving implementation, and a
user can happily live with it.  But programmers hate it (at least me and
 | 
						|
some others do...)
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
But we must not forget one point: after all the trouble with transferring
the rights on Unix(tm) they at last came to X/Open, the very same who
published this specification.  This leads me to making the prediction
that this interface will be in future Unix standards (e.g. Spec1170) and
therefore part of all Unix implementations (implementations which are
 | 
						|
<EM>allowed</EM> to wear this name).
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC38" HREF="gettext_toc.html#TOC38">The Interface</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
The interface to the <CODE>catgets</CODE> implementation consists of three
 | 
						|
functions which correspond to those used in file access: <CODE>catopen</CODE>
 | 
						|
to open the catalog for using, <CODE>catgets</CODE> for accessing the message
 | 
						|
tables, and <CODE>catclose</CODE> for closing after work is done.  Prototypes
 | 
						|
for the functions and the needed definitions are in the
 | 
						|
<CODE>&lt;nl_types.h&gt;</CODE> header file.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
<CODE>catopen</CODE> is used like this:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
nl_catd catd = catopen ("catalog_name", 0);
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
The function takes as its first argument the name of the catalog.  This usually
 | 
						|
refers to the name of the program or the package.  The second parameter
 | 
						|
is not further specified in the standard.  I don't even know whether it
 | 
						|
is implemented consistently among various systems.  So the common advice
 | 
						|
is to use <CODE>0</CODE> as the value.  The return value is a handle to the
 | 
						|
message catalog, comparable to the file handles returned by <CODE>open</CODE>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
This handle is of course used in the <CODE>catgets</CODE> function which can
 | 
						|
be used like this:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
char *translation = catgets (catd, set_no, msg_id, "original string");
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
The first parameter is this catalog descriptor.  The second parameter
 | 
						|
specifies the set of messages in this catalog, from which the message
described by <CODE>msg_id</CODE> is obtained.  <CODE>catgets</CODE> therefore uses a
 | 
						|
three-stage addressing:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
catalog name => set number => message ID => translation
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
The fourth argument is not used to address the translation.  It is given
 | 
						|
as a default value in case one of the addressing stages fails.  One
important thing to remember is that although the return type of <CODE>catgets</CODE>
is <CODE>char *</CODE> the resulting string <EM>must not</EM> be changed.  It
would better be <CODE>const char *</CODE>, but the standard was published in
1988, one year before ANSI C.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The last of these functions is used and behaves as expected:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
catclose (catd);
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
After this no <CODE>catgets</CODE> call using the descriptor is legal anymore.
 | 
						|
 | 
						|
</P>
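<P>
Put together, the three calls might be combined as in the sketch below.  The
helper <CODE>lookup_message</CODE> and its parameters are hypothetical.  The
message is copied into a caller-supplied buffer before the catalog is
closed, on the assumption that a pointer returned by <CODE>catgets</CODE>
should not be relied upon once its catalog has been closed.
</P>

```c
#include <assert.h>
#include <nl_types.h>
#include <string.h>

/* Hypothetical helper: look up one message and copy it into OUT.
   FALLBACK is used when the catalog or the message cannot be found.  */
void
lookup_message (const char *catalog_name, int set_no, int msg_id,
                const char *fallback, char *out, size_t out_size)
{
  assert (out != NULL && out_size > 0 && fallback != NULL);

  nl_catd catd = catopen (catalog_name, 0);

  /* catopen signals failure with (nl_catd) -1; in that case the default
     string is used directly, without calling catgets.  */
  const char *text = (catd == (nl_catd) -1)
                     ? fallback
                     : catgets (catd, set_no, msg_id, fallback);

  /* Copy before closing, and never modify the string catgets returned.  */
  strncpy (out, text, out_size - 1);
  out[out_size - 1] = '\0';

  if (catd != (nl_catd) -1)
    catclose (catd);
}
```

<P>
When no catalog of the given name is installed, the buffer simply receives
the default string, which is the behaviour the fourth argument of
<CODE>catgets</CODE> exists to provide.
</P>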
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC39" HREF="gettext_toc.html#TOC39">Problems with the <CODE>catgets</CODE> Interface?!</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
Now that this description seemed so easy, where are the
problems we speak of?  In fact the interface could be used in a
 | 
						|
reasonable way, but constructing the message catalogs is a pain.  The
 | 
						|
reason for this lies in the third argument of <CODE>catgets</CODE>: the unique
 | 
						|
message ID.  This has to be a numeric value for all messages in a single
 | 
						|
set.  Perhaps you can imagine the problems of keeping such a list while
changing the source code.  Add a new message here, remove one there.  Of
course a lot of tools have been developed to help organize this
chaos, but each of them fails in one aspect or another.  We don't
want to say that the other approach has no problems, but they are far
easier to manage.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC40" HREF="gettext_toc.html#TOC40">About <CODE>gettext</CODE></A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
The definition of the <CODE>gettext</CODE> interface comes from a Uniforum
 | 
						|
proposal and it is followed by at least one major Unix vendor
 | 
						|
(Sun) in its latest developments.  It is not specified in any official
 | 
						|
standard, though.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The main points about this solution are that it does not follow the
method of normal file handling (open-use-close) and that it does not
burden the programmer with so many tasks, especially the unique key handling.
Of course a unique key is needed here too, but this key is the
message itself (however long or short it is).  See section <A HREF="gettext.html#SEC45">Comparing the Two Interfaces</A> for a
 | 
						|
more detailed comparison of the two methods.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The following section contains a rather detailed description of the
 | 
						|
interface.  We make it that detailed because this is the interface
 | 
						|
we chose for the GNU <CODE>gettext</CODE> Library.  Programmers interested
 | 
						|
in using this library will be interested in this description.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC41" HREF="gettext_toc.html#TOC41">The Interface</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
The minimal functionality an interface must have is a) to select a
 | 
						|
domain the strings are coming from (a single domain for all programs is
 | 
						|
not reasonable because its construction and maintenance is difficult,
 | 
						|
perhaps impossible) and b) to access a string in a selected domain.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
This is principally the description of the <CODE>gettext</CODE> interface.  It
 | 
						|
has a global domain which unqualified usages reference.  Of course this
 | 
						|
domain is selectable by the user.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
char *textdomain (const char *domain_name);
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
This provides the possibility to change or query the current status of
 | 
						|
the current global domain of the <CODE>LC_MESSAGES</CODE> category.  The
argument is a null-terminated string, whose characters must be legal
in filenames.  If the <VAR>domain_name</VAR> argument is <CODE>NULL</CODE>,
the function returns the current value.  If no value has been set
 | 
						|
before, the name of the default domain is returned: <EM>messages</EM>.
 | 
						|
Please note that although the return value of <CODE>textdomain</CODE> is of
 | 
						|
type <CODE>char *</CODE> the string must not be modified.  It is also important to know
 | 
						|
that no checks of the availability are made.  If the name is not
 | 
						|
available you will see this by the fact that no translations are provided.
 | 
						|
 | 
						|
</P>
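<P>
A small sketch of querying the domain follows.  The helper
<CODE>current_domain</CODE> is hypothetical; it copies the name instead of
handing out the pointer, so that callers are not tempted to modify the
string owned by the library.
</P>

```c
#include <assert.h>
#include <libintl.h>
#include <string.h>

/* Hypothetical helper: copy the name of the current global domain
   into OUT without changing it.  */
void
current_domain (char *out, size_t out_size)
{
  assert (out != NULL && out_size > 0);

  /* A NULL argument only queries; nothing is changed.  */
  const char *name = textdomain (NULL);

  strncpy (out, name != NULL ? name : "", out_size - 1);
  out[out_size - 1] = '\0';
}
```

<P>
Before any domain has been set, the buffer is expected to hold
<SAMP>`messages'</SAMP>; after a call such as
<CODE>textdomain ("hello-sketch")</CODE>, with a made-up domain name, it
would hold that name instead, whether or not a catalog for it exists.
</P>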
 | 
						|
<P>
 | 
						|
To use a domain set by <CODE>textdomain</CODE> the function
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
char *gettext (const char *msgid);
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
is to be used.  This is the simplest reasonable form one can imagine.
 | 
						|
The translation of the string <VAR>msgid</VAR> is returned if it is available
 | 
						|
in the current domain.  If not available the argument itself is
 | 
						|
returned.  If the argument is <CODE>NULL</CODE> the result is undefined.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
One thing which should come to mind is that no explicit dependency on
the domain used is given.  The current value of the domain for the
 | 
						|
<CODE>LC_MESSAGES</CODE> locale is used.  If this changes between two
 | 
						|
executions of the same <CODE>gettext</CODE> call in the program, both calls
 | 
						|
reference a different message catalog.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
For the easiest case, which is normally used in internationalized GNU
 | 
						|
packages, once at the beginning of execution a call to <CODE>textdomain</CODE>
 | 
						|
is issued, setting the domain to a unique name, normally the package
 | 
						|
name.  In the following code all strings which have to be translated are
 | 
						|
filtered through the gettext function.  That's all, the package speaks
 | 
						|
your language.
 | 
						|
 | 
						|
</P>
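<P>
The easiest case described above might look like this in a program.  The
function names <CODE>init_messages</CODE> and <CODE>greeting</CODE> are
hypothetical, as is the message text.  The call to <CODE>setlocale</CODE> is
an assumption of this sketch rather than something stated in the text: it is
the standard C way of adopting the user's locale from the environment.
</P>

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>

/* Hypothetical start-up routine: adopt the user's locale (an assumption
   of this sketch) and select the package's message domain once.  */
void
init_messages (const char *package_name)
{
  assert (package_name != NULL);

  setlocale (LC_ALL, "");
  textdomain (package_name);
}

/* Every translatable string is filtered through gettext.  If no
   translation is available, the argument itself comes back.  */
const char *
greeting (void)
{
  return gettext ("Hello, world!");
}
```

<P>
A program would call <CODE>init_messages</CODE> once at the beginning of
execution and then print <CODE>greeting ()</CODE> wherever the message is
needed.
</P>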
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC42" HREF="gettext_toc.html#TOC42">Solving Ambiguities</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
While this single-name domain works well for most applications there
 | 
						|
might be the need to get translations from more than one domain.  Of
 | 
						|
course one could switch between different domains with calls to
 | 
						|
<CODE>textdomain</CODE>, but this is really not convenient nor is it fast.  A
 | 
						|
possible situation could be one case discussing while this writing:  all
 | 
						|
error messages of functions in the set of common used functions should
 | 
						|
go into a separate domain <CODE>error</CODE>.  By this mean we would only need
 | 
						|
to translate them once.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
For this reason there are two more functions to retrieve strings:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
char *dgettext (const char *domain_name, const char *msgid);
 | 
						|
char *dcgettext (const char *domain_name, const char *msgid,
 | 
						|
                 int category);
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
Both take an additional first argument, which corresponds to the
argument of <CODE>textdomain</CODE>.  The third argument of
<CODE>dcgettext</CODE> allows the use of a locale category other than
<CODE>LC_MESSAGES</CODE>, although I really don't know where this can be
useful.  If the <VAR>domain_name</VAR> is <CODE>NULL</CODE> or
<VAR>category</VAR> has a value other than the known ones, the result is
undefined.  It should also be noted that this function is not part of
the second known implementation of this function family, the one found
in Solaris.
 | 
						|
 | 
						|
</P>
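<P>
The separate <CODE>error</CODE> domain mentioned above could be used
roughly as follows.  This is only a sketch: the helper names
<CODE>error_text</CODE> and <CODE>error_text_explicit</CODE> are invented
for the example, and it assumes that no special setup beyond the two
calls is wanted.
</P>

<PRE>
#include <libintl.h>
#include <locale.h>

/* Look up MSGID in the shared "error" domain without touching the
   domain selected by textdomain.  */
const char *
error_text (const char *msgid)
{
  return dgettext ("error", msgid);
}

/* The same lookup with the category spelled out explicitly.  */
const char *
error_text_explicit (const char *msgid)
{
  return dcgettext ("error", msgid, LC_MESSAGES);
}
</PRE>

<P>
Both helpers return the argument itself when the <CODE>error</CODE>
domain has no translation for it.
</P>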
 | 
						|
<P>
 | 
						|
A second ambiguity can arise from the fact that more than one domain may
have the same name.  This can be solved by specifying where the
needed message catalog files can be found.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
char *bindtextdomain (const char *domain_name,
 | 
						|
                      const char *dir_name);
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
Calling this function binds the given domain to a file in the specified
directory (how this file is determined follows below).  In particular, a
file in the system's default location is no longer preferred over the
specified file (as it would be when only <CODE>textdomain</CODE> is
used).  A <CODE>NULL</CODE> pointer for the <VAR>dir_name</VAR> parameter
returns the binding associated with <VAR>domain_name</VAR>.  If
<VAR>domain_name</VAR> itself is <CODE>NULL</CODE>, nothing happens and a
<CODE>NULL</CODE> pointer is returned.  Here again, as for all the other
functions, none of the return values may be modified!
 | 
						|
 | 
						|
</P>
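<P>
A minimal sketch of setting and then querying a binding is shown here.
The helper name <CODE>bind_and_query</CODE> and any domain or directory
passed to it are assumptions of the example; the directory does not have
to exist for the binding to be recorded.
</P>

<PRE>
#include <libintl.h>
#include <stddef.h>

/* Bind DOMAIN to DIR, then ask for the current binding by passing
   NULL as the directory.  The returned string must not be modified.  */
const char *
bind_and_query (const char *domain, const char *dir)
{
  if (bindtextdomain (domain, dir) == NULL)
    return NULL;                        /* binding failed */
  return bindtextdomain (domain, NULL); /* query only */
}
</PRE>

<P>
For instance, <CODE>bind_and_query ("example-pkg", "/tmp/example-locale")</CODE>
is expected to hand back the directory that was just bound.
</P>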
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC43" HREF="gettext_toc.html#TOC43">Locating Message Catalog Files</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
Because many different languages for many different packages have to be
stored, we need some way to encode this information for the message
catalog files.  The usual way in Unix environments is to encode it in
the file name, and this is also done here.  The directory name given as
<CODE>bindtextdomain</CODE>'s second argument (or the default directory),
the value and name of the locale, and the domain name are
concatenated:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
<VAR>dir_name</VAR>/<VAR>locale</VAR>/LC_<VAR>category</VAR>/<VAR>domain_name</VAR>.mo
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
The default value for <VAR>dir_name</VAR> is system specific.  For the GNU
 | 
						|
library it's:
 | 
						|
 | 
						|
<PRE>
 | 
						|
/usr/local/share/locale
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
<VAR>locale</VAR> is the value of the locale whose name is this
 | 
						|
<CODE>LC_<VAR>category</VAR></CODE>.  For <CODE>gettext</CODE> and <CODE>dgettext</CODE> this
 | 
						|
locale is always <CODE>LC_MESSAGES</CODE>.  <CODE>dcgettext</CODE> specifies the
 | 
						|
locale by the third argument.<A NAME="DOCF2" HREF="gettext_foot.html#FOOT2">(2)</A> <A NAME="DOCF3" HREF="gettext_foot.html#FOOT3">(3)</A>
 | 
						|
 | 
						|
</P>
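<P>
The file name pattern given in this section can be mimicked with a few
lines of C.  This is only an illustration of the concatenation, not the
code used inside the library; the helper name <CODE>catalog_path</CODE>
and its parameters are invented for the example, and the
<VAR>category</VAR> argument is assumed to be the full category name such
as <CODE>LC_MESSAGES</CODE>.
</P>

<PRE>
#include <stdio.h>

/* Write DIR_NAME/LOCALE/CATEGORY/DOMAIN_NAME.mo into BUF.
   Returns 0 on success and -1 if BUF is too small.  */
int
catalog_path (char *buf, size_t size, const char *dir_name,
              const char *locale, const char *category,
              const char *domain_name)
{
  int n = snprintf (buf, size, "%s/%s/%s/%s.mo",
                    dir_name, locale, category, domain_name);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
</PRE>

<P>
With the default directory shown above, the locale <CODE>de</CODE>, the
category <CODE>LC_MESSAGES</CODE> and a domain called <CODE>hello</CODE>,
this yields <TT>`/usr/local/share/locale/de/LC_MESSAGES/hello.mo'</TT>.
</P>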
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC44" HREF="gettext_toc.html#TOC44">Optimization of the *gettext functions</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
At this point of the discussion we should talk about an advantage of the
 | 
						|
GNU <CODE>gettext</CODE> implementation.  Some readers might have pointed out
 | 
						|
that an internationalized program might have a poor performance if some
 | 
						|
string has to be translated in an inner loop.  While this is unavoidable
 | 
						|
when the string varies from one run of the loop to the other it is
 | 
						|
simply a waste of time when the string is always the same.  Take the
 | 
						|
following example:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
{
 | 
						|
  while (...)
 | 
						|
    {
 | 
						|
      puts (gettext ("Hello world"));
 | 
						|
    }
 | 
						|
}
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
When the locale selection does not change between two runs the resulting
 | 
						|
string is always the same.  One way to use this is:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
{
 | 
						|
  char *str = gettext ("Hello world");
 | 
						|
  while (...)
 | 
						|
    {
 | 
						|
      puts (str);
 | 
						|
    }
 | 
						|
}
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
But this solution is not usable in all situations (e.g. when the locale
selection changes), nor is it very readable.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The GNU C compiler, version 2.7 and above, provides another solution for
this.  To describe it we show here some lines of the
<TT>`intl/libgettext.h'</TT> file.  For an explanation of the statement
expression construct see section `Statements and Declarations in Expressions' in <CITE>The GNU CC Manual</CITE>.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
#  if defined __GNUC__ && __GNUC__ == 2 && __GNUC_MINOR__ >= 7
 | 
						|
#   define	dcgettext(domainname, msgid, category)           \
 | 
						|
  (__extension__                                                 \
 | 
						|
   ({                                                            \
 | 
						|
     char *result;                                               \
 | 
						|
     if (__builtin_constant_p (msgid))                           \
 | 
						|
       {                                                         \
 | 
						|
         extern int _nl_msg_cat_cntr;                            \
 | 
						|
         static char *__translation__;                           \
 | 
						|
         static int __catalog_counter__;                         \
 | 
						|
         if (! __translation__                                   \
 | 
						|
             || __catalog_counter__ != _nl_msg_cat_cntr)         \
 | 
						|
           {                                                     \
 | 
						|
             __translation__ =                                   \
 | 
						|
               dcgettext__ ((domainname), (msgid), (category));  \
 | 
						|
             __catalog_counter__ = _nl_msg_cat_cntr;             \
 | 
						|
           }                                                     \
 | 
						|
         result = __translation__;                               \
 | 
						|
       }                                                         \
 | 
						|
     else                                                        \
 | 
						|
       result = dcgettext__ ((domainname), (msgid), (category)); \
 | 
						|
     result;                                                     \
 | 
						|
    }))
 | 
						|
#  endif
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
The interesting thing here is the <CODE>__builtin_constant_p</CODE> predicate.
This is evaluated at compile time and so optimization can take place
immediately.  Two cases are distinguished.  If the argument to
<CODE>gettext</CODE> is not a constant value, the function
<CODE>dcgettext__</CODE>, the real implementation of the
<CODE>dcgettext</CODE> function, is simply called.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
If the string argument <EM>is</EM> constant we can reuse the translation
obtained earlier, as long as the locale selection has not changed.  This
is exactly what is done here.  The <CODE>_nl_msg_cat_cntr</CODE> variable
is defined in the file <TT>`loadmsgcat.c'</TT>, which is part of
<TT>`libintl.a'</TT>, and is changed whenever a new message catalog is
loaded.
 | 
						|
 | 
						|
</P>
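<P>
The caching idea behind the macro can also be written as an ordinary
function, without GNU C extensions.  The following sketch is not taken
from the library: <CODE>catalog_counter</CODE> merely stands in for
<CODE>_nl_msg_cat_cntr</CODE>, and <CODE>slow_lookup</CODE> is a
hypothetical replacement for the real lookup that just counts how often
it runs.
</P>

<PRE>
#include <stddef.h>

/* Stand-in for _nl_msg_cat_cntr: incremented whenever a new
   catalog is loaded.  */
static int catalog_counter = 0;

/* Counts how often the expensive lookup really runs.  */
static int lookup_calls = 0;

/* Hypothetical lookup; here it simply returns the msgid.  */
static const char *
slow_lookup (const char *msgid)
{
  ++lookup_calls;
  return msgid;
}

/* Return the translation of a constant string, repeating the
   lookup only when the catalog counter has changed.  */
const char *
cached_hello (void)
{
  static const char *translation;
  static int seen_counter;

  if (translation == NULL || seen_counter != catalog_counter)
    {
      translation = slow_lookup ("Hello world");
      seen_counter = catalog_counter;
    }
  return translation;
}
</PRE>

<P>
Repeated calls to <CODE>cached_hello</CODE> perform one lookup; a further
lookup happens only after <CODE>catalog_counter</CODE> has been
incremented.
</P>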
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC45" HREF="gettext_toc.html#TOC45">Comparing the Two Interfaces</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
The following discussion is perhaps a little bit colored.  As said
 | 
						|
above we implemented GNU <CODE>gettext</CODE> following the Uniforum
 | 
						|
proposal and this surely has its reasons.  But it should show how we
 | 
						|
came to this decision.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
First we take a look at the development process.  When we write an
application using NLS provided by <CODE>gettext</CODE> we proceed as always.
Only when we come to a string which might be seen by the users and thus
has to be translated do we use <CODE>gettext("...")</CODE> instead of
<CODE>"..."</CODE>.  At the beginning of each source file (or in a central
header file) we define
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
#define gettext(String) (String)
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
Even this definition can be avoided when the system supports the
 | 
						|
<CODE>gettext</CODE> function in its C library.  When we compile this code the
 | 
						|
result is the same as if no NLS code were used.  When you take a look at
 | 
						|
the GNU <CODE>gettext</CODE> code you will see that we use <CODE>_("...")</CODE>
 | 
						|
instead of <CODE>gettext("...")</CODE>.  This reduces the number of
 | 
						|
additional characters per translatable string to <EM>3</EM> (in words:
 | 
						|
three).
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
When a production version of the program is needed, we simply replace
 | 
						|
the definition
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
#define _(String) (String)
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
by
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
#include <libintl.h>
 | 
						|
#define _(String) gettext (String)
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
and include the header <TT>`libintl.h'</TT>.  Additionally we run the
program <TT>`xgettext'</TT> on all source code files which contain
translatable strings, and we are done.  We have a running program which
does not depend on translations being available, but which can use any
that becomes available.
 | 
						|
 | 
						|
</P>
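<P>
Instead of editing the definition by hand, the two variants can be
selected with the preprocessor.  The sketch below assumes a
configuration macro named <CODE>ENABLE_NLS</CODE> that the build defines
when NLS is wanted; both that macro and the helper
<CODE>usage_text</CODE> are assumptions of this example.
</P>

<PRE>
#ifdef ENABLE_NLS
# include <libintl.h>
# define _(String) gettext (String)
#else
# define _(String) (String)
#endif

/* The source text looks the same in both configurations.  */
const char *
usage_text (void)
{
  return _("Usage: prog [OPTION]...");
}
</PRE>

<P>
When <CODE>ENABLE_NLS</CODE> is not defined the code compiles exactly as
if no NLS support were present.
</P>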
 | 
						|
<P>
 | 
						|
The same procedure can be done for the <CODE>gettext_noop</CODE> invocations
 | 
						|
(see section <A HREF="gettext.html#SEC17">Special Cases of Translatable Strings</A>).  First you can define <CODE>gettext_noop</CODE> to a
 | 
						|
no-op macro and later use the definition from <TT>`libintl.h'</TT>.  Because
 | 
						|
this name is not used in Sun's implementation of <TT>`libintl.h'</TT>,
 | 
						|
you should consider the following code for your project:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
#ifdef gettext_noop
 | 
						|
# define N_(Str) gettext_noop (Str)
 | 
						|
#else
 | 
						|
# define N_(Str) (Str)
 | 
						|
#endif
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
<CODE>N_</CODE> is a short form similar to <CODE>_</CODE>.  The <TT>`Makefile'</TT> in
 | 
						|
the <TT>`po/'</TT> directory of GNU gettext knows by default both of the
 | 
						|
mentioned short forms so you are invited to follow this proposal for
 | 
						|
your own ease.
 | 
						|
 | 
						|
</P>
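<P>
A typical place for <CODE>N_</CODE> is a static initializer, where a
function call is not allowed.  The strings are only marked there and are
translated later, at the point of use.  The table
<CODE>weekday_names</CODE> and the helper <CODE>weekday_name</CODE> are
made up for this sketch, and <CODE>N_</CODE> is defined here in its no-op
form.
</P>

<PRE>
#include <libintl.h>

#define _(Str) gettext (Str)
#define N_(Str) (Str)

/* Marked for extraction, but not translated at this point.  */
static const char *const weekday_names[] =
{
  N_("Monday"),
  N_("Tuesday")
};

/* Translation happens when the string is actually used.  */
const char *
weekday_name (int index)
{
  return _(weekday_names[index]);
}
</PRE>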
 | 
						|
<P>
 | 
						|
Now to <CODE>catgets</CODE>.  The main problem is the work for the
programmer.  Every time he comes to a translatable string he has to
define a number (or a symbolic constant) which also has to be defined in
the message catalog file.  He also has to take care of duplicate
entries, duplicate message IDs etc.  If he wants to have the same
quality in the message catalog as the GNU <CODE>gettext</CODE> program
provides he also has to put the descriptive comments for the strings and
the location in all source code files in the message catalog.  This is
nearly a Mission: Impossible.
 | 
						|
 | 
						|
</P>
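<P>
To make the bookkeeping concrete, here is a small sketch of the
<CODE>catgets</CODE> style using the X/Open functions from
<TT>`nl_types.h'</TT>.  The catalog name <CODE>example-pkg</CODE>, the
constants <CODE>MSG_SET</CODE> and <CODE>MSG_NUMBER_LABEL</CODE> and the
helper names are assumptions of the example; the same numbers would also
have to appear in the message catalog source.
</P>

<PRE>
#include <nl_types.h>

/* Numbers chosen by the programmer; they must match the catalog.  */
#define MSG_SET          1
#define MSG_NUMBER_LABEL 1

static nl_catd catalog = (nl_catd) -1;

/* Open the (hypothetical) catalog once at program start.  */
void
open_catalog (void)
{
  catalog = catopen ("example-pkg", NL_CAT_LOCALE);
}

/* Fetch message 1 of set 1, falling back to the English text
   when the catalog could not be opened.  */
const char *
number_label (void)
{
  if (catalog == (nl_catd) -1)
    return "number";
  return catgets (catalog, MSG_SET, MSG_NUMBER_LABEL, "number");
}
</PRE>

<P>
Note that the English text appears only as a fallback; the message
itself is identified by the pair of numbers.
</P>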
 | 
						|
<P>
 | 
						|
But there are also some points people might call advantages speaking for
 | 
						|
<CODE>catgets</CODE>.  If you have a single word in a string and this string
 | 
						|
is used in different contexts it is likely that in one or the other
 | 
						|
language the word has different translations.  Example:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
printf ("%s: %d", gettext ("number"), number_of_errors);

printf ("you should see %d %s", number_count,
        number_count == 1 ? gettext ("number") : gettext ("numbers"));
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
Here we have to translate the string <CODE>"number"</CODE> twice.  Even
if you do not speak a language other than English it might be possible to
recognize that the two words have a different meaning.  In German the
first appearance has to be translated to <CODE>"Anzahl"</CODE> and the second
to <CODE>"Zahl"</CODE>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Now you can say that this example is really esoteric.  And you are
right!  This is exactly how we felt about this problem, and we decided
that it does not weigh that much.  The solution for the above problem
could be very easy:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
printf (gettext ("number: %d"), number_of_errors);

printf (number_count == 1 ? gettext ("you should see %d number")
                          : gettext ("you should see %d numbers"),
        number_count);
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
We believe that we can solve all conflicts with this method.  If it is
difficult one can also consider changing one of the conflicting strings a
little bit.  But it is not impossible to overcome.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Translator note: It is perhaps appropriate here to tell English-speaking
programmers that the plural form of a noun cannot, in general, be formed
by appending a single `s'.  Most other languages use different methods.
So you should at least use the method given in the above example.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
But I have been told that some languages have even more complex rules.
 | 
						|
A good approach might be to consider methods like the one used for
 | 
						|
<CODE>LC_TIME</CODE> in the POSIX.2 standard.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC46" HREF="gettext_toc.html#TOC46">Using libintl.a in own programs</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
Starting with version 0.9.4 the library <CODE>libintl.a</CODE> should be more
or less self-contained, i.e. you can use it in your own programs.  The
<TT>`Makefile'</TT> will put the header and the library in directories
selected using the <CODE>$(prefix)</CODE>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
One exception to the above is found on HP-UX systems.  Here the C library
does not contain the <CODE>alloca</CODE> function (and the HP compiler does
not generate it inline).  But it is not intended to rewrite the whole
library just because of this dumb system.  Instead, include the
<CODE>alloca</CODE> function in every package in which you use <CODE>libintl.a</CODE>.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC47" HREF="gettext_toc.html#TOC47">Being a <CODE>gettext</CODE> grok</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
To fully exploit the functionality of the GNU <CODE>gettext</CODE> library it
is surely helpful to read the source code.  But for those who don't want
to spend that much time reading the (sometimes complicated) code, here
is a list of comments:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<UL>
 | 
						|
<LI>Changing the language at runtime
 | 
						|
 | 
						|
For interactive programs it might be useful to offer a selection of the
language used at runtime.  To understand how to do this one needs to know
how the language is determined while executing the <CODE>gettext</CODE>
function.  The method presented here only works correctly
with the GNU implementation of the <CODE>gettext</CODE> functions.  It is not
possible with underlying <CODE>catgets</CODE> functions or <CODE>gettext</CODE>
functions from the system's C library.  The exception is of course the
GNU C Library, which uses the GNU gettext Library for message handling.
 | 
						|
 | 
						|
In the function <CODE>dcgettext</CODE> at every call the current setting of
 | 
						|
the highest priority environment variable is determined and used.
 | 
						|
Highest priority means here the following list with decreasing
 | 
						|
priority:
 | 
						|
 | 
						|
 | 
						|
<OL>
 | 
						|
<LI><CODE>LANGUAGE</CODE>
 | 
						|
 | 
						|
<LI><CODE>LC_ALL</CODE>
 | 
						|
 | 
						|
<LI><CODE>LC_xxx</CODE>, according to selected locale
 | 
						|
 | 
						|
<LI><CODE>LANG</CODE>
 | 
						|
 | 
						|
</OL>
 | 
						|
 | 
						|
Afterwards the path is constructed using the found value and the
 | 
						|
translation file is loaded if available.
 | 
						|
 | 
						|
What happens when the value of, say, <CODE>LANGUAGE</CODE> changes?  According
to the process explained above the new value of this variable is found
as soon as the <CODE>dcgettext</CODE> function is called.  But this also means
that the (perhaps) different message catalog file is loaded.  In other
words: the language in use is changed.
 | 
						|
 | 
						|
But there is one little catch.  The code for gcc-2.7.0 and up provides
some optimization.  This optimization normally prevents the calling of
the <CODE>dcgettext</CODE> function as long as no new catalog is loaded.  But
if <CODE>dcgettext</CODE> is not called, the program cannot notice that the
<CODE>LANGUAGE</CODE> variable has changed (see section <A HREF="gettext.html#SEC44">Optimization of the *gettext functions</A>).  But the
solution is very easy.  Include the following code in the language
switching function.
 | 
						|
 | 
						|
 | 
						|
<PRE>
 | 
						|
  /* Change language.  */
 | 
						|
  setenv ("LANGUAGE", "fr", 1);
 | 
						|
 | 
						|
  /* Make change known.  */
 | 
						|
  {
 | 
						|
    extern int  _nl_msg_cat_cntr;
 | 
						|
    ++_nl_msg_cat_cntr;
 | 
						|
  }
 | 
						|
</PRE>
 | 
						|
 | 
						|
The variable <CODE>_nl_msg_cat_cntr</CODE> is defined in <TT>`loadmsgcat.c'</TT>.
 | 
						|
 | 
						|
</UL>
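<P>
The priority list given above for the environment variables can be
sketched as a simple loop.  This is a simplified illustration for the
<CODE>LC_MESSAGES</CODE> category, not the code used in the library: the
helper name <CODE>message_locale_from_env</CODE> is invented, and treating
an empty value like an unset variable and falling back to <CODE>"C"</CODE>
are assumptions of the example.
</P>

<PRE>
#include <stdlib.h>

/* Return the value of the highest priority variable that is set,
   or "C" if none of them is.  */
const char *
message_locale_from_env (void)
{
  static const char *const names[] =
    { "LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG" };
  size_t i;

  for (i = 0; i < sizeof names / sizeof names[0]; ++i)
    {
      const char *value = getenv (names[i]);
      if (value != NULL && value[0] != '\0')
        return value;
    }
  return "C";
}
</PRE>

<P>
With <CODE>LANGUAGE</CODE> set to <CODE>fr</CODE> and <CODE>LANG</CODE>
set to <CODE>de</CODE>, the function returns <CODE>fr</CODE>; once
<CODE>LANGUAGE</CODE>, <CODE>LC_ALL</CODE> and <CODE>LC_MESSAGES</CODE>
are unset it returns <CODE>de</CODE>.
</P>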
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC48" HREF="gettext_toc.html#TOC48">Temporary Notes for the Programmers Chapter</A></H2>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC49" HREF="gettext_toc.html#TOC49">Temporary - Two Possible Implementations</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
There are two competing methods for language independent messages:
 | 
						|
the X/Open <CODE>catgets</CODE> method, and the Uniforum <CODE>gettext</CODE>
 | 
						|
method.  The <CODE>catgets</CODE> method indexes messages by integers; the
 | 
						|
<CODE>gettext</CODE> method indexes them by their original English strings.
 | 
						|
The <CODE>catgets</CODE> method has been around longer and is supported
 | 
						|
by more vendors.  The <CODE>gettext</CODE> method is supported by Sun,
 | 
						|
and it has been heard that the COSE multi-vendor initiative is
 | 
						|
supporting it.  Neither method is a POSIX standard; the POSIX.1
 | 
						|
committee had a lot of disagreement in this area.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Neither one is in the POSIX standard.  There was much disagreement
 | 
						|
in the POSIX.1 committee about using the <CODE>gettext</CODE> routines
 | 
						|
vs. <CODE>catgets</CODE> (XPG).  In the end the committee couldn't
 | 
						|
agree on anything, so no messaging system was included as part
 | 
						|
of the standard.  I believe the informative annex of the standard
 | 
						|
includes the XPG3 messaging interfaces, "...as an example of
 | 
						|
a messaging system that has been implemented..."
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
They were very careful not to say anywhere that you should use one
 | 
						|
set of interfaces over the other.  For more on this topic please
 | 
						|
see the Programming for Internationalization FAQ.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC50" HREF="gettext_toc.html#TOC50">Temporary - About <CODE>catgets</CODE></A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
There have been a few discussions of late on the use of
 | 
						|
<CODE>catgets</CODE> as a base.  I think it important to present both
 | 
						|
sides of the argument and hence am opting to play devil's advocate
 | 
						|
for a little bit.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
I'll not deny the fact that <CODE>catgets</CODE> could have been designed
 | 
						|
a lot better.  It currently has quite a number of limitations and
 | 
						|
these have already been pointed out.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
However there is a great deal to be said for consistency and
 | 
						|
standardization.  A common recurring problem when writing Unix
 | 
						|
software is the myriad portability problems across Unix platforms.
 | 
						|
It seems as if every Unix vendor had a look at the operating system
 | 
						|
and found parts they could improve upon.  No doubt these
modifications are innovative and solve real problems.
 | 
						|
However, software developers have a hard time keeping up with all
 | 
						|
these changes across so many platforms.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
And this has prompted the Unix vendors to begin to standardize their
 | 
						|
systems.  Hence the impetus for Spec1170.  Every major Unix vendor
 | 
						|
has committed to supporting this standard and every Unix software
 | 
						|
developer awaits with glee the day they can write software to this
 | 
						|
standard and simply recompile (without having to use autoconf)
 | 
						|
across different platforms.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
As I understand it, Spec1170 is roughly based upon version 4 of the
 | 
						|
X/Open Portability Guidelines (XPG4).  Because <CODE>catgets</CODE> and
 | 
						|
friends are defined in XPG4, I'm led to believe that <CODE>catgets</CODE>
 | 
						|
is a part of Spec1170 and hence will become a standardized component
 | 
						|
of all Unix systems.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC51" HREF="gettext_toc.html#TOC51">Temporary - Why a single implementation</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
Now it seems kind of wasteful to me to have two different systems
 | 
						|
installed for accessing message catalogs.  If we do want to remedy the
deficiencies of <CODE>catgets</CODE>, why don't we try to expand <CODE>catgets</CODE>
(in a compatible manner) rather than implement an entirely new system?
 | 
						|
Otherwise, we'll end up with two message catalog access systems
 | 
						|
installed with an operating system - one set of routines for GNU
 | 
						|
software, and another set of routines (catgets) for all other software.
 | 
						|
Bloated?
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Suppose another catalog access system is implemented.  Which do
 | 
						|
we recommend?  At least for Linux, we need to attract as many
 | 
						|
software developers as possible.  Hence we need to make it as easy
 | 
						|
for them to port their software as possible.  Which means supporting
 | 
						|
<CODE>catgets</CODE>.  We will be implementing the <CODE>glocale</CODE> code
 | 
						|
within our <CODE>libc</CODE>, but does this mean we also have to incorporate
 | 
						|
another message catalog access scheme within our <CODE>libc</CODE> as well?
 | 
						|
And what about people who are going to be using the <CODE>glocale</CODE>
+ non-<CODE>catgets</CODE> routines?  When they port their software to
 | 
						|
other platforms, they're now going to have to include the front-end
 | 
						|
(<CODE>glocale</CODE>) code plus the back-end code (the non-<CODE>catgets</CODE>
 | 
						|
access routines) with their software instead of just including the
 | 
						|
<CODE>glocale</CODE> code with their software.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Message catalog support is however only the tip of the iceberg.
 | 
						|
What about the data for the other locale categories?  They also have
 | 
						|
a number of deficiencies.  Are we going to abandon them as well and
 | 
						|
develop another duplicate set of routines (should <CODE>glocale</CODE>
 | 
						|
expand beyond message catalog support)?
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Like many parts of Unix that can be improved upon, we're stuck with balancing
 | 
						|
compatibility with the past with useful improvements and innovations for
 | 
						|
the future.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC52" HREF="gettext_toc.html#TOC52">Temporary - Double layer solution</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
GNU locale implements a <CODE>gettext</CODE>-style interface on top of a
 | 
						|
<CODE>catgets</CODE>-style interface.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
This is not needless complexity.  It is absolutely vital, because
 | 
						|
it enables <CODE>gettext</CODE> to run on top of <CODE>catgets</CODE>, which
 | 
						|
enables Linux International to recommend users use it <EM>today</EM>.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Rewriting <CODE>gettext</CODE> so that it could use <EM>either</EM>
 | 
						|
<CODE>catgets</CODE> <EM>or</EM> some simpler mechanism would not break
 | 
						|
anything, but would not reduce complexity either.  It might be
 | 
						|
worth doing, but it isn't urgent.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
In general, simplicity is not enough of a reason to rewrite a
 | 
						|
program that works.  Simplicity is just one desirable thing.
 | 
						|
It is not overridingly important.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC53" HREF="gettext_toc.html#TOC53">Temporary - Notes</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
X/Open agreed very late on the standard form so that many
 | 
						|
implementations differ from the final form.  Both of my systems (old
Linux catgets and Ultrix-4) have a strange variation.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
OK.  After incorporating the last changes I have to spend some time on
making the GNU/Linux libc gettext functions.  So in the future Solaris
will not be the only system having gettext.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H1><A NAME="SEC54" HREF="gettext_toc.html#TOC54">The Translator's View</A></H1>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC55" HREF="gettext_toc.html#TOC55">Introduction 0</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
GNU is going international!  The GNU Translation Project is a way
 | 
						|
to get maintainers, translators and users all together, so GNU will
 | 
						|
gradually become able to speak many native languages.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The GNU <CODE>gettext</CODE> tool set contains <EM>everything</EM> maintainers
 | 
						|
need for internationalizing their packages for messages.  It also
 | 
						|
contains quite useful tools for helping translators in localizing
 | 
						|
messages to their native language, once a package has already been
 | 
						|
internationalized.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
To achieve the GNU Translation Project, we need many interested
 | 
						|
people who like their own language and write it well, and who are also
 | 
						|
able to synergize with other translators speaking the same language.
 | 
						|
If you'd like to volunteer to <EM>work</EM> at translating messages,
 | 
						|
please send mail to your translating team.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Each team has its own mailing list, courtesy of Linux
 | 
						|
International.  You may reach your translating team at the address
 | 
						|
<TT>`<VAR>ll</VAR>@li.org'</TT>, replacing <VAR>ll</VAR> by the two-letter ISO 639
 | 
						|
code for your language.  Language codes are <EM>not</EM> the same as
 | 
						|
country codes given in ISO 3166.  The following translating teams
 | 
						|
exist:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<BLOCKQUOTE>
 | 
						|
<P>
 | 
						|
Chinese <CODE>zh</CODE>, Czech <CODE>cs</CODE>, Danish <CODE>da</CODE>, Dutch <CODE>nl</CODE>,
 | 
						|
Esperanto <CODE>eo</CODE>, Finnish <CODE>fi</CODE>, French <CODE>fr</CODE>, Irish
 | 
						|
<CODE>ga</CODE>, German <CODE>de</CODE>, Greek <CODE>el</CODE>, Italian <CODE>it</CODE>,
 | 
						|
Japanese <CODE>ja</CODE>, Indonesian <CODE>in</CODE>, Norwegian <CODE>no</CODE>, Polish
 | 
						|
<CODE>pl</CODE>, Portuguese <CODE>pt</CODE>, Russian <CODE>ru</CODE>, Spanish <CODE>es</CODE>,
 | 
						|
Swedish <CODE>sv</CODE> and Turkish <CODE>tr</CODE>.
 | 
						|
</BLOCKQUOTE>
 | 
						|
 | 
						|
<P>
 | 
						|
For example, you may reach the Chinese translating team by writing to
 | 
						|
<TT>`zh@li.org'</TT>.  When you become a member of the translating team
 | 
						|
for your own language, you may subscribe to its list.  For example,
 | 
						|
Swedish people can send a message to <TT>`sv-request@li.org'</TT>,
 | 
						|
having this message body:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
subscribe
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
Keep in mind that team members should be interested in <EM>working</EM>
 | 
						|
at translations, or at solving translational difficulties, rather than
 | 
						|
merely lurking around.  If your team does not exist yet and you want to
 | 
						|
start one, please write to <TT>`gnu-translation@prep.ai.mit.edu'</TT>;
 | 
						|
you will then reach the GNU coordinator for all translator teams.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
A handful of GNU packages have already been adapted and provided
 | 
						|
with message translations for several languages.  Translation
 | 
						|
teams have begun to organize, using these packages as a starting
 | 
						|
point.  But there are many more packages and many languages for
 | 
						|
which we have no volunteer translators.  If you would like to
 | 
						|
volunteer to work at translating messages, please send mail to
 | 
						|
<TT>`gnu-translation@prep.ai.mit.edu'</TT> indicating what language(s)
 | 
						|
you can work on.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC56" HREF="gettext_toc.html#TOC56">Introduction 1</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
This is now official, GNU is going international!  Here is the
 | 
						|
announcement submitted for the January 1995 GNU Bulletin:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<BLOCKQUOTE>
 | 
						|
<P>
 | 
						|
A handful of GNU packages have already been adapted and provided
 | 
						|
with message translations for several languages.  Translation
 | 
						|
teams have begun to organize, using these packages as a starting
 | 
						|
point.  But there are many more packages and many languages
 | 
						|
for which we have no volunteer translators.  If you'd like to
 | 
						|
volunteer to work at translating messages, please send mail to
 | 
						|
<SAMP>`gnu-translation@prep.ai.mit.edu'</SAMP> indicating what language(s)
 | 
						|
you can work on.
 | 
						|
</BLOCKQUOTE>
 | 
						|
 | 
						|
<P>
 | 
						|
This document should answer many questions for those who are curious
 | 
						|
about the process or would like to contribute.  Please at least skim
 | 
						|
over it; we hope this will cut down a little on the high volume of email
 | 
						|
generated by this collective effort towards GNU internationalization.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
GNU programming is done in English, and currently, English is used
 | 
						|
as the main communicating language between national communities
 | 
						|
collaborating on the GNU project.  This very document is written
 | 
						|
in English.  This will not change in the foreseeable future.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
However, there is a strong appetite from national communities for
 | 
						|
having more software able to write using national language and habits,
 | 
						|
and there is an on-going effort to modify GNU software in such a way
 | 
						|
that it becomes able to do so.  The experiments conducted so far have raised
 | 
						|
an enthusiastic response from pretesters, so we believe that GNU
 | 
						|
internationalization is destined to succeed.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
To suggest clarifications, additions or corrections to this
 | 
						|
document, please send email to <TT>`gnu-translation@prep.ai.mit.edu'</TT>.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC57" HREF="gettext_toc.html#TOC57">Discussions</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
Facing this internationalization effort, a few users expressed their
 | 
						|
concerns.  Some of these doubts are presented and discussed here.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<UL>
 | 
						|
<LI>Smaller groups
 | 
						|
 | 
						|
Some languages are not spoken by a very large number of people,
 | 
						|
so people speaking them sometimes consider that there may not be
 | 
						|
all that much demand for such versions of GNU packages.  Moreover, many
 | 
						|
people who are <EM>into computers</EM>, in some countries, generally seem
 | 
						|
to prefer English versions of their software.
 | 
						|
 | 
						|
On the other hand, people might enjoy their own language a lot, and
 | 
						|
be very motivated to give themselves the pleasure of having
 | 
						|
their beloved GNU software speaking their mother tongue.  They do
 | 
						|
themselves a personal favor, and do not pay that much attention to
 | 
						|
the number of people benefiting from their work.
 | 
						|
 | 
						|
<LI>Misinterpretation
 | 
						|
 | 
						|
Other users are shy to push forward their own language, seeing in this
 | 
						|
some kind of misplaced propaganda.  Someone thought there must be some
 | 
						|
users of the language over the networks pestering other people with it.
 | 
						|
 | 
						|
But any spoken language is worth localization, because there are
 | 
						|
people behind the language for whom the language is important and
 | 
						|
dear to their hearts.
 | 
						|
 | 
						|
<LI>Odd translations
 | 
						|
 | 
						|
The biggest problem is to find the right translations so that
 | 
						|
everybody can understand the messages.  Translations are usually a
 | 
						|
little odd.  Some people get used to English, to the extent they may
 | 
						|
find translations into their own language "rather pushy, obnoxious
 | 
						|
and sometimes even hilarious."  As a French-speaking man, I have
 | 
						|
the experience of those instruction manuals for goods, so poorly
 | 
						|
translated into French in Korea or Taiwan...
 | 
						|
 | 
						|
The fact is that we sometimes have to create a kind of national
 | 
						|
computer culture, and this is not easy without the collaboration of
 | 
						|
many people who like their mother tongue.  This is why translations are
 | 
						|
better achieved by people knowing and loving their own language, and
 | 
						|
ready to work together at improving the results they obtain.
 | 
						|
 | 
						|
<LI>Dependencies over the GPL
 | 
						|
 | 
						|
Some people wonder if using GNU <CODE>gettext</CODE> necessarily brings their package
 | 
						|
under the protective wing of the GNU General Public License, when they
 | 
						|
do not want to make their program free, or want other kinds of freedom.
 | 
						|
The simplest answer is yes.
 | 
						|
 | 
						|
The mere marking of localizable strings in a package, or conditional
 | 
						|
inclusion of a few lines for initialization, is not really including
 | 
						|
GPL'ed code.  However, the localization routines themselves are under
 | 
						|
the GPL and would bring the remainder of the package under the GPL
 | 
						|
if they were distributed with it.  So, I presume that, for those
 | 
						|
for whom this is a problem, it could be circumvented by leaving to
 | 
						|
the end installers the burden of assembling a package prepared for
 | 
						|
localization, but not providing the localization routines themselves.
 | 
						|
 | 
						|
</UL>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC58" HREF="gettext_toc.html#TOC58">Organization</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
On a larger scale, the true solution would be to organize some kind of
 | 
						|
fairly precise set up in which volunteers could participate.  I gave
 | 
						|
some thought to this idea lately, and realize there will be some
 | 
						|
touchy points.  I thought of writing to Richard Stallman to launch
 | 
						|
such a project, but feel it might be good to shake out the ideas
 | 
						|
between ourselves first.  Most probably, Linux International has
 | 
						|
some experience in the field already, or would like to orchestrate
 | 
						|
the volunteer work, maybe.  Food for thought, in any case!
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
I guess we have to set up something early, somehow, that will help
 | 
						|
many possible contributors of the same language to interlock and avoid
 | 
						|
work duplication, and further be put in contact for solving together
 | 
						|
problems particular to their tongue (in most languages, there are many
 | 
						|
difficulties peculiar to translating technical English).  My Swedish
 | 
						|
contributor acknowledged these difficulties, and I'm well aware of
 | 
						|
them for French.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
This is surely not a technical issue, but we should manage things so that the
 | 
						|
effort of locale contributors is maximally useful, despite the national
 | 
						|
team layer interface between contributors and maintainers.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
GNU needs some setup for coordinating language coordinators.
 | 
						|
Localizing evolving GNU programs will surely become a permanent
 | 
						|
and continuous activity in GNU, once started.  The setup should be
 | 
						|
minimally completed and tested before GNU <CODE>gettext</CODE> becomes an official
 | 
						|
reality.  The email address <TT>`gnu-translation@prep.ai.mit.edu'</TT>
 | 
						|
has been set up for receiving offers from volunteers and general
 | 
						|
email on these topics.  This address reaches the GNU Translation
 | 
						|
Project coordinator.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC59" HREF="gettext_toc.html#TOC59">Central Coordination</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
I also think GNU will need, sooner than it thinks, someone to set up
 | 
						|
a way to organize and coordinate these groups.  Some kind of group
 | 
						|
of groups.  My opinion is that it would be good for GNU to delegate
 | 
						|
this task to a small group of collaborating volunteers, shortly.
 | 
						|
Perhaps in <TT>`gnu.announce'</TT> a list of these national committees
 | 
						|
can be published.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
My role as coordinator would simply be to refer to Ulrich any German
 | 
						|
speaking volunteer interested in localization of GNU programs, and
 | 
						|
maybe helping national groups to initially organize, while maintaining
 | 
						|
national registries until national groups are ready to take over.
 | 
						|
In fact, the coordinator should help volunteers get in contact with
 | 
						|
one another for creating national teams, which should then select
 | 
						|
one coordinator per language, or country (regionalized language).
 | 
						|
If well done, the coordination should be useful without being an
 | 
						|
overwhelming task, for the time it takes to put delegations in place.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC60" HREF="gettext_toc.html#TOC60">National Teams</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
I suggest we look for volunteer coordinators/editors for individual
 | 
						|
languages.  These people will scan contributions of translation files
 | 
						|
for various programs, for their own languages, and will ensure high
 | 
						|
and uniform standards of diction.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
From my current experience with other people these days, those who
 | 
						|
provide localizations are very enthusiastic about the process, and are
 | 
						|
more interested in the localization process than in the program they
 | 
						|
localize, and want to do many programs, not just one.  This seems
 | 
						|
to confirm that having a coordinator/editor for each language is a
 | 
						|
good idea.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
We need to choose someone who is good at writing clear and concise
 | 
						|
prose in the language in question.  That is hard--we can't check
 | 
						|
it ourselves.  So we need to ask a few people to judge each others'
 | 
						|
writing and select the one who is best.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
I announced my prerelease to a few dozen people, and you would not
 | 
						|
believe all the discussions it generated already.  I shudder to think
 | 
						|
what will happen when this is launched, for real, officially,
 | 
						|
worldwide.  Who am I to arbitrate between two Czechoslovak users
 | 
						|
contradicting each other, for example?
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
I assume that your German is not much better than my French so that
 | 
						|
I would not be able to judge these formulations.  What I would
 | 
						|
suggest is that for each language there be a group of people who
 | 
						|
maintain the PO files and judge changes.  I suspect there will
 | 
						|
be cultural differences between how such groups of people will behave.
 | 
						|
Some will have relaxed ways, reach consensus easily, and have anyone
 | 
						|
of the group relate to the maintainers, while others will fight to
 | 
						|
death, organize heavy administrations up to national standards, and
 | 
						|
use strict channels.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
The German team is setting a good example.  Right now, they are
 | 
						|
maybe half a dozen people revising each other's translations and
 | 
						|
discussing the linguistic issues.  I do not even have all the names.
 | 
						|
Ulrich Drepper is taking care of coordinating the German team.
 | 
						|
He subscribed to all my pretest lists, so I do not even have to warn
 | 
						|
him specifically of incoming releases.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
I'm sure that it is a good idea to get teams for each language working
 | 
						|
on translations. That will make the translations better and more
 | 
						|
consistent.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H4><A NAME="SEC61" HREF="gettext_toc.html#TOC61">Sub-Cultures</A></H4>
 | 
						|
 | 
						|
<P>
 | 
						|
Taking French for example, there are a few sub-cultures around
 | 
						|
computers which developed diverging vocabularies.  Picking volunteers
 | 
						|
here and there without addressing this problem in an organized way,
 | 
						|
early in the project, might produce a distasteful mix of GNU programs,
 | 
						|
and possibly trigger endless quarrels among those who really care.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Keeping some kind of unity in the way French localization of GNU
 | 
						|
programs is achieved is a difficult (and delicate) job.  Knowing the
 | 
						|
Latin character of French people (:-), if we take this the wrong
 | 
						|
way, we could end up nowhere, or waste a lot of energy.  Maybe we
 | 
						|
should begin to address this problem seriously <EM>before</EM> GNU
 | 
						|
<CODE>gettext</CODE> becomes officially published.  And I suspect that this
 | 
						|
means soon!
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H4><A NAME="SEC62" HREF="gettext_toc.html#TOC62">Organizational Ideas</A></H4>
 | 
						|
 | 
						|
<P>
 | 
						|
I expect the next big changes after the official release.  Please note
 | 
						|
that I use the German translation of the short GPL message.  We need
 | 
						|
to set a few good examples before the localization goes out for real
 | 
						|
in GNU.  Here are a few points to discuss:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<UL>
 | 
						|
<LI>
 | 
						|
 | 
						|
Each group should have one FTP server (at least one master).
 | 
						|
 | 
						|
<LI>
 | 
						|
 | 
						|
The files on the server should reflect the latest version (of
 | 
						|
course!) and it should also contain a RCS directory with the
 | 
						|
corresponding archives (I don't have this now).
 | 
						|
 | 
						|
<LI>
 | 
						|
 | 
						|
There should also be a ChangeLog file (this is more useful than the
 | 
						|
RCS archive but can be generated automatically from the latter by
 | 
						|
Emacs).
 | 
						|
 | 
						|
<LI>
 | 
						|
 | 
						|
A <STRONG>core group</STRONG> should judge questionable changes (for now
 | 
						|
this group consists solely of me, but I ask some others occasionally;
 | 
						|
this also seems to work).
 | 
						|
 | 
						|
</UL>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H3><A NAME="SEC63" HREF="gettext_toc.html#TOC63">Mailing Lists</A></H3>
 | 
						|
 | 
						|
<P>
 | 
						|
If we get any inquiries about GNU <CODE>gettext</CODE>, send them on to:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
<TT>`gnu-translation@prep.ai.mit.edu'</TT>
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
The <TT>`*-pretest'</TT> lists are quite useful to me; maybe the idea could
 | 
						|
be generalized to all GNU packages.  But to each maintainer his/her own way!
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
We have a mechanism in place here at
 | 
						|
<TT>`gnu.ai.mit.edu'</TT> to track teams, support mailing lists for
 | 
						|
them and log members.  We have a slight preference that you use it.
 | 
						|
If this is OK with you, I can get you clued in.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Things are changing!  A few years ago, when Daniel Fekete and I
 | 
						|
asked for a mailing list for GNU localization, nested at the FSF, we
 | 
						|
were politely invited to organize it anywhere else, and so we did.
 | 
						|
For communicating with my pretesters, I later made a handful of
 | 
						|
mailing lists located at iro.umontreal.ca and administered by
 | 
						|
<CODE>majordomo</CODE>.  These lists have been <EM>very</EM> dependable
 | 
						|
so far...
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
I suspect that the German team will organize for itself a mailing list
 | 
						|
located in Germany, and so forth for other countries.  But before they
 | 
						|
organize for real, it could surely be useful to offer mailing lists
 | 
						|
located at the FSF to each national team.  So yes, please explain to me
 | 
						|
how I should proceed to create and handle them.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
We should create temporary mailing lists, one per country, to help
 | 
						|
people organize.  Temporary, because once regrouped and structured, it
 | 
						|
would be fair that the volunteers from each country bring back <EM>their</EM> list
 | 
						|
in there and manage it as they want.  My feeling is that, in the long
 | 
						|
run, each team should run its own list, from within their country.
 | 
						|
There also should be some central list to which all teams could
 | 
						|
subscribe as they see fit, as long as each team is represented in it.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC64" HREF="gettext_toc.html#TOC64">Information Flow</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
There will surely be some discussion about these messages after the
 | 
						|
packages are finally released.  If people now send you some proposals
 | 
						|
for better messages, how do you proceed?  Jim, please note that
 | 
						|
right now, as I put forward nearly a dozen localizable programs, I
 | 
						|
receive both the translations and the coordination concerns about them.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
If I put one of my things to pretest, Ulrich receives the announcement
 | 
						|
and passes it on to the German team, who make last minute revisions.
 | 
						|
Then he submits the translation files to me <EM>as the maintainer</EM>.
 | 
						|
For GNU packages I do not maintain, I would not even hear about it.
 | 
						|
This scheme could be made to work GNU-wide, I think.  For security
 | 
						|
reasons, maybe Ulrich (national coordinators, in fact) should update
 | 
						|
the central registry kept by GNU (Jim, me, or Len's recruits) once in
 | 
						|
a while.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
In December/January, I was aggressively ready to internationalize
 | 
						|
all of GNU, giving myself the duty of one small GNU package per week
 | 
						|
or so, taking many weeks or months for bigger packages.  But it does
 | 
						|
not work this way.  I first did all the things I'm responsible for.
 | 
						|
I've nothing against some missionary work on other maintainers, but
 | 
						|
I'm also losing a lot of energy over it--the same debates over again.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
And when the first localized packages are released we'll get a lot of
 | 
						|
responses about ugly translations :-).  Surely, and we need to have
 | 
						|
beforehand a fairly good idea about how to handle the information
 | 
						|
flow between the national teams and the package maintainers.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Please start saving somewhere a quick history of each PO file.  I know
 | 
						|
for sure that the file format will change, allowing for comments.
 | 
						|
It would be nice if each file had a kind of log, and references for
 | 
						|
those who want to submit comments or gripes, or otherwise contribute.
 | 
						|
I sent a proposal for a fast and flexible format, but it is not
 | 
						|
receiving acceptance yet from the GNU decision makers.  I'll tell you when I
 | 
						|
have more information about this.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H1><A NAME="SEC65" HREF="gettext_toc.html#TOC65">The Maintainer's View</A></H1>
 | 
						|
 | 
						|
<P>
 | 
						|
The maintainer of a package has many responsibilities.  One of them
 | 
						|
is ensuring that the package will install easily on many platforms,
 | 
						|
and that the magic we described earlier (see section <A HREF="gettext.html#SEC32">The User's View</A>) will work
 | 
						|
for installers and end users.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Of course, there are many possible ways by which GNU <CODE>gettext</CODE>
 | 
						|
might be integrated in a distribution, and this chapter does not cover
 | 
						|
them in all generality.  Instead, it details one possible approach
 | 
						|
which is especially adequate for many GNU distributions, because
 | 
						|
GNU <CODE>gettext</CODE> is purposely for helping the internationalization
 | 
						|
of the whole GNU project.  So, the maintainer's view presented here
 | 
						|
presumes that the package already has a <TT>`configure.in'</TT> file and
 | 
						|
uses Autoconf.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Nevertheless, GNU <CODE>gettext</CODE> may surely be useful for non-GNU
 | 
						|
packages, but the maintainers of such packages might have to show
 | 
						|
imagination and initiative in organizing their distributions so
 | 
						|
<CODE>gettext</CODE> works for them in all situations.  There are surely
 | 
						|
many, out there.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Even if <CODE>gettext</CODE> methods are now stabilizing, slight adjustments
 | 
						|
might be needed between successive <CODE>gettext</CODE> versions, so you
 | 
						|
should ideally reread this chapter in subsequent releases, looking
 | 
						|
for changes.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC66" HREF="gettext_toc.html#TOC66">Flat or Non-Flat Directory Structures</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
Some GNU packages are distributed as <CODE>tar</CODE> files which unpack
 | 
						|
in a single directory; these are said to be <STRONG>flat</STRONG> distributions.
 | 
						|
Other GNU packages have a one level hierarchy of subdirectories, using
 | 
						|
for example a subdirectory named <TT>`doc/'</TT> for the Texinfo manual and
 | 
						|
man pages, another called <TT>`lib/'</TT> for holding functions meant to
 | 
						|
replace or complement C libraries, and a subdirectory <TT>`src/'</TT> for
 | 
						|
holding the proper sources for the package.  These other distributions
 | 
						|
are said to be <STRONG>non-flat</STRONG>.
 | 
						|
 | 
						|
</P>
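<P>
As a rough illustration of the non-flat layout just described, the
following shell sketch creates an empty skeleton with the three
subdirectories named above.  The package name <TT>`hello-0.1'</TT> and the
use of a temporary location are assumptions made for this example only.
</P>

<PRE>
# Create an empty skeleton of a non-flat package.
# The package name and the location are hypothetical.
top=${TMPDIR:-/tmp}/hello-0.1
mkdir -p "$top/doc" "$top/lib" "$top/src"

# doc/  Texinfo manual and man pages
# lib/  functions meant to replace or complement C libraries
# src/  the proper sources for the package
ls "$top"
</PRE>
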
 | 
						|
<P>
 | 
						|
For now, we cannot say much about flat distributions.  A flat
 | 
						|
directory structure has the disadvantage of increasing the difficulty
 | 
						|
of updating to a new version of GNU <CODE>gettext</CODE>.  Also, if you have
 | 
						|
many PO files, this could somewhat pollute your single directory.
 | 
						|
In the GNU <CODE>gettext</CODE> distribution, the <TT>`misc/'</TT> directory
 | 
						|
contains a shell script named <TT>`combine-sh'</TT>.  That script may
 | 
						|
be used for combining all the C files of the <TT>`intl/'</TT> directory
 | 
						|
into a pair of C files (one <TT>`.c'</TT> and one <TT>`.h'</TT>).  Those two
 | 
						|
generated files would fit more easily in a flat directory structure,
 | 
						|
and you will then have to add these two files to your project.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Maybe because GNU <CODE>gettext</CODE> itself has a non-flat structure,
 | 
						|
we have more experience with this approach, and this is what will be
 | 
						|
described in the remainder of this chapter.  Some maintainers might
 | 
						|
use this as an opportunity to unflatten their package structure.
 | 
						|
Only later, once we have gained more experience adapting GNU <CODE>gettext</CODE>
 | 
						|
to flat distributions, we might add some notes about how to proceed
 | 
						|
in flat situations.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC67" HREF="gettext_toc.html#TOC67">Prerequisite Works</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
There are some works which are required for using GNU <CODE>gettext</CODE>
 | 
						|
in one of your packages.  These works have some kind of generality
 | 
						|
that escapes the point-by-point descriptions used in the remainder
 | 
						|
of this chapter.  So, we describe them here.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<UL>
 | 
						|
<LI>
 | 
						|
 | 
						|
Before attempting to use <CODE>gettextize</CODE> you should install some other packages first.
 | 
						|
Ensure that recent versions of GNU <CODE>m4</CODE>, GNU Autoconf and GNU
 | 
						|
<CODE>gettext</CODE> are already installed at your site, and if not, proceed
 | 
						|
to do this first.  If you have to install these things, beware that
 | 
						|
GNU <CODE>m4</CODE> must be fully installed before GNU Autoconf is even
 | 
						|
<EM>configured</EM>.
 | 
						|
 | 
						|
Those three packages are only needed by you, as a maintainer; the
 | 
						|
installers of your own package and end users do not really need any
 | 
						|
of GNU <CODE>m4</CODE>, GNU Autoconf or GNU <CODE>gettext</CODE> for successfully
 | 
						|
installing and running your package, with messages properly translated.
 | 
						|
But this is not completely true if you provide internationalized
 | 
						|
shell scripts within your own package: GNU <CODE>gettext</CODE> shall
 | 
						|
then be installed at the user site if the end users want to see the
 | 
						|
translation of shell script messages.
 | 
						|
 | 
						|
<LI>
 | 
						|
 | 
						|
Your package should use Autoconf and have a <TT>`configure.in'</TT> file.
 | 
						|
If it does not, you have to learn how.  The Autoconf documentation
 | 
						|
is quite well written; it is a good idea to print it and get
 | 
						|
familiar with it.
 | 
						|
 | 
						|
<LI>
 | 
						|
 | 
						|
Your C sources should have already been modified according to
 | 
						|
instructions given earlier in this manual.  See section <A HREF="gettext.html#SEC13">Preparing Program Sources</A>.
 | 
						|
 | 
						|
<LI>
 | 
						|
 | 
						|
Your <TT>`po/'</TT> directory should receive all PO files submitted to you
 | 
						|
by the translator teams, each having <TT>`<VAR>ll</VAR>.po'</TT> as a name.
 | 
						|
It is not usually easy to get translation
 | 
						|
work done before your package gets internationalized and available!
 | 
						|
Since the cycle has to start somewhere, the easiest for the maintainer
 | 
						|
is to start with absolutely no PO files, and wait until various
 | 
						|
translator teams get interested in your package, and submit PO files.
 | 
						|
 | 
						|
</UL>
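<P>
As a rough illustration of the first prerequisite, a maintainer might
check that the three tools can be found before going any further.
The sketch below is only one way such a check could look: the function
name <CODE>missing_tools</CODE> is hypothetical, and it tests only that a
command of each name is on the search path, not that the installed
version is recent enough.
</P>

<PRE>
# Print the name of every listed command that cannot be found on PATH.
# (missing_tools is a hypothetical helper, not part of GNU gettext.)
missing_tools () {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null; then
      echo "$tool"
    fi
  done
}

# m4 is listed first because it must be fully installed before
# Autoconf is even configured.
missing=$(missing_tools m4 autoconf gettext)
if test -n "$missing"; then
  echo "please install first:" $missing
fi
</PRE>
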
 | 
						|
 | 
						|
<P>
 | 
						|
It is worth adding here a few words about how the maintainer should
 | 
						|
ideally behave with PO file submissions.  As a maintainer, your
 | 
						|
role is to authenticate the origin of the submission as being the
 | 
						|
representative of the appropriate GNU translating team (forward the
 | 
						|
submission to <TT>`gnu-translation@prep.ai.mit.edu'</TT> in case of
 | 
						|
doubt), to ensure that the PO file format is not severely broken and
 | 
						|
does not prevent successful installation, and for the rest, merely
 | 
						|
to put these PO files in <TT>`po/'</TT> for distribution.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
As a maintainer, you do not have to take on your shoulders the
 | 
						|
responsibility of checking if the translations are adequate or
 | 
						|
complete, and should avoid diving into linguistic matters.  Translation
 | 
						|
teams drive themselves and are fully responsible for their linguistic
 | 
						|
choices for GNU.  Keep in mind that translator teams are <EM>not</EM>
 | 
						|
driven by maintainers.  You can help by carefully redirecting all
 | 
						|
communications and reports from users about linguistic matters to the
 | 
						|
appropriate translation team, or by explaining to users how to reach or join
 | 
						|
their team.  The simplest might be to send them the <TT>`NLS'</TT> file.
 | 
						|
 | 
						|
</P>
 | 
						|
<P>
 | 
						|
Maintainers should <EM>never ever</EM> apply PO file bug reports
 | 
						|
themselves, short-cutting translation teams.  If some translator has
 | 
						|
difficulty getting some of her points through her team, it should not be
 | 
						|
an option for her to directly negotiate translations with maintainers.
 | 
						|
Teams ought to settle their problems themselves, if any.  If you, as
 | 
						|
a maintainer, ever think there is a real problem with a team, please
 | 
						|
never try to <EM>solve</EM> a team's problem on your own.
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
 | 
						|
<H2><A NAME="SEC68" HREF="gettext_toc.html#TOC68">Invoking the <CODE>gettextize</CODE> Program</A></H2>
 | 
						|
 | 
						|
<P>
 | 
						|
Some files are consistently and identically needed in every package
 | 
						|
internationalized through GNU <CODE>gettext</CODE>.  As a matter of
 | 
						|
convenience, the <CODE>gettextize</CODE> program puts all these files right
 | 
						|
in your package.  This program has the following synopsis:
 | 
						|
 | 
						|
</P>
 | 
						|
 | 
						|
<PRE>
 | 
						|
gettextize [ <VAR>option</VAR>... ] [ <VAR>directory</VAR> ]
 | 
						|
</PRE>
 | 
						|
 | 
						|
<P>
 | 
						|
and accepts the following options:
 | 
						|
 | 
						|
</P>
 | 
						|
<DL COMPACT>
 | 
						|
 | 
						|
<DT><SAMP>`-f'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--force'</SAMP>
 | 
						|
<DD>
 | 
						|
Force replacement of files which already exist.
 | 
						|
 | 
						|
<DT><SAMP>`-h'</SAMP>
 | 
						|
<DD>
 | 
						|
<DT><SAMP>`--help'</SAMP>
 | 
						|
<DD>
 | 
						|
Display this help and exit.
 | 
						|
 | 
						|
<DT><SAMP>`--version'</SAMP>
 | 
						|
<DD>
 | 
						|
Output version information and exit.
 | 
						|
 | 
						|
</DL>
 | 
						|
 | 
						|
<P>
If <VAR>directory</VAR> is given, this is the top level directory of a
package to prepare for using GNU <CODE>gettext</CODE>.  If not given, it
is assumed that the current directory is the top level directory of
such a package.

</P>
<P>
The program <CODE>gettextize</CODE> provides the following files.  However,
no existing file will be replaced unless the option <CODE>--force</CODE>
(<CODE>-f</CODE>) is specified.

</P>

<OL>
<LI>

The <TT>`NLS'</TT> file is copied into the main directory of your package,
the one at the top level.  This file gives the main indications
about how to install and use the Native Language Support features
of your program.  You might elect to use a more recent copy of this
<TT>`NLS'</TT> file than the one provided through <CODE>gettextize</CODE>, if
you have one handy.  You may also fetch a more recent copy of file
<TT>`NLS'</TT> from most GNU archive sites.

<LI>

A <TT>`po/'</TT> directory is created to eventually hold
all translation files, but it initially contains only the file
<TT>`po/Makefile.in.in'</TT> from the GNU <CODE>gettext</CODE> distribution
(beware the double <SAMP>`.in'</SAMP> in the file name).  If the <TT>`po/'</TT>
directory already exists, it will be preserved along with the files
it contains, and only <TT>`Makefile.in.in'</TT> will be overwritten.

<LI>

An <TT>`intl/'</TT> directory is created and filled with most of the files
originally in the <TT>`intl/'</TT> directory of the GNU <CODE>gettext</CODE>
distribution.  Also, if option <CODE>--force</CODE> (<CODE>-f</CODE>) is given,
the <TT>`intl/'</TT> directory is emptied first.

</OL>

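<P>
As a rough illustration of the above, a first run might look like the
following shell session.  The package path <TT>`/src/hello-1.0'</TT> is
hypothetical; only the <CODE>gettextize</CODE> options described in this
section are used.

</P>

<PRE>
# Prepare a (hypothetical) package found in /src/hello-1.0.
cd /src/hello-1.0
gettextize

# Check what was added: the NLS file, po/Makefile.in.in and intl/.
ls -l NLS po/Makefile.in.in intl

# Later, to refresh files that already exist (this empties intl/ first):
gettextize --force
</PRE>
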
<P>
If your site supports symbolic links, <CODE>gettextize</CODE> will not
actually copy the files into your package, but establish symbolic
links instead.  This avoids duplicating the disk space needed in
all packages.  Merely using the <SAMP>`-h'</SAMP> option while creating the
<CODE>tar</CODE> archive of your distribution will replace each link with an
actual copy in the distribution archive.  To stress the point: you really
should use the <SAMP>`-h'</SAMP> option with <CODE>tar</CODE> in the <CODE>dist</CODE>
goal of your main <TT>`Makefile.in'</TT>.

</P>
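<P>
For instance, assuming a distribution directory named
<TT>`hello-1.0'</TT> (a made-up name), the archive could be created
along these lines, where <SAMP>`h'</SAMP> makes <CODE>tar</CODE> follow the
symbolic links and store the files they point to:

</P>

<PRE>
# c = create, h = follow symbolic links, z = compress with gzip, f = archive name
tar chzf hello-1.0.tar.gz hello-1.0
</PRE>
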
<P>
It is worth understanding that most new files supporting
GNU <CODE>gettext</CODE> facilities in a package go into the <TT>`intl/'</TT>
and <TT>`po/'</TT> subdirectories.  One distinction between these two
directories is that <TT>`intl/'</TT> is meant to be completely identical
in all packages using GNU <CODE>gettext</CODE>, while all newly created
files, which have to be different, go into <TT>`po/'</TT>.  There is a
common <TT>`Makefile.in.in'</TT> in <TT>`po/'</TT>, because the <TT>`po/'</TT>
directory needs its own <TT>`Makefile'</TT>, and it has been designed so
it can be identical in all packages.

</P>


<H2><A NAME="SEC69" HREF="gettext_toc.html#TOC69">Files You Must Create or Alter</A></H2>

<P>
Besides files which are automatically added through <CODE>gettextize</CODE>,
there are many files needing revision for properly interacting with
GNU <CODE>gettext</CODE>.  If you are closely following GNU standards for
Makefile engineering and auto-configuration, the adaptations should
be easier to achieve.  Here is a point by point description of the
changes needed in each.

</P>
<P>
Each file in the list below is followed by a description of
all alterations it needs.  Many examples are taken from the GNU
<CODE>gettext</CODE> 0.10 distribution itself.  You may indeed
refer to the source code of the GNU <CODE>gettext</CODE> package, as it
is intended to be a good example and master implementation for using
its own functionality.

</P>


<H3><A NAME="SEC70" HREF="gettext_toc.html#TOC70"><TT>`POTFILES'</TT> in <TT>`po/'</TT></A></H3>

<P>
The <TT>`po/'</TT> directory should receive a file named
<TT>`POTFILES.in'</TT>.  This file tells which files, among all program
sources, have marked strings needing translation.  Here is an example
of such a file:

</P>

<PRE>
# List of source files containing translatable strings.
# Copyright (C) 1995 Free Software Foundation, Inc.

# Common library files
lib/error.c
lib/getopt.c
lib/xmalloc.c

# Package source files
src/gettextp.c
src/msgfmt.c
src/xgettext.c
</PRE>

<P>
Hash-marked comments and blank lines are ignored.  All other lines
list those source files containing strings marked for translation
(see section <A HREF="gettext.html#SEC15">How Marks Appears in Sources</A>), in a notation relative to the top level
of your whole distribution, rather than the location of the
<TT>`POTFILES.in'</TT> file itself.

</P>


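<P>
A first draft of <TT>`POTFILES.in'</TT> can be produced mechanically.
The sketch below assumes that translatable strings are marked with
the <CODE>_()</CODE> or <CODE>gettext()</CODE> notation and that sources live in
<TT>`lib/'</TT> and <TT>`src/'</TT>, as in the example above; adjust both
assumptions to your package, and review the result by hand.

</P>

<PRE>
# Run from the top level directory, so that names come out relative to it.
# grep -l prints only the names of files having at least one match.
grep -l -e '_(' -e 'gettext *(' lib/*.c src/*.c | sort
</PRE>
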
<H3><A NAME="SEC71" HREF="gettext_toc.html#TOC71"><TT>`configure.in'</TT> at top level</A></H3>


<OL>
<LI>Declare the package and version.

This is done by a set of lines like these:


<PRE>
PACKAGE=gettext
VERSION=0.10
AC_DEFINE_UNQUOTED(PACKAGE, "$PACKAGE")
AC_DEFINE_UNQUOTED(VERSION, "$VERSION")
AC_SUBST(PACKAGE)
AC_SUBST(VERSION)
</PRE>

Of course, you replace <SAMP>`gettext'</SAMP> with the name of your package,
and <SAMP>`0.10'</SAMP> with its version number, exactly as they
should appear in the packaged <CODE>tar</CODE> file name of your distribution
(<TT>`gettext-0.10.tar.gz'</TT>, here).

<LI>Declare the available translations.

This is done by defining <CODE>ALL_LINGUAS</CODE> to the whitespace-separated,
quoted list of available languages, in a single line, like this:


<PRE>
ALL_LINGUAS="de fr"
</PRE>

This example means that German and French PO files are available, so
that these languages are currently supported by your package.  If you
want to further restrict, at installation time, the set of installed
languages, this should not be done by modifying <CODE>ALL_LINGUAS</CODE> in
<TT>`configure.in'</TT>, but rather by using the <CODE>LINGUAS</CODE> environment
variable (see section <A HREF="gettext.html#SEC34">Magic for Installers</A>).

<LI>Check for internationalization support.

Here is the main <CODE>m4</CODE> macro for triggering internationalization
support.  Just add this line to <TT>`configure.in'</TT>:


<PRE>
ud_GNU_GETTEXT
</PRE>

This call is purposely simple, even if it generates a lot of configure
time checking and actions.

<LI>Obtain some <TT>`libintl.h'</TT> header file.

Once you have called <CODE>ud_GNU_GETTEXT</CODE> in <TT>`configure.in'</TT>, use:


<PRE>
AC_LINK_FILES($nls_cv_header_libgt, $nls_cv_header_intl)
</PRE>

This will create one header file <TT>`libintl.h'</TT>.  The reason for
this is that some systems, using the Uniforum
message handling functions, already have a file of this name.

The <CODE>AC_LINK_FILES</CODE> call has not been integrated into the
<CODE>ud_GNU_GETTEXT</CODE> macro because there can be only one such call
in a <TT>`configure'</TT> file.  If you already use it, you will have to
<EM>merge</EM> the needed <CODE>AC_LINK_FILES</CODE> within yours, by adding
the first argument at the end of the list of your first argument,
and adding the second argument at the end of the list of your second
argument.

<LI>Have output files created.

The <CODE>AC_OUTPUT</CODE> directive, at the end of your <TT>`configure.in'</TT>
file, needs to be modified in two ways:


<PRE>
AC_OUTPUT([<VAR>existing configuration files</VAR> intl/Makefile po/Makefile.in],
[sed -e "/POTFILES =/r po/POTFILES" po/Makefile.in > po/Makefile
<VAR>existing additional actions</VAR>])
</PRE>

The modification to the first argument to <CODE>AC_OUTPUT</CODE> asks
for substitution in the <TT>`intl/'</TT> and <TT>`po/'</TT> directories.
Note the <SAMP>`.in'</SAMP> suffix used for <TT>`po/'</TT> only.  This is because
the distributed file is really <TT>`po/Makefile.in.in'</TT>.

The modification to the second argument ensures that <TT>`po/Makefile'</TT>
gets generated out of the <TT>`po/Makefile.in'</TT> just created, including
in it the <TT>`po/POTFILES'</TT> produced by <CODE>ud_GNU_GETTEXT</CODE>.
Two steps are needed because <TT>`po/POTFILES'</TT> can get lengthy in
some packages, too lengthy, in fact, to simply use an
Autoconf substituted variable, as many <CODE>sed</CODE>s cannot handle very
long lines.

</OL>


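<P>
To illustrate the <CODE>LINGUAS</CODE> remark in step 2, an installer who
only wants the German catalog might configure the package roughly as
follows.  This is a sketch which assumes a Bourne-compatible shell and
that <CODE>LINGUAS</CODE> is consulted when <TT>`configure'</TT> runs;
<CODE>ALL_LINGUAS</CODE> in <TT>`configure.in'</TT> stays untouched.

</P>

<PRE>
# Restrict the installed translations to German for this build only.
LINGUAS="de" ./configure
make
make install
</PRE>
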
<H3><A NAME="SEC72" HREF="gettext_toc.html#TOC72"><TT>`aclocal.m4'</TT> at top level</A></H3>

<P>
If you do not have an <TT>`aclocal.m4'</TT> file in your distribution,
the simplest approach is to take a copy of <TT>`aclocal.m4'</TT> from
GNU <CODE>gettext</CODE>.  But to be precise, you only need macros
<CODE>ud_LC_MESSAGES</CODE>, <CODE>ud_WITH_NLS</CODE> and <CODE>ud_GNU_GETTEXT</CODE>,
so you may use an editor and remove macros you do not need.

</P>
<P>
If you already have an <TT>`aclocal.m4'</TT> file, then you will have
to merge the said macros into your <TT>`aclocal.m4'</TT>.  Note that if
you are upgrading from a previous release of GNU <CODE>gettext</CODE>, you
should most probably <EM>replace</EM> the said macros, as they usually
change a little from one release of GNU <CODE>gettext</CODE> to the next.
Their contents may vary as we get more experience with strange systems
out there.

</P>
<P>
These macros check for the internationalization support functions
and related information.  Hopefully, once stabilized, these macros
might be integrated in the standard Autoconf set, because this
piece of <CODE>m4</CODE> code will be the same for all projects using GNU
<CODE>gettext</CODE>.

</P>


<H3><A NAME="SEC73" HREF="gettext_toc.html#TOC73"><TT>`acconfig.h'</TT> at top level</A></H3>

<P>
If you do not have an <TT>`acconfig.h'</TT> file in your distribution,
the simplest approach is to take a copy of <TT>`acconfig.h'</TT> from
GNU <CODE>gettext</CODE>.  But to be precise, you only need the
lines and comments for <CODE>ENABLE_NLS</CODE>, <CODE>HAVE_CATGETS</CODE>,
<CODE>HAVE_GETTEXT</CODE> and <CODE>HAVE_LC_MESSAGES</CODE>, so you may use
an editor and remove everything else.  If you already have an
<TT>`acconfig.h'</TT> file, then you should merge the said definitions
into your <TT>`acconfig.h'</TT>.

</P>


<H3><A NAME="SEC74" HREF="gettext_toc.html#TOC74"><TT>`Makefile.in'</TT> at top level</A></H3>

<P>
Here are a few modifications you need to make to your main, top-level
<TT>`Makefile.in'</TT> file.

</P>

<OL>
<LI>

Add the following lines near the beginning of your <TT>`Makefile.in'</TT>,
so the <SAMP>`dist:'</SAMP> goal will work properly (as explained further down):


<PRE>
PACKAGE = @PACKAGE@
VERSION = @VERSION@
</PRE>

<LI>

Add file <TT>`NLS'</TT> to the <CODE>DISTFILES</CODE> definition, so the file gets
distributed.

<LI>

Wherever you process subdirectories in your <TT>`Makefile.in'</TT>, be
sure you also process <CODE>@INTLSUB@</CODE> and <CODE>@POSUB@</CODE>, which
are replaced respectively by <SAMP>`intl'</SAMP> and <SAMP>`po'</SAMP>, or empty
when the configuration process decides these directories should
not be processed.

Here is an example of a canonical order of processing.  In this
example, we also define <CODE>SUBDIRS</CODE> in <CODE>Makefile.in</CODE> so that
it can be used later in the <SAMP>`dist:'</SAMP> goal.


<PRE>
SUBDIRS = doc lib @INTLSUB@ src @POSUB@
</PRE>

You will have to adapt this line to your own package.

<LI>

A delicate point is the <SAMP>`dist:'</SAMP> goal, as both
<TT>`intl/Makefile'</TT> and <TT>`po/Makefile'</TT> will later assume that the
proper directory has been set up from the main <TT>`Makefile'</TT>.  Here is
an example of what the <SAMP>`dist:'</SAMP> goal might look like:


<PRE>
distdir = $(PACKAGE)-$(VERSION)
dist: Makefile
	rm -fr $(distdir)
	mkdir $(distdir)
	chmod 777 $(distdir)
	for file in $(DISTFILES); do \
	  ln $$file $(distdir) 2>/dev/null || cp -p $$file $(distdir); \
	done
	for subdir in $(SUBDIRS); do \
	  mkdir $(distdir)/$$subdir || exit 1; \
	  chmod 777 $(distdir)/$$subdir; \
	  (cd $$subdir && $(MAKE) $@) || exit 1; \
	done
	tar chozf $(distdir).tar.gz $(distdir)
	rm -fr $(distdir)
</PRE>

</OL>


<H3><A NAME="SEC75" HREF="gettext_toc.html#TOC75"><TT>`Makefile.in'</TT> in <TT>`src/'</TT></A></H3>

<P>
Some of the modifications made in the main <TT>`Makefile.in'</TT> will
also be needed in the <TT>`Makefile.in'</TT> from your package sources,
which we assume here to be in the <TT>`src/'</TT> subdirectory.  Here are
all the modifications needed in <TT>`src/Makefile.in'</TT>:

</P>

<OL>
<LI>

In view of the <SAMP>`dist:'</SAMP> goal, you should have these lines near the
beginning of <TT>`src/Makefile.in'</TT>:


<PRE>
PACKAGE = @PACKAGE@
VERSION = @VERSION@
</PRE>

<LI>

If not done already, you should guarantee that <CODE>top_srcdir</CODE>
gets defined.  This will serve for <CODE>cpp</CODE> include files.  Just add
the line:


<PRE>
top_srcdir = @top_srcdir@
</PRE>

<LI>

You might also want to define <CODE>subdir</CODE> as <SAMP>`src'</SAMP>, later
allowing for almost uniform <SAMP>`dist:'</SAMP> goals in all your
<TT>`Makefile.in'</TT>.  At least, the <SAMP>`dist:'</SAMP> goal below assumes that
you used:


<PRE>
subdir = src
</PRE>

<LI>

You should ensure that the final linking will use <CODE>@INTLLIBS@</CODE> as
a library.  An easy way to achieve this is to make sure it gets into
<CODE>LIBS</CODE>, like this:


<PRE>
LIBS = @INTLLIBS@ @LIBS@
</PRE>

In most GNU packages one will find a directory <TT>`lib/'</TT> in which a
library containing some helper functions will be built.  (You need at
least the few functions which the GNU <CODE>gettext</CODE> Library itself
needs.)  However, some of the functions in <TT>`lib/'</TT> also give
messages to the user, which of course should be translated, too.  To take
care of this, it is not enough to place the support library (say
<TT>`libsupport.a'</TT>) just between the <CODE>@INTLLIBS@</CODE> and
<CODE>@LIBS@</CODE> in the above example.  Instead one has to write this:


<PRE>
LIBS = ../lib/libsupport.a @INTLLIBS@ ../lib/libsupport.a @LIBS@
</PRE>

<LI>

You should also ensure that directory <TT>`intl/'</TT> will be searched for
C preprocessor include files in all circumstances.  So, you have to
arrange for both <SAMP>`-I../intl'</SAMP> and <SAMP>`-I$(top_srcdir)/intl'</SAMP> to
be given to the C compiler.

<LI>

Your <SAMP>`dist:'</SAMP> goal has to conform with others.  Here is a
reasonable definition for it:


<PRE>
distdir = ../$(PACKAGE)-$(VERSION)/$(subdir)
dist: Makefile $(DISTFILES)
	for file in $(DISTFILES); do \
	  ln $$file $(distdir) 2>/dev/null || cp -p $$file $(distdir); \
	done
</PRE>

</OL>


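<P>
As an illustration of step 5, a single compilation in <TT>`src/'</TT>
might end up being invoked as shown below.  The file name
<TT>`hello.c'</TT> and the source tree location <TT>`/src/hello-1.0'</TT>
are made up, and in a real <TT>`Makefile.in'</TT> the options would
normally be carried by a variable rather than typed by hand.

</P>

<PRE>
# Hypothetical location of the unpacked sources.
top_srcdir=/src/hello-1.0

# Search the build tree's intl/ first, then the one in the source tree.
cc -I. -I../intl -I$top_srcdir/intl -c hello.c
</PRE>
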
<H1><A NAME="SEC76" HREF="gettext_toc.html#TOC76">Concluding Remarks</A></H1>

<P>
We would like to conclude this GNU <CODE>gettext</CODE> manual by presenting
a history of the GNU Translation Project so far.  We finally give
a few pointers for those who want to do further research or readings
about Native Language Support matters.

</P>


<H2><A NAME="SEC77" HREF="gettext_toc.html#TOC77">History of GNU <CODE>gettext</CODE></A></H2>

<P>
Internationalization concerns and algorithms have been informally
and casually discussed for years in GNU, sometimes around GNU
<CODE>libc</CODE>, maybe around the incoming <CODE>Hurd</CODE>, or otherwise
(nobody clearly remembers).  And even then, when the work started for
real, this was somewhat independently of these previous discussions.

</P>
<P>
This all began in July 1994, when Patrick D'Cruze had the idea and
initiative of internationalizing version 3.9.2 of GNU <CODE>fileutils</CODE>.
He then asked Jim Meyering, the maintainer, how to get those changes
folded into an official release.  That first draft was full of
<CODE>#ifdef</CODE>s and somewhat disconcerting, and Jim wanted to find
nicer ways.  Patrick and Jim shared some attempts and experiments
in this area.  Then, feeling that this might eventually have a deeper
impact on GNU, Jim wanted to know what the standards were, and contacted
Richard Stallman, who very quickly and verbally described an overall
design for what was meant to become <CODE>glocale</CODE>, at that time.

</P>
<P>
Jim implemented <CODE>glocale</CODE> and got a lot of exhausting feedback
from Patrick and Richard, of course, but also from Mitchum DSouza
(who wrote a <CODE>catgets</CODE>-like package), Roland McGrath, maybe David
MacKenzie, Fran&ccedil;ois Pinard, and Paul Eggert, all pushing and
pulling in various directions, not always compatible, to the extent
that after a couple of test releases, <CODE>glocale</CODE> was torn apart.

</P>
<P>
While Jim took some distance and time and became a dad for a second
time, Roland wanted to get GNU <CODE>libc</CODE> internationalized, and
got Ulrich Drepper involved in that project.  Instead of starting
from <CODE>glocale</CODE>, Ulrich rewrote something from scratch, but
more conformant to the set of guidelines that emerged out of the
<CODE>glocale</CODE> effort.  Then, Ulrich got people from the previous
forum to involve themselves in this new project, and the switch
from <CODE>glocale</CODE> to what was first named <CODE>msgutils</CODE>, renamed
<CODE>nlsutils</CODE>, and later <CODE>gettext</CODE>, became officially accepted
by Richard in May 1995 or so.

</P>
<P>
Let's summarize by saying that Ulrich Drepper wrote GNU <CODE>gettext</CODE>
in April 1995.  The first official release of the package, including
PO mode, occurred in July 1995, and was numbered 0.7.  Other people
contributed to the effort by providing a discussion forum around
Ulrich, writing little pieces of code, or testing.  These are quoted
in the <CODE>THANKS</CODE> file which comes with the GNU <CODE>gettext</CODE>
distribution.

</P>
<P>
While this was being done, Fran&ccedil;ois adapted half a dozen
GNU packages to <CODE>glocale</CODE> first, then later to <CODE>gettext</CODE>,
putting them in pretest, so providing along the way an effective
user environment for fine tuning the evolving tools.  He also took
the responsibility of organizing and coordinating the GNU Translation
Project.  After nearly a year of informal exchanges between people from
many countries, translator teams started to exist in May 1995, through
the creation and support by Patrick D'Cruze of twenty unmoderated
mailing lists for that many native languages, and two moderated
lists: one for reaching all teams at once, the other for reaching
all maintainers of internationalized packages in GNU.

</P>
<P>
Fran&ccedil;ois also wrote PO mode in June 1995 with the collaboration
of Greg McGary, as a kind of contribution to Ulrich's package.
He also gave a hand with the GNU <CODE>gettext</CODE> Texinfo manual.

</P>


<H2><A NAME="SEC78" HREF="gettext_toc.html#TOC78">Related Readings</A></H2>

<P>
Eugene H. Dorr (<TT>`dorre@well.com'</TT>) maintains an interesting
bibliography on internationalization matters, called
<CITE>Internationalization Reference List</CITE>, which is available as:

<PRE>
ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/i18n-books.txt
</PRE>

<P>
Michael Gschwind (<TT>`mike@vlsivie.tuwien.ac.at'</TT>) maintains a
Frequently Asked Questions (FAQ) list, entitled <CITE>Programming for
Internationalisation</CITE>.  This FAQ discusses writing programs which
can handle different language conventions, character sets, etc.;
and is applicable to all character set encodings, with particular
emphasis on ISO 8859-1.  It is regularly published in Usenet
groups <TT>`comp.unix.questions'</TT>, <TT>`comp.std.internat'</TT>,
<TT>`comp.software.international'</TT>, <TT>`comp.lang.c'</TT>,
<TT>`comp.windows.x'</TT>, <TT>`comp.std.c'</TT>, <TT>`comp.answers'</TT>
and <TT>`news.answers'</TT>.  The home location of this document is:

<PRE>
ftp://ftp.vlsivie.tuwien.ac.at/pub/8bit/ISO-programming
</PRE>

<P>
Patrick D'Cruze (<TT>`pdcruze@li.org'</TT>) wrote a tutorial about NLS
matters, and Jochen Hein (<TT>`Hein@student.tu-clausthal.de'</TT>) took
over the responsibility of maintaining it.  It may be found as:

<PRE>
ftp://sunsite.unc.edu/pub/Linux/utils/nls/catalogs/Incoming/...
     ...locale-tutorial-0.8.txt.gz
</PRE>

<P>
This site is mirrored in:

<PRE>
ftp://ftp.ibp.fr/pub/linux/sunsite/
</PRE>

<P>
A French version of the same tutorial should be available at:

<PRE>
ftp://ftp.ibp.fr/pub/linux/french/docs/
</PRE>

<P>
together with French translations of many Linux-related documents.

</P>
<P><HR><P>
This document was generated on 4 September 1998 using the
<A HREF="http://wwwcn.cern.ch/dci/texi2html/">texi2html</A>
translator version 1.51.</P>
</BODY>
</HTML>