Described in the comments and documented the semantics of the parameters and return values of wxMBConv methods and tried to make them more consistent. The only (intentional) backwards incompatible change is that cMB2WC/cWC2MB now return the length of the converted string in the outLen parameter and not length+1.

Added wxMBConv::GetMBNul() and use it instead of supposing that all multibyte strings are always terminated with a single NUL, which is wrong for UTF-16/32.

Using GetMBNul(), completely rewrote cMB2WC/cWC2MB() to accept a string of the specified length, whether it is NUL-terminated or not. This means that they don't overwrite the provided buffer any more and convert the entire string in all cases.

Fixed bug in wxMBConvUTF16::WC2MB() which didn't NUL-terminate the string properly even if there was enough space.

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@38498 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2006-04-01 12:43:03 +00:00
parent 7e0a8463eb
commit eec47cc6c4
5 changed files with 455 additions and 257 deletions
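
The length convention this commit settles on (the length excludes the trailing NUL, the buffer must have room for it) can be illustrated without wxWidgets. The sketch below is not code from this commit: it calls the standard mbstowcs() directly, the function the documentation says wxConvLibc relies on, and the helper names ConvertToWide, LengthOf and SizeInBytes are hypothetical. It assumes the common POSIX and MSVC behaviour where a null output buffer makes mbstowcs() return only the required length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper, not part of wxWidgets: converts a NUL-terminated
// multibyte string using the two-call pattern described for MB2WC().
// Returns the converted characters followed by the terminating NUL;
// an empty vector signals a conversion error.
std::vector<wchar_t> ConvertToWide(const char* in)
{
    assert(in != nullptr); // the input string cannot be NULL

    // First call: with a null output buffer only the length is computed,
    // i.e. the number of wide characters NOT counting the trailing NUL.
    const std::size_t len = std::mbstowcs(nullptr, in, 0);
    if (len == static_cast<std::size_t>(-1))
        return {};

    // Second call: the buffer holds length + 1 elements so that the
    // result can be NUL-terminated.
    std::vector<wchar_t> out(len + 1);
    std::mbstowcs(out.data(), in, len + 1);
    return out;
}

// Length: number of characters, excluding the trailing NUL.
// Expects a non-empty result of ConvertToWide().
std::size_t LengthOf(const std::vector<wchar_t>& s)
{
    return s.size() - 1;
}

// Size: total number of bytes, including the trailing NUL.
std::size_t SizeInBytes(const std::vector<wchar_t>& s)
{
    return s.size() * sizeof(wchar_t);
}
```

For the input "foo" in the default "C" locale, LengthOf() gives 3 and SizeInBytes() gives 4 * sizeof(wchar_t), which is 8 or 16 bytes depending on the platform, matching the example in the documentation text added by this commit.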
--- a/docs/latex/wx/mbconv.tex
+++ b/docs/latex/wx/mbconv.tex
@@ -1,14 +1,38 @@
-%
-% automatically generated by HelpGen from
-% ../include/wx/strconv.h at 25/Mar/00 10:20:56
-%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%% Name:        mbconv.tex
+%% Purpose:     wxMBConv documentation
+%% Author:      Ove Kaaven, Vadim Zeitlin
+%% Created:     2000-03-25
+%% RCS-ID:      $Id$
+%% Copyright:   (c) 2000 Ove Kaaven
+%%              (c) 2003-2006 Vadim Zeitlin
+%% License:     wxWindows license
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+

 \section{\class{wxMBConv}}\label{wxmbconv}

 This class is the base class of a hierarchy of classes capable of converting
-text strings between multibyte (SBCS or DBCS) encodings and Unicode. It is itself
-a wrapper around the standard libc mbstowcs() and wcstombs() routines, and has
-one predefined instance, {\bf wxConvLibc}.
+text strings between multibyte (SBCS or DBCS) encodings and Unicode.
+
+In the documentation for this and related classes please notice that the
+\emph{length} of the string refers to the number of characters in the string
+not counting the terminating \NUL, if any, while the \emph{size} of the string
+is the total number of bytes in the string, including any trailing {\NUL}s.
+Thus, the length of the wide character string \texttt{L"foo"} is $3$ while its size can
+be either $8$ or $16$ depending on whether \texttt{wchar\_t} is $2$ bytes (as
+under Windows) or $4$ (Unix).
+
+\wxheading{Global variables}
+
+There are several predefined instances of this class:
+\begin{twocollist}
+\twocolitem{\textbf{wxConvLibc}}{Uses the standard ANSI C \texttt{mbstowcs()} and
+\texttt{wcstombs()} functions to perform the conversions; thus depends on the
+current locale.}
+\twocolitem{\textbf{wxConvFile}}{The appropriate conversion for file names;
+depends on the system.}
+\end{twocollist}

 \wxheading{Derived from}

@@ -35,30 +59,31 @@ Constructor.

 \membersection{wxMBConv::MB2WC}\label{wxmbconvmb2wc}

-\constfunc{virtual size\_t}{MB2WC}{\param{wchar\_t *}{outputBuf}, \param{const char *}{psz}, \param{size\_t }{outputSize}}
+\constfunc{virtual size\_t}{MB2WC}{\param{wchar\_t *}{out}, \param{const char *}{in}, \param{size\_t }{outLen}}

-Converts from a string {\it psz} in multibyte encoding to Unicode putting the
-output into the buffer {\it outputBuf} of the maximum size {\it outputSize} (in wide
-characters, not bytes). If {\it outputBuf} is {\tt NULL}, only the length of the
-string which would result from the conversion is calculated and returned.
-Note that this is the length and not size, i.e. the returned value does 
-{\bf not} include the trailing NUL. But when the function is called with a
-non-{\tt NULL} {\it outputBuf}, the {\it outputSize} parameter should be the size of the buffer
-and so it {\bf should} take into account the trailing NUL.
+Converts from a string \arg{in} in multibyte encoding to Unicode putting up to 
+\arg{outLen} characters into the buffer \arg{out}.
+
+If \arg{out} is \NULL, only the length of the string which would result from
+the conversion is calculated and returned. Note that this is the length and not
+size, i.e. the returned value does \emph{not} include the trailing \NUL. But
+when the function is called with a non-\NULL \arg{out} buffer, the \arg{outLen} 
+parameter should be one more than that length to leave room for the terminating \NUL.

 \wxheading{Parameters}

-\docparam{outputBuf}{the output buffer, may be {\tt NULL} if the caller is only
+\docparam{out}{The output buffer, may be \NULL if the caller is only
 interested in the length of the resulting string}

-\docparam{psz}{the {\tt NUL}-terminated input string, cannot be {\tt NULL}}
+\docparam{in}{The \NUL-terminated input string, cannot be \NULL}

-\docparam{outputSize}{the size of the output buffer (in wide characters, {\bf including} the
-NUL) , ignored if {\it outputBuf} is {\tt NULL}}
+\docparam{outLen}{The length of the output buffer \emph{including} the trailing
+\NUL, ignored if \arg{out} is \NULL}

 \wxheading{Return value}

-The length of the converted string (in wide characters, {\bf excluding} the NUL)
+The length of the converted string \emph{excluding} the trailing {\NUL}.
+

 \membersection{wxMBConv::WC2MB}\label{wxmbconvwc2mb}

@@ -68,25 +93,54 @@ Converts from Unicode to multibyte encoding. The semantics of this function
 (including the return value meaning) is the same as for 
 \helpref{MB2WC}{wxmbconvmb2wc}.

-Notice that when the function is called with a non-{\tt NULL} buffer, the 
-{\it n} parameter should be the size of the buffer and so it {\bf should} take
+Notice that when the function is called with a non-\NULL buffer, the 
+{\it n} parameter should be the size of the buffer and so it \emph{should} take
 into account the trailing NUL, which might take two or four bytes for some
-encodings (UTF-16 and UTF-32).
+encodings (UTF-16 and UTF-32) and not one.
+

 \membersection{wxMBConv::cMB2WC}\label{wxmbconvcmb2wc}

-\constfunc{const wxWCharBuffer}{cMB2WC}{\param{const char* }{psz}}
+\constfunc{const wxWCharBuffer}{cMB2WC}{\param{const char *}{in}}
+
+\constfunc{const wxWCharBuffer}{cMB2WC}{\param{const char *}{in}, \param{size\_t }{inLen}, \param{size\_t }{*outLen}}
+
+Converts from multibyte encoding to Unicode by calling 
+\helpref{MB2WC}{wxmbconvmb2wc}, allocating a temporary wxWCharBuffer to hold
+the result.
+
+The first overload takes a \NUL-terminated input string. The second one takes a
+string of exactly the specified length, which may or may not include the
+trailing {\NUL}s. If the string is not \NUL-terminated, a temporary 
+\NUL-terminated copy of it suitable for passing to \helpref{MB2WC}{wxmbconvmb2wc} 
+is made, so it is more efficient to ensure that the string does have the
+appropriate number of \NUL bytes (which is usually $1$ but may be $2$ or $4$
+for UTF-16 or UTF-32), especially for long strings.
+
+If \arg{outLen} is not \NULL, it receives the length of the converted
+string.

-Converts from multibyte encoding to Unicode by calling MB2WC,
-allocating a temporary wxWCharBuffer to hold the result.

 \membersection{wxMBConv::cWC2MB}\label{wxmbconvcwc2mb}

-\constfunc{const wxCharBuffer}{cWC2MB}{\param{const wchar\_t* }{psz}}
+\constfunc{const wxCharBuffer}{cWC2MB}{\param{const wchar\_t* }{in}}
+
+\constfunc{const wxCharBuffer}{cWC2MB}{\param{const wchar\_t* }{in}, \param{size\_t }{inLen}, \param{size\_t }{*outLen}}

 Converts from Unicode to multibyte encoding by calling WC2MB,
 allocating a temporary wxCharBuffer to hold the result.

+The second overload of this function allows converting a string of the given
+length \arg{inLen}, whether it is \NUL-terminated or not (for wide character
+strings, unlike for the multibyte ones, a single \NUL is always enough).
+But notice that just as with \helpref{cMB2WC}{wxmbconvcmb2wc}, it is more
+efficient to pass an already terminated string to this function as otherwise a
+copy is made internally.
+
+If \arg{outLen} is not \NULL, it receives the length of the converted
+string.
+
+
 \membersection{wxMBConv::cMB2WX}\label{wxmbconvcmb2wx}

 \constfunc{const char*}{cMB2WX}{\param{const char* }{psz}}
@@ -99,6 +153,7 @@ it returns the parameter unaltered. If wxChar is wchar\_t, it returns the
 result in a wxWCharBuffer. The macro wxMB2WXbuf is defined as the correct
 return type (without const).

+
 \membersection{wxMBConv::cWX2MB}\label{wxmbconvcwx2mb}

 \constfunc{const char*}{cWX2MB}{\param{const wxChar* }{psz}}
@@ -110,6 +165,7 @@ it returns the parameter unaltered. If wxChar is wchar\_t, it returns the
 result in a wxCharBuffer. The macro wxWX2MBbuf is defined as the correct
 return type (without const).

+
 \membersection{wxMBConv::cWC2WX}\label{wxmbconvcwc2wx}

 \constfunc{const wchar\_t*}{cWC2WX}{\param{const wchar\_t* }{psz}}
@@ -121,6 +177,7 @@ it returns the parameter unaltered. If wxChar is char, it returns the
 result in a wxCharBuffer. The macro wxWC2WXbuf is defined as the correct
 return type (without const).

+
 \membersection{wxMBConv::cWX2WC}\label{wxmbconvcwx2wc}

 \constfunc{const wchar\_t*}{cWX2WC}{\param{const wxChar* }{psz}}