Instead of falling back on Latin-1 if we fail to decode the input as
UTF-8, check if we have enough bytes for the latter and just return an
error if we don't.
This ensures that wxTextInputStream::GetChar() and similar code will
retry with a longer byte sequence, allowing wxConvAuto to be used for
decoding UTF-8 contents on the fly, which didn't work before.
See #14720.
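The "enough bytes" check can be sketched in standalone C++ as below. This is only an illustration, not the actual wxConvAuto code: Utf8Status, Utf8SeqLen() and CheckUtf8() are hypothetical names, and overlong encodings and surrogates are deliberately not rejected to keep the sketch short.

```cpp
#include <cstddef>

// Hypothetical result type, not part of wxWidgets.
enum class Utf8Status { Complete, NeedMore, Invalid };

static bool IsContinuation(char c)
{
    return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}

// Length of the sequence announced by a lead byte, 0 if it can't start one.
static std::size_t Utf8SeqLen(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0; // stray continuation byte or invalid lead byte
}

// Distinguish "definitely not UTF-8" from "UTF-8 so far, but truncated".
Utf8Status CheckUtf8(const char* src, std::size_t len)
{
    std::size_t i = 0;
    while (i < len)
    {
        const std::size_t need = Utf8SeqLen(static_cast<unsigned char>(src[i]));
        if (need == 0)
            return Utf8Status::Invalid;

        // Bytes of this sequence that are actually available.
        const std::size_t avail = (i + need > len) ? len - i : need;
        for (std::size_t k = 1; k < avail; ++k)
        {
            if (!IsContinuation(src[i + k]))
                return Utf8Status::Invalid;
        }

        if (avail < need)
            return Utf8Status::NeedMore; // caller should retry with more bytes

        i += need;
    }
    return Utf8Status::Complete;
}
```

In this sketch only the Invalid result would justify a fallback to another encoding; NeedMore corresponds to returning an error so that the caller retries with a longer byte sequence.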
The "$Id$" keyword is not expanded by Git, which means it's not replaced with
the correct revision value in the releases made using Git-based scripts, and
it's confusing to have lines with an unexpanded "$Id$" in the released files.
As expanding it with Git is not that simple (it could be done with git archive
and the export-subst attribute) and there is little benefit in having these
lines in the first place, just remove all of them.
If nothing else, this will make an eventual transition to Git simpler.
Closes #14487.
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@74602 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
We didn't correctly handle the case when the length of the input buffer was
not specified, so wxConvAuto::DetectBOM() could read beyond the end of the
input. Moreover, the unit test actually relied on this as it didn't pass the
correct length for the literal strings with embedded NULs. This somehow worked
with MSVC but failed with MinGW (see #10713).
Correct the code to handle the wxNO_LEN case correctly and fix the unit test to
pass the correct lengths.
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@65739 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
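A minimal standalone sketch of the two halves of this fix, under the assumption that an unspecified length means "NUL-terminated": kNoLen stands in for wxNO_LEN, and EffectiveLength(), StartsWithUtf8Bom() and LiteralLength() are hypothetical helpers, not wxWidgets functions.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for wxNO_LEN: "length not specified".
constexpr std::size_t kNoLen = static_cast<std::size_t>(-1);

// Resolve the sentinel before looking at any bytes, so that nothing is
// ever read past the terminating NUL of the input.
std::size_t EffectiveLength(const char* src, std::size_t len)
{
    return len == kNoLen ? std::strlen(src) : len;
}

// Example of a check that only touches bytes known to exist.
bool StartsWithUtf8Bom(const char* src, std::size_t len)
{
    static const unsigned char bom[] = { 0xEF, 0xBB, 0xBF };
    len = EffectiveLength(src, len);
    return len >= sizeof(bom) && std::memcmp(src, bom, sizeof(bom)) == 0;
}

// Length of a string literal including embedded NULs (but not the
// terminating one), which strlen() can't provide.
template <std::size_t N>
constexpr std::size_t LiteralLength(const char (&)[N])
{
    return N - 1;
}
```

For a literal such as "\0\0\xFE\xFF", strlen() reports 0, so a test has to pass the explicit length (4 here) rather than rely on the "no length" default.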
wxWidgets has required wchar_t for some time now; wx/chartype.h has a check
to fail compilation without it. Simplify code by removing now-dead code
for the !wxUSE_WCHAR_T case.
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@63991 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
On-the-fly detection of the BOM was wrongly implemented for UTF-32BE in
r63064 and returned BOM_None for it if we tried to read exactly 2 bytes.
Fix this by returning BOM_Unknown if the first 2 bytes are NUL.
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@63246 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
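The distinction between "no BOM" and "can't tell yet" can be illustrated with the standalone sketch below. It is not the wxConvAuto::DetectBOM() implementation: the Bom enum, the signature table and DetectBom() are assumptions made for the example, and src is assumed to be a valid (non-null) pointer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical result type mirroring the BOM_Unknown/BOM_None idea.
enum class Bom { Unknown, None, UTF32BE, UTF32LE, UTF16BE, UTF16LE, UTF8 };

struct BomSig
{
    Bom kind;
    unsigned char bytes[4];
    std::size_t len;
};

// Longer signatures come first so that UTF-32LE (FF FE 00 00) is
// preferred over UTF-16LE (FF FE), which is a prefix of it.
static const BomSig kSigs[] =
{
    { Bom::UTF32BE, { 0x00, 0x00, 0xFE, 0xFF }, 4 },
    { Bom::UTF32LE, { 0xFF, 0xFE, 0x00, 0x00 }, 4 },
    { Bom::UTF8,    { 0xEF, 0xBB, 0xBF, 0x00 }, 3 },
    { Bom::UTF16BE, { 0xFE, 0xFF, 0x00, 0x00 }, 2 },
    { Bom::UTF16LE, { 0xFF, 0xFE, 0x00, 0x00 }, 2 },
};

Bom DetectBom(const char* src, std::size_t len)
{
    // If the input is a proper prefix of some signature, more bytes are
    // needed before anything can be decided (this covers two NUL bytes,
    // which may be the start of a UTF-32BE BOM).
    for (const BomSig& s : kSigs)
    {
        if (len < s.len && std::memcmp(src, s.bytes, len) == 0)
            return Bom::Unknown;
    }

    // Otherwise report the first signature that matches completely.
    for (const BomSig& s : kSigs)
    {
        if (len >= s.len && std::memcmp(src, s.bytes, s.len) == 0)
            return s.kind;
    }

    return Bom::None;
}
```

Note that in this sketch FF FE followed by fewer than two more bytes is also reported as Unknown, because it could still turn out to be either UTF-16LE or UTF-32LE.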
Since the changes of r63064 we could return 0 when asked to convert a
non-empty buffer containing only a BOM. This confused the logic in
wxTextInputStream::NextChar() and was generally unexpected so now return
wxCONV_FAILED in this case.
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@63245 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
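To show why a "0 characters converted" result is awkward for a caller feeding bytes one at a time, here is a simplified standalone sketch. It only understands an optional UTF-8 BOM followed by bytes that are passed through unchanged; kConvFailed stands in for wxCONV_FAILED, and ToChars() and NextChar() are hypothetical functions, not the wxWidgets ones.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for wxCONV_FAILED.
constexpr std::size_t kConvFailed = static_cast<std::size_t>(-1);

// Strip an optional UTF-8 BOM and copy the remaining bytes unchanged.
// Returns the number of characters produced, or kConvFailed if the
// non-empty input consists only of a BOM (or the beginning of one).
std::size_t ToChars(const std::string& in, std::string& out)
{
    static const std::string bom = "\xEF\xBB\xBF";

    out.clear();
    if (in.empty())
        return 0;

    // Nothing but (part of) a BOM so far: report failure, not 0.
    if (in.size() <= bom.size() && bom.compare(0, in.size(), in) == 0)
        return kConvFailed;

    const std::size_t start = in.compare(0, bom.size(), bom) == 0 ? bom.size() : 0;
    out.assign(in, start, std::string::npos);
    return out.size();
}

// Read a single character, adding one byte at a time until the
// conversion succeeds, similar in spirit to a text input stream.
bool NextChar(const std::string& stream, std::size_t& pos, char& ch)
{
    std::string pending, decoded;
    while (pos < stream.size())
    {
        pending += stream[pos++];
        if (ToChars(pending, decoded) != kConvFailed)
        {
            ch = decoded[0];
            return true;
        }
    }
    return false; // end of input without a complete character
}
```

With the failure value the loop has a single condition to test; if ToChars() returned 0 for a BOM-only buffer, the caller would see a "successful" conversion that produced no character.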
wxConvAuto implicitly supposed that the chunk of data passed to it for
translation was big enough to allow it to at least detect the BOM from it.
However this isn't necessarily the case and never is with wxTextInputStream
which reads the bytes one by one.
Fix this by waiting until we have enough data to be able to detect the BOM.
This still doesn't fix the problem with streams without BOM and the
corresponding unit test still fails -- it will need to be fixed at the level
of wxTextInputStream itself later, but correctly handling the cases when a BOM
is present is already better than before.
See #11570.
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@63064 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2. this allows using wxConvAuto() instead of wxConvUTF8 as the default value
for this parameter in the classes which read text from a file: wxConvAuto
automatically recognizes the BOM at the start of the file and uses the correct
conversion
3. don't use Windows for UTF-7 conversions as there is no way to make it
fail on invalid UTF-7 strings; use our own wxMBConvUtf7 instead
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@38570 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775