20 Commits

Author SHA1 Message Date
Pavel Tyunin
45adce8561 Fix wxTextInputStream for some inputs starting with nulls 2020-10-03 19:49:46 +03:00
Pavel Tyunin
b3eff48e28 Switch to fallback earlier if the input is not valid UTF-8 prefix 2020-10-03 19:49:46 +03:00
Pavel Tyunin
bc838b4773 Do not delete and create the fallback conversion again when it fails 2020-10-03 19:49:46 +03:00
Pavel Tyunin
28823424e9 Add wxConvAuto::GetEncoding() 2020-10-03 19:10:17 +03:00
Vadim Zeitlin
04689e9727 Merge branch 'utf8-text-stream'
Really fix reading from UTF-8 text streams.

Closes #14720.

See https://github.com/wxWidgets/wxWidgets/pull/1304
2019-05-10 01:35:55 +02:00
Vadim Zeitlin
5488a1438f Globally replace vadim@wxwindows.org with vadim@wxwidgets.org
The old email address has been invalid for many years and shouldn't be used
any longer.

No real changes.
2019-04-22 14:12:05 +02:00
Vadim Zeitlin
731b3a804f Return error from wxConvAuto if not given enough bytes
Instead of falling back on Latin-1 if we fail to decode the input as
UTF-8, check if we have enough bytes for the latter and just return an
error if we don't.

This ensures that wxTextInputStream::GetChar() and similar code will
retry with a longer byte sequence, allowing wxConvAuto to be used for
decoding UTF-8 contents on the fly, which didn't work before.

See #14720.
2019-04-21 20:05:37 +02:00
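The "not enough bytes" check described in the commit above can be illustrated with a small standalone sketch. This is not the wxConvAuto code: Utf8Check, SequenceLength() and CheckUtf8Prefix() are hypothetical names invented for this example, and the validation is deliberately simplified. The point is only to show how a truncated UTF-8 sequence can be reported separately from an invalid one, so that a caller reading byte by byte can retry with more input instead of falling back to another encoding.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type for this sketch; not part of wxWidgets.
enum class Utf8Check { Valid, NeedMoreBytes, Invalid };

// Length of the sequence introduced by lead byte b, or 0 if b cannot
// start a UTF-8 sequence (continuation bytes, 0xC0/0xC1, 0xF5 and above).
static std::size_t SequenceLength(unsigned char b)
{
    if (b < 0x80) return 1;
    if (b >= 0xC2 && b <= 0xDF) return 2;
    if (b >= 0xE0 && b <= 0xEF) return 3;
    if (b >= 0xF0 && b <= 0xF4) return 4;
    return 0;
}

// Simplified check: it does not reject overlong 3/4-byte forms or
// surrogates, it only distinguishes "truncated" from "malformed".
Utf8Check CheckUtf8Prefix(const unsigned char* data, std::size_t len)
{
    assert(data != nullptr || len == 0);

    std::size_t i = 0;
    while (i < len) {
        const std::size_t n = SequenceLength(data[i]);
        if (n == 0)
            return Utf8Check::Invalid;

        for (std::size_t k = 1; k < n; ++k) {
            if (i + k >= len)
                return Utf8Check::NeedMoreBytes;  // sequence cut short
            if ((data[i + k] & 0xC0) != 0x80)
                return Utf8Check::Invalid;        // not a continuation byte
        }
        i += n;
    }
    return Utf8Check::Valid;
}
```

Under these assumptions, a lone 0xC3 byte yields NeedMoreBytes, while 0xE9 followed by an ASCII letter (typical Latin-1 text) yields Invalid, which is the case where a fallback encoding makes sense.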
Vadim Zeitlin
3f66f6a5b3 Remove all lines containing cvs/svn "$Id$" keyword.
This keyword is not expanded by Git which means it's not replaced with the
correct revision value in the releases made using git-based scripts and it's
confusing to have lines with unexpanded "$Id$" in the released files. As
expanding them with Git is not that simple (it could be done with git archive
and export-subst attribute) and there are not many benefits in having them in
the first place, just remove all these lines.

If nothing else, this will make an eventual transition to Git simpler.

Closes #14487.

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@74602 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2013-07-26 16:02:46 +00:00
Vadim Zeitlin
64b91e2d40 Add wxConvAuto::GetBOMChars() helper.
Closes #13620.

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@69675 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2011-11-05 11:23:41 +00:00
Vadim Zeitlin
038809c2f6 Make BOM-detection code in wxConvAuto public.
Export GetBOM() and DetectBOM() functions.

Also rename BOMType enum elements to use "wx" prefix now that they're public.

Closes #13599.

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@69571 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2011-10-27 22:48:54 +00:00
Paul Cornett
dc771347d0 remove unneeded #includes
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@66657 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2011-01-08 18:05:33 +00:00
Vadim Zeitlin
9334ad1727 Correct wxConvAuto::ToWChar() behaviour with wxNO_LEN input size.
We didn't correctly handle the case when the length of the input buffer was
not specified, so wxConvAuto::DetectBOM() could read beyond the end of the
input. Moreover, the unit test actually relied on this as it didn't pass the
correct length for the literal strings with embedded NULs. This somehow worked
with MSVC but failed with MinGW (see #10713).

Correct the code to handle wxNO_LEN case correctly and fix the unit test to
pass the correct lengths.

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@65739 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2010-10-03 17:15:18 +00:00
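Why the lengths matter for literals with embedded NULs can be shown with a short standalone sketch. The names kNoLen, EffectiveLength() and kUtf32beText are hypothetical and only stand in for the idea of an "unspecified length" sentinel; this is not the wxConvAuto implementation.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical sentinel meaning "length not specified, treat the input as
// NUL-terminated"; it only plays the role of wxNO_LEN in this sketch.
constexpr std::size_t kNoLen = static_cast<std::size_t>(-1);

// Resolve the number of bytes that may safely be read from src.
std::size_t EffectiveLength(const char* src, std::size_t srcLen)
{
    return srcLen == kNoLen ? std::strlen(src) : srcLen;
}

// UTF-32BE BOM (00 00 FE FF) followed by 'a' encoded as UTF-32BE.
// The data starts with NUL bytes, so strlen() sees an empty string.
const char kUtf32beText[] = "\x00\x00\xFE\xFF\x00\x00\x00\x61";
```

With this literal, sizeof(kUtf32beText) - 1 is 8, while EffectiveLength(kUtf32beText, kNoLen) is 0. A test that wants all eight bytes examined therefore has to pass the length explicitly, and code given the sentinel must not look past the first NUL.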
Václav Slavík
8d94819c43 Remove wxUSE_WCHAR_T checks.
wxWidgets has required wchar_t for some time now; wx/chartype.h has a check
to fail compilation without it. Simplify code by removing now-dead code
for the !wxUSE_WCHAR_T case.

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@63991 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2010-04-16 10:43:18 +00:00
Vadim Zeitlin
823e82e260 Correct UTF-32BE BOM detection in wxConvAuto.
On the fly detection of the BOM was wrongly implemented for UTF-32BE in
r63064 and returned BOM_None for it if we tried to read exactly 2 bytes.

Fix this by returning BOM_Unknown if the first 2 bytes are NUL.

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@63246 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2010-01-24 10:13:45 +00:00
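The distinction between "no BOM" and "cannot tell yet" that this fix relies on can be sketched as follows. This is an illustrative standalone function, not the wxConvAuto source: the Bom enum and DetectBom() are hypothetical names, with Bom::None and Bom::Unknown playing the roles of BOM_None and BOM_Unknown from the commit message. The byte patterns are the standard Unicode BOMs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enum for this sketch. Unknown means "need more bytes".
enum class Bom { Unknown, None, Utf8, Utf16BE, Utf16LE, Utf32BE, Utf32LE };

Bom DetectBom(const unsigned char* p, std::size_t len)
{
    assert(p != nullptr || len == 0);
    if (len == 0)
        return Bom::Unknown;

    switch (p[0]) {
    case 0x00:  // candidate UTF-32BE BOM: 00 00 FE FF
        if (len >= 2 && p[1] != 0x00) return Bom::None;
        if (len >= 3 && p[2] != 0xFE) return Bom::None;
        if (len >= 4) return p[3] == 0xFF ? Bom::Utf32BE : Bom::None;
        return Bom::Unknown;  // e.g. exactly two NUL bytes seen so far

    case 0xEF:  // candidate UTF-8 BOM: EF BB BF
        if (len >= 2 && p[1] != 0xBB) return Bom::None;
        if (len >= 3) return p[2] == 0xBF ? Bom::Utf8 : Bom::None;
        return Bom::Unknown;

    case 0xFE:  // candidate UTF-16BE BOM: FE FF
        if (len >= 2) return p[1] == 0xFF ? Bom::Utf16BE : Bom::None;
        return Bom::Unknown;

    case 0xFF:  // FF FE is UTF-16LE, FF FE 00 00 is UTF-32LE
        if (len >= 2 && p[1] != 0xFE) return Bom::None;
        if (len >= 3 && p[2] != 0x00) return Bom::Utf16LE;
        if (len < 4) return Bom::Unknown;  // still ambiguous
        return p[3] == 0x00 ? Bom::Utf32LE : Bom::Utf16LE;

    default:
        return Bom::None;
    }
}
```

In this sketch two leading NUL bytes give Bom::Unknown rather than Bom::None, which is the behaviour the commit describes. One limitation of the simplification: a stream consisting of only FF FE stays Unknown, so a real caller would need to resolve that case at end of input.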
Vadim Zeitlin
4ca973967a Correct bug with returning 0 for non-empty input from wxConvAuto::ToWChar().
Since the changes of r63064 we could return 0 when asked to convert a
non-empty buffer containing only a BOM. This confused the logic in
wxTextInputStream::NextChar() and was generally unexpected so now return
wxCONV_FAILED in this case.

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@63245 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2010-01-24 10:13:40 +00:00
Vadim Zeitlin
4cb0e8d05c Fix wxConvAuto behaviour when it is used by wxTextInputStream.
wxConvAuto implicitly supposed that the chunk of data passed to it for
translation was big enough to allow it to at least detect the BOM from it.
However, this isn't necessarily the case, and never is with wxTextInputStream,
which reads the bytes one by one.

Fix this by waiting until we have enough data to be able to detect the BOM.
This still doesn't fix the problem with streams without BOM and the
corresponding unit test still fails -- it will need to be fixed at the level
of wxTextInputStream itself later, but correctly handling the cases when a BOM
is present is already better than before.

See #11570.

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@63064 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2010-01-04 12:22:49 +00:00
Vadim Zeitlin
9a83f86094 Globally replace _T() with wxT().
Standardize on using a single macro across all wxWidgets sources and solve the name clash with Sun CC standard headers (see #10660).

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@61508 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2009-07-23 20:30:22 +00:00
Vadim Zeitlin
5c33522fca replace wx_{const,static,reinterpret}_cast with their standard C++ equivalents
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@56644 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2008-11-02 02:39:52 +00:00
Vadim Zeitlin
01a9232b5e use fallback encoding in wxConvAuto when input is not in UTF-8
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@48463 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2007-08-30 17:54:28 +00:00
Vadim Zeitlin
830f8f11bc 1. changed all "wxMBConv& conv" parameters to "const wxMBConv&"
2. this allows using wxConvAuto() instead of wxConvUTF8 as default value
   for this parameter in the classes which read text from the file: wxConvAuto
   automatically recognizes the BOM at the start of file and uses the correct
   conversion
3. don't use Windows for UTF-7 conversions as there is no way to make it
   fail on invalid UTF-7 strings; use our own wxMBConvUtf7 instead


git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@38570 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
2006-04-05 14:37:47 +00:00
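The language rule behind points 1 and 2 of the commit above is that a temporary object can bind to a const reference but not to a non-const one, so only a "const T&" parameter can take a freshly constructed object as its default value. A minimal standalone sketch, using hypothetical Converter classes rather than the real wxMBConv hierarchy:

```cpp
#include <string>

// Hypothetical stand-ins for a converter base class and two conversions.
struct Converter {
    virtual ~Converter() = default;
    virtual std::string Name() const = 0;
};

struct Utf8Converter : Converter {
    std::string Name() const override { return "utf-8"; }
};

struct AutoConverter : Converter {
    std::string Name() const override { return "auto"; }
};

// With "Converter& conv" this default argument would not compile, because
// a temporary cannot bind to a non-const lvalue reference. With
// "const Converter&" the temporary lives until the call completes.
std::string DescribeReader(const Converter& conv = AutoConverter())
{
    return "reading with " + conv.Name();
}
```

Calling DescribeReader() uses the temporary AutoConverter, while DescribeReader(Utf8Converter()) passes an explicit conversion; both rely on the parameter being a const reference.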