Compare commits

73 Commits
2.6 ... master

Author SHA1 Message Date
37196ab4f5 Update submodule URLs 2025-07-04 12:13:05 +02:00
e96a627c55 MSICA: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2025-03-13 14:16:27 +01:00
7f2c209a06 MSI: Squash INSTALLLEVEL 1,2,3 → 1
Our default used to be INSTALLLEVEL 3, but the Microsoft default is 1. By
overriding the Microsoft default in the Property table, we also override any
INSTALLLEVEL the user might have specified on the command line.

Since our installations no longer use "Minimal/Typical/Full", we may
squash "Minimal" and "Typical" into Minimal.

Signed-off-by: Simon Rozman <simon@rozman.si>
2025-03-13 14:15:36 +01:00
b1a3eb23c7 Updater: Move Git remote 2024-11-29 16:20:43 +01:00
35b8b389ff MSICA: Move Git remote 2024-11-29 16:10:15 +01:00
d8493554cd stdex: Move Git remote 2024-11-29 16:01:59 +01:00
a40e730bf0 wxExtend: Move Git remote 2024-11-29 15:57:45 +01:00
056b1c3087 MSIBuild: Move Git remote 2024-11-29 15:55:12 +01:00
7127b8ea31 WinStd: Move Git remote 2024-11-29 15:50:21 +01:00
82906899de Make mapping reusable
Signed-off-by: Simon Rozman <simon@rozman.si>
2024-04-25 15:16:15 +02:00
566d40bd05 Update submodules
Signed-off-by: Simon Rozman <simon@rozman.si>
2024-04-25 14:52:03 +02:00
6a8cd1ec80 Preset version to 2.7.1
Signed-off-by: Simon Rozman <simon@rozman.si>
2024-04-25 14:51:30 +02:00
b0db806f5e Update submodules
Signed-off-by: Simon Rozman <simon@rozman.si>
2024-03-11 15:56:28 +01:00
439dcb35d1 stdex: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-11-08 18:49:35 +01:00
ad07539cb6 Fix to compile for Linux
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-11-08 13:50:01 +01:00
a7c1481f87 Merge branch 'master' of https://github.com/Amebis/ZRCola
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-11-08 12:22:11 +01:00
c7bc2d0aa6 Update submodules
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-11-08 12:03:15 +01:00
2bbad80235 Set version to 2.7
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-11-07 19:27:30 +01:00
ffe11b17b5 Update font and database
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-11-07 19:27:30 +01:00
0daae5af37 Update submodules
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-11-07 19:27:30 +01:00
2f1f6a6c83 stdex: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-09-08 11:41:30 +02:00
f8393e3d77 MSICA: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-09-08 11:38:40 +02:00
40c4d65669 WinStd: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-09-08 11:37:37 +02:00
7d866b183b Update submodules
Signed-off-by: Simon Rozman <simon@rozman.si>
2023-03-15 22:19:56 +01:00
b13f77ce95 ZRColaWS: Stop escaping UTF-8 characters in JSON
JSON is always UTF-8 and there is absolutely no need to escape all non-ASCII
characters in output strings.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-12-14 15:18:14 +01:00
cb324389e4 ZRColaWS: Cleanup
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-12-14 15:09:40 +01:00
2de62b1636 WinStd: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-12-14 15:04:25 +01:00
afb137edee Explicitly clear reused std::vector and u16string after moved from
MSVC C26800 warned us std::vector and std::string are not guaranteed to
be cleared after being moved from in all standard C++ implementations.

As we reuse those objects and rely on them being cleared, do an explicit
clear. We could use one-time-use objects and add scopes instead, but that
makes the code ugly.

Reference: https://stackoverflow.com/a/17735913/2071884
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-10-28 13:45:51 +02:00
d4fdd62916 MSICA, WinStd: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-10-28 08:21:10 +02:00
ac63e5a957 Makefile: Move GenRSAKeypair to platform-independent place
GenRSAKeypair is platform independent. While it lived in MakefilePlat.mak, it
was invoked once per platform. That was not harmful, just excessive.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-10-06 16:00:32 +02:00
b8aa592b19 Updater: Move keypair source out of source folder
This allows us to use `git clean` without risking the loss of the keypair,
as it was .gitignored and not included in the Git repository.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-10-05 19:18:06 +02:00
325b9334b5 Makefile: Cleanup
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-10-05 18:51:48 +02:00
f547fbc601 ZRColaWS: Document build and install step-by-step
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-29 15:07:42 +02:00
0e2678f09e ZRColaWS: Document install
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-20 12:50:11 +02:00
e78bbc9c3b ZRColaWS: Install systemd service
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-20 12:49:25 +02:00
f523d12fa1 ZRColaWS: Set default listen port to 54591
Port 8000 is the Oat++ sample port.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-20 11:41:39 +02:00
c6f844775f ZRColaWS: Make logging systemd journal friendlier
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-20 11:39:30 +02:00
450c18198a ZRColaWS: Integrate Oat++ building
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-20 10:13:50 +02:00
ad57071515 ZRColaWS: Sync executable name with project
The executable is not really a Linux daemon.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-19 18:26:31 +02:00
f6d0323485 ZRColaWS: Stop setting API server in Swagger UI
The host where the web service listens is not the same as the host that
clients connect to, unless it is localhost, but that would limit Swagger UI
to internal use only.
And, not to forget, a reverse proxy will typically run in front of the
ZRCola web service and publish it under who knows what public URL.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-19 16:11:08 +02:00
004958f464 ZRColaWS: Rearrange source and provide Swagger-UI documentation
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-19 15:49:36 +02:00
09117d68a6 ZRColaWS: Add support for graceful exit 2022-09-19 12:45:19 +02:00
04eea84f8a ZRColaWS: Fix test HTML page
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-19 09:49:32 +02:00
685ffedb53 ZRColaWS: Simplify class and variable names
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-19 09:11:34 +02:00
f74e9930c1 ZRColaWS: Add support for inverse translation
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-16 14:54:48 +02:00
a05e62f1d3 libZRCola: Add some SAL annotations
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-16 14:53:25 +02:00
3aba608001 ZRColaWS: Return source->destination index mapping
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-16 13:54:57 +02:00
fa59e71fe3 ZRColaWS: Rename .h to .hpp and fix indents
Oat++ is using .hpp extension for header files.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-16 13:53:59 +02:00
839c6fc1e6 Linux: Remove unused code from final binary
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-16 13:03:34 +02:00
ff509ed6b5 ZRColaWS: Initial working version
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-16 03:02:16 +02:00
ca3239f0ff libZRCola: Fix typo
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-16 02:57:52 +02:00
eb0911d3c0 ZRColaWS: Split LDFLAGS and LDLIBS
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 22:59:29 +02:00
a7c7a3f40c ZRColaWS: Initial skeleton
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 18:25:43 +02:00
a6f0357ad8 libZRCola: Cleanup
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 18:17:37 +02:00
35eb472e6b Backport gcc changes to MSVC
wchar_t is not char16_t on MSVC, requiring a lot of typecasting when
interfacing ZRCola database strings with GUI.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 15:33:08 +02:00
ba4ff3cd42 libZRCola: Add test
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 14:36:44 +02:00
9709cc2845 libZRCola: Make UTF-16 explicit
ZRCola is using UTF-16LE strings internally (thanks to Windows).
However, wchar_t and std::wstring are UTF-32 on other platforms.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 14:36:44 +02:00
f35e49dc8b stdex: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 14:36:44 +02:00
9a307978b5 libZRCola: Include stdex/idrec.h only after << and >> are overloaded
gcc precompiles templates. If the << and >> operators for our datatypes are
not yet overloaded when <stdex/idrec.h> is #included, gcc will look among the
<< and >> operators available at that point when it reaches the
std::ostream and std::istream templates.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 14:36:44 +02:00
bd01e250b2 libZRCola: Add standard default "all" make target
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 14:36:44 +02:00
8bb1049cf0 libZRCola: Resolve some warnings reported by -Wall
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 14:36:44 +02:00
52391d9a08 libZRCola: Set CFLAGS too and enable debugging on Debug builds
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 14:36:44 +02:00
823390d28b libZRCola: Make parts of the gcc building reusable
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 14:36:44 +02:00
eedab7da56 libZRCola: Initial stab at compiling with gcc
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 14:36:44 +02:00
63fda12c99 Switch integer datatypes to C99
This makes code more portable.

Signed-off-by: Simon Rozman <simon@rozman.si>
2022-09-15 14:36:44 +02:00
5cc005583c MSICA: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-05-09 14:26:16 +02:00
6b856314d0 WinStd: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-05-09 14:25:16 +02:00
b581b7a8b1 MSI: Simplify ProgramFiles(64)Folder property use
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-03-14 11:39:37 +01:00
0bfa44e6bb stdex: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-03-07 11:43:53 +01:00
5c05dc6eb6 WinStd: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-03-07 11:43:53 +01:00
7a2845fef3 MSICA: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-03-03 16:10:14 +01:00
5b71b776a7 MSIBuild: Update
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-03-02 16:04:11 +01:00
2bd3b4c3b9 MSI: Match row ID with component ID
Signed-off-by: Simon Rozman <simon@rozman.si>
2022-02-18 10:26:07 +01:00
70 changed files with 2849 additions and 1846 deletions

18
.gitmodules vendored

@ -1,18 +1,24 @@
[submodule "lib/stdex"]
path = lib/stdex
url = https://github.com/Amebis/stdex.git
url = https://git.amebis.si/Amebis/stdex.git
[submodule "lib/WinStd"]
path = lib/WinStd
url = https://github.com/Amebis/WinStd.git
url = https://git.amebis.si/Amebis/WinStd.git
[submodule "lib/wxExtend"]
path = lib/wxExtend
url = https://github.com/Amebis/wxExtend.git
url = https://git.amebis.si/Amebis/wxExtend.git
[submodule "MSI/MSIBuild"]
path = MSI/MSIBuild
url = https://github.com/Amebis/MSIBuild.git
url = https://git.amebis.si/Amebis/MSIBuild.git
[submodule "MSI/MSICA"]
path = MSI/MSICA
url = https://github.com/Amebis/MSICA.git
url = https://git.amebis.si/Amebis/MSICA.git
[submodule "Updater"]
path = Updater
url = https://github.com/Amebis/Updater.git
url = https://git.amebis.si/Amebis/Updater.git
[submodule "lib/oatpp"]
path = lib/oatpp
url = https://github.com/oatpp/oatpp.git
[submodule "lib/oatpp-swagger"]
path = lib/oatpp-swagger
url = https://github.com/oatpp/oatpp-swagger.git

@ -1 +1 @@
Subproject commit 9eeff699e50ab9c837bb261ec979cb0a45eca813
Subproject commit b8364dea81f39b321d726317a9dcbf6b13a455e0

@ -1 +1 @@
Subproject commit 51e1196c19cffdad0ae38ea902dd0bf66c21a2eb
Subproject commit f989fdc827d8fa731977d1771097cf038a50e145

Binary file not shown.

BIN
Makefile

Binary file not shown.

@ -1,4 +1,4 @@
# ZRCola
# ZRCola
A Microsoft Windows application for composing texts using a wide range of Slavic (and general) letters from or beyond Unicode.
@ -56,6 +56,10 @@ Use Microsoft NMAKE to build the project. The resulting files can be found in ou
The `/ls` flag can be appended to the commands above to reduce NMAKE's verbosity. You can combine multiple targets (e.g. nmake Unregister Clean). Please, see NMAKE reference for further reading.
## Building and installing ZRCola webservice
ZRCola is also available as a Linux web-service. See [ZRColaWS/README.md](ZRColaWS/README.md) for instructions.
## Translating ZRCola
Instructions how to translate ZRCola to your language can be found [here](LOCALIZATION.md).

@ -1 +1 @@
Subproject commit 76124075fce359419f5e53bbb816c0c815cd8724
Subproject commit a98ef46c5d38dec796dcb78f48240c17f743e3af

Binary file not shown.

@ -46,7 +46,7 @@
#include <utility>
#include <vector>
#include <stdex/idrec>
#include <zrcola/idrec.h>
#if defined(__WXMSW__)
#include <Msi.h>

@ -82,7 +82,7 @@ bool ZRColaApp::OnInit()
ZRCola::recordid_t id;
if (!stdex::idrec::read_id(dat, id, size)) break;
if (id == ZRCola::translation_rec::id) {
if (id == ZRCola::translation_rec::id()) {
dat >> ZRCola::translation_rec(m_t_db);
if (dat.good()) {
has_translation_data = true;
@ -90,61 +90,61 @@ bool ZRColaApp::OnInit()
wxFAIL_MSG(wxT("Error reading translation data from ZRCola.zrcdb."));
m_t_db.clear();
}
} else if (id == ZRCola::transet_rec::id) {
} else if (id == ZRCola::transet_rec::id()) {
dat >> ZRCola::transet_rec(m_ts_db);
if (!dat.good()) {
wxFAIL_MSG(wxT("Error reading translation set data from ZRCola.zrcdb."));
m_ts_db.clear();
}
} else if (id == ZRCola::transeq_rec::id) {
} else if (id == ZRCola::transeq_rec::id()) {
dat >> ZRCola::transeq_rec(m_tsq_db);
if (!dat.good()) {
wxFAIL_MSG(wxT("Error reading translation sequence data from ZRCola.zrcdb."));
m_tsq_db.clear();
}
} else if (id == ZRCola::langchar_rec::id) {
} else if (id == ZRCola::langchar_rec::id()) {
dat >> ZRCola::langchar_rec(m_lc_db);
if (!dat.good()) {
wxFAIL_MSG(wxT("Error reading language character data from ZRCola.zrcdb."));
m_lc_db.clear();
}
} else if (id == ZRCola::language_rec::id) {
} else if (id == ZRCola::language_rec::id()) {
dat >> ZRCola::language_rec(m_lang_db);
if (!dat.good()) {
wxFAIL_MSG(wxT("Error reading language character data from ZRCola.zrcdb."));
m_lang_db.clear();
}
} else if (id == ZRCola::keyseq_rec::id) {
} else if (id == ZRCola::keyseq_rec::id()) {
dat >> ZRCola::keyseq_rec(m_ks_db);
if (!dat.good()) {
wxFAIL_MSG(wxT("Error reading key sequences data from ZRCola.zrcdb."));
m_ks_db.clear();
}
} else if (id == ZRCola::character_rec::id) {
} else if (id == ZRCola::character_rec::id()) {
dat >> ZRCola::character_rec(m_chr_db);
if (!dat.good()) {
wxFAIL_MSG(wxT("Error reading character data from ZRCola.zrcdb."));
m_chr_db.clear();
}
} else if (id == ZRCola::chrcat_rec::id) {
} else if (id == ZRCola::chrcat_rec::id()) {
dat >> ZRCola::chrcat_rec(m_cc_db);
if (!dat.good()) {
wxFAIL_MSG(wxT("Error reading character category data from ZRCola.zrcdb."));
m_cc_db.clear();
}
} else if (id == ZRCola::chrtag_rec::id) {
} else if (id == ZRCola::chrtag_rec::id()) {
dat >> ZRCola::chrtag_rec(m_ct_db);
if (!dat.good()) {
wxFAIL_MSG(wxT("Error reading character tag data from ZRCola.zrcdb."));
m_ct_db.clear();
}
} else if (id == ZRCola::tagname_rec::id) {
} else if (id == ZRCola::tagname_rec::id()) {
dat >> ZRCola::tagname_rec(m_tn_db);
if (!dat.good()) {
wxFAIL_MSG(wxT("Error reading tag name data from ZRCola.zrcdb."));
m_tn_db.clear();
}
} else if (id == ZRCola::highlight_rec::id) {
} else if (id == ZRCola::highlight_rec::id()) {
dat >> ZRCola::highlight_rec(m_h_db);
if (!dat.good()) {
wxFAIL_MSG(wxT("Error reading highlight data from ZRCola.zrcdb."));

@ -82,7 +82,7 @@ public:
protected:
#ifdef __WXMSW__
winstd::win_handle<NULL> m_running; ///< Global Win32 event to determine if another instance of ZRCola is already running
winstd::event m_running; ///< Global Win32 event to determine if another instance of ZRCola is already running
#endif
};

@ -148,11 +148,11 @@ void wxZRColaCharacterCatalogPanel::Update()
wxArrayShort(reinterpret_cast<const short*>(cg.chrshow()), reinterpret_cast<const short*>(cg.chrshow_end())));
} else {
// Select frequently used characters only.
const wchar_t *src = cg.chrlst();
const unsigned __int16 *shown = cg.chrshow();
const auto *src = cg.chrlst();
const uint16_t *shown = cg.chrshow();
wxArrayString chars;
for (size_t i = 0, i_end = cg.chrlst_len(), j = 0; i < i_end; j++) {
for (unsigned __int16 k = 0, mask = shown[j]; k < 16 && i < i_end; k++, mask >>= 1) {
for (uint16_t k = 0, mask = shown[j]; k < 16 && i < i_end; k++, mask >>= 1) {
size_t len = wcsnlen(src + i, i_end - i);
if (mask & 1)
chars.Add(wxString(src + i, len));
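The loop above walks a zero-delimited character list while consuming one visibility bit per entry from 16-bit words. A minimal, self-contained sketch of that idea follows. The function name `select_shown`, the use of `std::u16string`, and the `i += len + 1` advance are assumptions made for illustration, since the rest of the loop body is not visible in this hunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of ZRCola): `chars` holds zero-delimited
// entries; bit k of show[j] tells whether entry 16*j + k is selected.
std::vector<std::u16string> select_shown(const std::u16string &chars,
                                         const std::vector<std::uint16_t> &show)
{
    std::vector<std::u16string> out;
    std::size_t i = 0;
    const std::size_t end = chars.size();
    for (std::size_t j = 0; i < end && j < show.size(); ++j) {
        std::uint16_t mask = show[j];
        for (unsigned k = 0; k < 16 && i < end; ++k, mask >>= 1) {
            // Measure the current entry up to its zero terminator.
            std::size_t len = 0;
            while (i + len < end && chars[i + len] != 0)
                ++len;
            if (mask & 1)
                out.emplace_back(chars, i, len);
            // Assumed advance: skip the entry and its terminator.
            i += len + 1;
        }
    }
    return out;
}
```

Using the fixed-width `std::uint16_t` instead of the MSVC-only `unsigned __int16` keeps the 16-entries-per-word arithmetic identical on every compiler.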

@ -110,7 +110,7 @@ wxString wxZRColaCharGrid::GetToolTipText(int idx)
const auto &chr = m_chars[idx];
// See if this character has a key sequence registered.
std::unique_ptr<ZRCola::keyseq_db::keyseq> ks((ZRCola::keyseq_db::keyseq*)new char[sizeof(ZRCola::keyseq_db::keyseq) + sizeof(wchar_t)*chr.length()]);
std::unique_ptr<ZRCola::keyseq_db::keyseq> ks((ZRCola::keyseq_db::keyseq*)new char[sizeof(ZRCola::keyseq_db::keyseq) + sizeof(ZRCola::char_t)*chr.length()]);
ks->ZRCola::keyseq_db::keyseq::keyseq(NULL, 0, chr.data(), chr.length());
ZRCola::keyseq_db::indexKey::size_type start;
if (app->m_ks_db.idxChr.find(*ks, start)) {

@ -13,7 +13,7 @@
wxIMPLEMENT_DYNAMIC_CLASS(wxZRColaUTF16CharValidator, wxValidator);
wxZRColaUTF16CharValidator::wxZRColaUTF16CharValidator(wchar_t *val) :
wxZRColaUTF16CharValidator::wxZRColaUTF16CharValidator(ZRCola::char_t *val) :
m_val(val),
wxValidator()
{
@ -58,11 +58,11 @@ bool wxZRColaUTF16CharValidator::TransferFromWindow()
}
bool wxZRColaUTF16CharValidator::Parse(const wxString &val_in, size_t i_start, size_t i_end, wxTextCtrl *ctrl, wxWindow *parent, wchar_t *val_out)
bool wxZRColaUTF16CharValidator::Parse(const wxString &val_in, size_t i_start, size_t i_end, wxTextCtrl *ctrl, wxWindow *parent, ZRCola::char_t *val_out)
{
const wxStringCharType *buf = val_in;
wchar_t chr = 0;
ZRCola::char_t chr = 0;
for (size_t i = i_start;;) {
if (i >= i_end) {
// End of Unicode found.
@ -156,7 +156,7 @@ bool wxZRColaUnicodeDumpValidator::Parse(const wxString &val_in, size_t i_start,
wxString str;
for (size_t i = i_start;;) {
const wxStringCharType *buf_next;
wchar_t chr;
ZRCola::char_t chr;
if ((buf_next = wmemchr(buf + i, L'+', i_end - i)) != NULL) {
// Unicode dump separator found.
if (!wxZRColaUTF16CharValidator::Parse(val_in, i, buf_next - buf, ctrl, parent, &chr))
@ -242,7 +242,7 @@ void wxZRColaCharSelect::OnIdle(wxIdleEvent& event)
m_gridPreview->SetCellValue(0, 0, m_char);
std::unique_ptr<ZRCola::character_db::character> ch((ZRCola::character_db::character*)new char[sizeof(ZRCola::character_db::character) + sizeof(wchar_t)*m_char.length()]);
std::unique_ptr<ZRCola::character_db::character> ch((ZRCola::character_db::character*)new char[sizeof(ZRCola::character_db::character) + sizeof(ZRCola::char_t)*m_char.length()]);
ch->ZRCola::character_db::character::character(m_char.data(), m_char.length());
ZRCola::character_db::indexChr::size_type ch_start;
if (app->m_chr_db.idxChr.find(*ch, ch_start)) {
@ -251,7 +251,7 @@ void wxZRColaCharSelect::OnIdle(wxIdleEvent& event)
m_description->SetValue(wxString(chr.desc(), chr.desc_len()));
{
// See if this character has a key sequence registered.
std::unique_ptr<ZRCola::keyseq_db::keyseq> ks((ZRCola::keyseq_db::keyseq*)new char[sizeof(ZRCola::keyseq_db::keyseq) + sizeof(wchar_t)*m_char.length()]);
std::unique_ptr<ZRCola::keyseq_db::keyseq> ks((ZRCola::keyseq_db::keyseq*)new char[sizeof(ZRCola::keyseq_db::keyseq) + sizeof(ZRCola::char_t)*m_char.length()]);
ks->ZRCola::keyseq_db::keyseq::keyseq(NULL, 0, m_char.data(), m_char.length());
ZRCola::keyseq_db::indexKey::size_type ks_start;
if (app->m_ks_db.idxChr.find(*ks, ks_start)) {
@ -284,7 +284,7 @@ void wxZRColaCharSelect::OnIdle(wxIdleEvent& event)
// Find character tags.
std::list<std::wstring> tag_names;
std::unique_ptr<ZRCola::chrtag_db::chrtag> ct((ZRCola::chrtag_db::chrtag*)new char[sizeof(ZRCola::chrtag_db::chrtag) + sizeof(wchar_t)*m_char.length()]);
std::unique_ptr<ZRCola::chrtag_db::chrtag> ct((ZRCola::chrtag_db::chrtag*)new char[sizeof(ZRCola::chrtag_db::chrtag) + sizeof(ZRCola::char_t)*m_char.length()]);
ct->ZRCola::chrtag_db::chrtag::chrtag(m_char.data(), m_char.length());
ZRCola::chrtag_db::indexChr::size_type ct_start, ct_end;
if (app->m_ct_db.idxChr.find(*ct, ct_start, ct_end)) {
@ -301,7 +301,7 @@ void wxZRColaCharSelect::OnIdle(wxIdleEvent& event)
// Add name to the list.
tag_names.push_back(std::wstring(tn.name(), tn.name_end()));
break;
} else if (ZRCola::tagname_db::tagname::CompareName(m_locale, name->data(), (unsigned __int16)name->length(), tn.name(), tn.name_len()) == 0)
} else if (ZRCola::tagname_db::tagname::CompareName(m_locale, name->data(), (uint16_t)name->length(), tn.name(), tn.name_len()) == 0)
// Name is already on the list.
break;
}
@ -718,7 +718,7 @@ wxThread::ExitCode wxZRColaCharSelect::SearchThread::Entry()
{
// Search by tags: Get tags with given names. Then, get characters of found tags.
std::map<ZRCola::tagid_t, unsigned __int16> hits_tag;
std::map<ZRCola::tagid_t, uint16_t> hits_tag;
if (!app->m_tn_db.Search(m_search.c_str(), m_parent->m_locale, hits_tag, TestDestroyS, this)) return (wxThread::ExitCode)1;
if (!app->m_ct_db.Search(hits_tag, app->m_chr_db, m_cats, hits, TestDestroyS, this)) return (wxThread::ExitCode)1;
}
@ -766,14 +766,14 @@ wxThread::ExitCode wxZRColaCharSelect::SearchThread::Entry()
int __cdecl wxZRColaCharSelect::SearchThread::CompareHits(const void *a, const void *b)
{
const std::pair<ZRCola::charrank_t, wchar_t> *_a = (const std::pair<ZRCola::charrank_t, wchar_t>*)a;
const std::pair<ZRCola::charrank_t, wchar_t> *_b = (const std::pair<ZRCola::charrank_t, wchar_t>*)b;
const std::pair<ZRCola::charrank_t, ZRCola::char_t> *_a = (const std::pair<ZRCola::charrank_t, ZRCola::char_t>*)a;
const std::pair<ZRCola::charrank_t, ZRCola::char_t> *_b = (const std::pair<ZRCola::charrank_t, ZRCola::char_t>*)b;
if (_a->first > _b->first) return -1;
else if (_a->first < _b->first) return 1;
if (_a->first > _b->first) return -1;
if (_a->first < _b->first) return 1;
if (_a->second < _b->second) return -1;
else if (_a->second > _b->second) return 1;
if (_a->second < _b->second) return -1;
if (_a->second > _b->second) return 1;
return 0;
}
@ -843,7 +843,7 @@ bool wxPersistentZRColaCharSelect::Restore()
for (wxStringTokenizer tok(str, wxT("|")); tok.HasMoreTokens(); ) {
wxString chr;
for (wxStringTokenizer tok_chr(tok.GetNextToken(), wxT("+")); tok_chr.HasMoreTokens(); )
chr += (wchar_t)_tcstoul(tok_chr.GetNextToken().c_str(), NULL, 16);
chr += (ZRCola::char_t)_tcstoul(tok_chr.GetNextToken().c_str(), NULL, 16);
val.Add(chr);
}
wnd->m_gridRecent->SetCharacters(val);
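The hunks above replace `wchar_t` with `ZRCola::char_t` wherever database characters are handled. The definition of `ZRCola::char_t` is not shown in this diff, so the alias below is only an assumption, based on the commit messages stating that ZRCola strings are UTF-16 and that `wchar_t` is not `char16_t` on MSVC. All names in the `sketch` namespace are hypothetical.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Assumed mapping: keep wchar_t on Windows, where it is already a 16-bit
// UTF-16 code unit, and use char16_t elsewhere, where wchar_t is 32 bits wide.
#ifdef _WIN32
typedef wchar_t char_t;
#else
typedef char16_t char_t;
#endif

typedef std::basic_string<char_t> string_t;

static_assert(sizeof(char_t) == 2, "char_t must be one UTF-16 code unit");

// Builds a one-unit string from a parsed hexadecimal code, similar in
// spirit to the cast applied to the _tcstoul() result above.
inline string_t from_code(unsigned long code)
{
    return string_t(1, static_cast<char_t>(code));
}

} // namespace sketch
```

With an alias of this kind, Windows GUI code can pass strings to wide-character APIs without casts, while the Linux build still gets explicit 16-bit code units.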

@ -40,7 +40,7 @@ public:
///
/// Construct the validator with a value to store data
///
wxZRColaUTF16CharValidator(wchar_t *val = NULL);
wxZRColaUTF16CharValidator(ZRCola::char_t *val = NULL);
///
/// Copies this validator

@ -49,7 +49,7 @@ void wxZRColaComposerPanel::RestoreFromStateFile()
wxFFile file(fileName, wxT("rb"));
if (file.IsOpened()) {
// Load source text.
unsigned __int64 n;
uint64_t n;
file.Read(&n, sizeof(n));
if (!file.Error()) {
wxString source;
@ -98,9 +98,11 @@ void wxZRColaComposerPanel::SynchronizePanels()
// ZRCola decompose first, then re-compose.
app->m_t_db.TranslateInv(app->m_mainWnd->m_composition_id, dst.data(), dst.size(), dst2, &map);
m_mapping.push_back(std::move(map));
map.clear();
app->m_t_db.Translate(app->m_mainWnd->m_composition_id, dst2.data(), dst2.size(), dst, &map);
m_mapping.push_back(std::move(map));
map.clear();
}
// Other translations
@ -109,7 +111,9 @@ void wxZRColaComposerPanel::SynchronizePanels()
for (auto s = sets_begin; s != sets_end; ++s) {
app->m_t_db.Translate(*s, dst.data(), dst.size(), dst2, &map);
m_mapping.push_back(std::move(map));
map.clear();
dst = std::move(dst2);
dst2.clear();
}
m_source->GetSelection(&m_selSource.first, &m_selSource.second);
@ -144,18 +148,24 @@ void wxZRColaComposerPanel::SynchronizePanels()
for (auto s = sets_end; (s--) != sets_begin;) {
app->m_t_db.TranslateInv(*s, dst.data(), dst.size(), dst2, &map);
dst = std::move(dst2);
dst2.clear();
map.invert();
for (auto& m : map)
m.invert();
m_mapping.push_back(std::move(map));
map.clear();
}
if (app->m_mainWnd->m_composition) {
// ZRCola decompose.
app->m_t_db.TranslateInv(app->m_mainWnd->m_composition_id, dst.data(), dst.size(), &app->m_lc_db, app->m_mainWnd->m_settings->m_lang, dst2, &map);
dst = std::move(dst2);
dst2.clear();
map.invert();
for (auto& m : map)
m.invert();
m_mapping.push_back(std::move(map));
map.clear();
}
m_destination->GetSelection(&m_selDestination.first, &m_selDestination.second);
@ -195,16 +205,16 @@ void wxZRColaComposerPanel::OnSourcePaint(wxPaintEvent& event)
m_selSource.second = to;
m_sourceHex->SetSelection(
m_selSourceHex.first = (long)m_mappingSourceHex.to_dst(from),
m_selSourceHex.second = (long)m_mappingSourceHex.to_dst(to ));
m_selSourceHex.first = (long)stdex::src2dst<size_t>(m_mappingSourceHex, from),
m_selSourceHex.second = (long)stdex::src2dst<size_t>(m_mappingSourceHex, to ));
m_destination->SetSelection(
m_selDestination.first = (long)MapToDestination(from),
m_selDestination.second = (long)MapToDestination(to ));
m_destinationHex->SetSelection(
m_selDestinationHex.first = (long)m_mappingDestinationHex.to_dst(m_selDestination.first ),
m_selDestinationHex.second = (long)m_mappingDestinationHex.to_dst(m_selDestination.second));
m_selDestinationHex.first = (long)stdex::src2dst<size_t>(m_mappingDestinationHex, m_selDestination.first ),
m_selDestinationHex.second = (long)stdex::src2dst<size_t>(m_mappingDestinationHex, m_selDestination.second));
}
}
@ -222,16 +232,16 @@ void wxZRColaComposerPanel::OnSourceHexPaint(wxPaintEvent& event)
m_selSourceHex.second = to;
m_source->SetSelection(
m_selSource.first = (long)m_mappingSourceHex.to_src(from),
m_selSource.second = (long)m_mappingSourceHex.to_src(to ));
m_selSource.first = (long)stdex::dst2src<size_t>(m_mappingSourceHex, from),
m_selSource.second = (long)stdex::dst2src<size_t>(m_mappingSourceHex, to ));
m_destination->SetSelection(
m_selDestination.first = (long)MapToDestination(m_selSource.first ),
m_selDestination.second = (long)MapToDestination(m_selSource.second));
m_destinationHex->SetSelection(
m_selDestinationHex.first = (long)m_mappingDestinationHex.to_dst(m_selDestination.first ),
m_selDestinationHex.second = (long)m_mappingDestinationHex.to_dst(m_selDestination.second));
m_selDestinationHex.first = (long)stdex::src2dst<size_t>(m_mappingDestinationHex, m_selDestination.first ),
m_selDestinationHex.second = (long)stdex::src2dst<size_t>(m_mappingDestinationHex, m_selDestination.second));
}
}
@ -268,16 +278,16 @@ void wxZRColaComposerPanel::OnDestinationPaint(wxPaintEvent& event)
m_selDestination.second = to;
m_destinationHex->SetSelection(
m_selDestinationHex.first = (long)m_mappingDestinationHex.to_dst(from),
m_selDestinationHex.second = (long)m_mappingDestinationHex.to_dst(to ));
m_selDestinationHex.first = (long)stdex::src2dst<size_t>(m_mappingDestinationHex, from),
m_selDestinationHex.second = (long)stdex::src2dst<size_t>(m_mappingDestinationHex, to ));
m_source->SetSelection(
m_selSource.first = (long)MapToSource(from),
m_selSource.second = (long)MapToSource(to ));
m_sourceHex->SetSelection(
m_selSourceHex.first = (long)m_mappingSourceHex.to_dst(m_selSource.first ),
m_selSourceHex.second = (long)m_mappingSourceHex.to_dst(m_selSource.second));
m_selSourceHex.first = (long)stdex::src2dst<size_t>(m_mappingSourceHex, m_selSource.first ),
m_selSourceHex.second = (long)stdex::src2dst<size_t>(m_mappingSourceHex, m_selSource.second));
}
}
@ -295,16 +305,16 @@ void wxZRColaComposerPanel::OnDestinationHexPaint(wxPaintEvent& event)
m_selDestinationHex.second = to;
m_destination->SetSelection(
m_selDestination.first = (long)m_mappingDestinationHex.to_src(from),
m_selDestination.second = (long)m_mappingDestinationHex.to_src(to ));
m_selDestination.first = (long)stdex::dst2src<size_t>(m_mappingDestinationHex, from),
m_selDestination.second = (long)stdex::dst2src<size_t>(m_mappingDestinationHex, to ));
m_source->SetSelection(
m_selSource.first = (long)MapToSource(m_selDestination.first ),
m_selSource.second = (long)MapToSource(m_selDestination.second));
m_sourceHex->SetSelection(
m_selSourceHex.first = (long)m_mappingSourceHex.to_dst(m_selSource.first ),
m_selSourceHex.second = (long)m_mappingSourceHex.to_dst(m_selSource.second));
m_selSourceHex.first = (long)stdex::src2dst<size_t>(m_mappingSourceHex, m_selSource.first ),
m_selSourceHex.second = (long)stdex::src2dst<size_t>(m_mappingSourceHex, m_selSource.second));
}
}
@ -408,8 +418,8 @@ void wxZRColaComposerPanel::SetHexValue(wxTextCtrl *wnd, std::pair<long, long> &
wnd->SetValue(hex);
wnd->SetSelection(
range.first = (long)mapping.to_dst(from),
range.second = (long)mapping.to_dst(to ));
range.first = (long)stdex::src2dst<size_t>(mapping, from),
range.second = (long)stdex::src2dst<size_t>(mapping, to ));
}
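The `map.clear()` and `dst2.clear()` calls added in this file follow each `std::move` of an object that is reused afterwards. In general, a moved-from standard object is only promised to be in a valid but unspecified state, so an explicit `clear()` makes the reuse well-defined. The sketch below shows the pattern in isolation; `collect_passes` and its stand-in "translation" step are hypothetical and not part of ZRCola.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: runs several passes over `src`, storing one result
// per pass while reusing a single buffer object.
std::vector<std::u16string> collect_passes(const std::u16string &src, std::size_t passes)
{
    std::vector<std::u16string> out;
    std::u16string buf;
    for (std::size_t i = 0; i < passes; ++i) {
        buf += src;                    // stand-in for a real translation pass
        out.push_back(std::move(buf)); // buf is now valid but unspecified
        buf.clear();                   // restore a known empty state before reuse
    }
    return out;
}
```

The alternative mentioned in the commit message, a fresh object per pass inside its own scope, avoids the explicit `clear()` at the cost of extra nesting.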

@ -97,7 +97,7 @@ protected:
inline size_t wxZRColaComposerPanel::MapToDestination(_In_ size_t src) const
{
for (auto m = m_mapping.cbegin(), m_end = m_mapping.cend(); m != m_end; ++m)
src = m->to_dst(src);
src = stdex::src2dst(*m, src);
return src;
}
@ -106,7 +106,7 @@ inline size_t wxZRColaComposerPanel::MapToDestination(_In_ size_t src) const
inline size_t wxZRColaComposerPanel::MapToSource(_In_ size_t dst) const
{
for (auto m = m_mapping.crbegin(), m_end = m_mapping.crend(); m != m_end; ++m)
dst = m->to_src(dst);
dst = stdex::dst2src(*m, dst);
return dst;
}

@ -42,7 +42,7 @@ void ZRCola::DBSource::character_bank::build_related()
ZRCola::DBSource::character_bank::build_related_worker::build_related_worker(_In_ const character_bank *cb, _In_ iterator from, _In_ iterator to) :
win_handle<INVALID_HANDLE_VALUE>((HANDLE)_beginthreadex(NULL, 0, process, this, CREATE_SUSPENDED, NULL)),
winstd::thread((HANDLE)_beginthreadex(NULL, 0, process, this, CREATE_SUSPENDED, NULL)),
m_heap(HeapCreate(0, 0, 0)),
m_cb(cb),
m_from(from),
@ -186,7 +186,7 @@ void ZRCola::DBSource::character_desc_idx::add_keywords(const set<wstring> &term
}
void ZRCola::DBSource::character_desc_idx::save(ZRCola::textindex<wchar_t, wchar_t, unsigned __int32> &idx) const
void ZRCola::DBSource::character_desc_idx::save(ZRCola::textindex<wchar_t, wchar_t, uint32_t> &idx) const
{
idx .clear();
idx.keys .clear();
@ -205,7 +205,7 @@ void ZRCola::DBSource::character_desc_idx::save(ZRCola::textindex<wchar_t, wchar
// Convert the index.
for (const_iterator i = cbegin(), i_end = cend(); i != i_end; ++i) {
ZRCola::mappair_t<unsigned __int32> p = { idx.keys.size(), idx.values.size() };
ZRCola::mappair_t<uint32_t> p = { idx.keys.size(), idx.values.size() };
idx.push_back(p);
idx.keys.insert(idx.keys.end(), i->first.cbegin(), i->first.cend());
idx.values.insert(idx.values.end(), i->second.cbegin(), i->second.cend());
@ -670,7 +670,7 @@ bool ZRCola::DBSource::GetTagNames(const winstd::com_obj<ADOField>& f, LCID lcid
// Add name to the list.
names.push_back(std::move(name));
break;
} else if (ZRCola::tagname_db::tagname::CompareName(lcid, n->data(), (unsigned __int16)n->length(), name.data(), (unsigned __int16)name.length()) == CSTR_EQUAL) {
} else if (ZRCola::tagname_db::tagname::CompareName(lcid, n->data(), (uint16_t)n->length(), name.data(), (uint16_t)name.length()) == CSTR_EQUAL) {
// Name is already on the list.
break;
}

@ -222,11 +222,11 @@ namespace ZRCola {
///
class chrgrp {
public:
short grp; ///< Character group ID
short rank; ///< Rank
std::wstring name; ///< Name
std::vector<wchar_t> chars; ///< Characters (zero-delimited)
std::vector<unsigned __int16> show; ///< Bit vector if particular character from \c chars is displayed initially
short grp; ///< Character group ID
short rank; ///< Rank
std::wstring name; ///< Name
std::vector<wchar_t> chars; ///< Characters (zero-delimited)
std::vector<uint16_t> show; ///< Bit vector if particular character from \c chars is displayed initially
inline chrgrp() : grp(0), rank(0) {}
};
@ -275,7 +275,7 @@ namespace ZRCola {
void build_related();
protected:
class build_related_worker : public winstd::win_handle<INVALID_HANDLE_VALUE>
class build_related_worker : public winstd::thread
{
public:
build_related_worker(_In_ const character_bank *cb, _In_ iterator from, _In_ iterator to);
@ -308,19 +308,12 @@ namespace ZRCola {
///
/// Character description index key comparator
///
struct character_desc_idx_less : public std::binary_function<std::wstring, std::wstring, bool>
struct character_desc_idx_less
{
inline bool operator()(const std::wstring& _Left, const std::wstring& _Right) const
{
size_t
_Left_len = _Left .size(),
_Right_len = _Right.size();
int r = _wcsncoll(_Left.c_str(), _Right.c_str(), std::min<size_t>(_Left_len, _Right_len));
if (r != 0 ) return r < 0;
else if (_Left_len < _Right_len) return true;
return false;
auto &coll = std::use_facet<std::collate<wchar_t>>(std::locale());
return coll.compare(_Left.data(), _Left.data() + _Left.size(), _Right.data(), _Right.data() + _Right.size()) < 0;
}
};
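The comparator change above replaces the Windows-only `_wcsncoll` call with the portable `std::collate` facet. A minimal standalone sketch of the same idea follows; the name `collate_less` and the locale constructor parameter are illustrative assumptions, not part of ZRCola.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Hypothetical comparator (name and constructor are assumptions for this
// sketch): orders wide strings by the collation rules of a given locale.
struct collate_less
{
    std::locale loc;

    explicit collate_less(const std::locale &l = std::locale()) : loc(l) {}

    bool operator()(const std::wstring &a, const std::wstring &b) const
    {
        const auto &coll = std::use_facet<std::collate<wchar_t>>(loc);
        // Pass [data, data + size) ranges; no end iterator is dereferenced.
        return coll.compare(a.data(), a.data() + a.size(),
                            b.data(), b.data() + b.size()) < 0;
    }
};
```

Such a comparator can be supplied as the ordering of an associative container, for example `std::map<std::wstring, int, collate_less>`.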
@ -340,7 +333,7 @@ namespace ZRCola {
add_keywords(terms, chr, sub);
}
void save(ZRCola::textindex<wchar_t, wchar_t, unsigned __int32> &idx) const;
void save(ZRCola::textindex<wchar_t, wchar_t, uint32_t> &idx) const;
protected:
inline void add_keyword(const std::wstring &term, const std::wstring &chr)
@ -920,16 +913,16 @@ namespace ZRCola {
inline ZRCola::translation_db& operator<<(_Inout_ ZRCola::translation_db &db, _In_ const ZRCola::DBSource::translation &rec)
{
unsigned __int32 idx = db.data.size();
db.data.push_back((unsigned __int16)rec.set);
db.data.push_back((unsigned __int16)rec.dst.rank);
db.data.push_back((unsigned __int16)rec.src.rank);
uint32_t idx = db.data.size();
db.data.push_back((uint16_t)rec.set);
db.data.push_back((uint16_t)rec.dst.rank);
db.data.push_back((uint16_t)rec.src.rank);
std::wstring::size_type n = rec.dst.str.length();
wxASSERT_MSG(n <= 0xffff, wxT("destination overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
n += rec.src.str.length();
wxASSERT_MSG(n <= 0xffff, wxT("source overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.dst.str.cbegin(), rec.dst.str.cend());
db.data.insert(db.data.end(), rec.src.str.cbegin(), rec.src.str.cend());
db.idxSrc.push_back(idx);
@ -941,14 +934,14 @@ inline ZRCola::translation_db& operator<<(_Inout_ ZRCola::translation_db &db, _I
inline ZRCola::transet_db& operator<<(_Inout_ ZRCola::transet_db &db, _In_ const ZRCola::DBSource::transet &rec)
{
unsigned __int32 idx = db.data.size();
db.data.push_back((unsigned __int16)rec.set);
uint32_t idx = db.data.size();
db.data.push_back((uint16_t)rec.set);
std::wstring::size_type n = rec.src.length();
wxASSERT_MSG(n <= 0xffff, wxT("translation set source name overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
n += rec.dst.length();
wxASSERT_MSG(n <= 0xffff, wxT("translation set destination name overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.src.cbegin(), rec.src.cend());
db.data.insert(db.data.end(), rec.dst.cbegin(), rec.dst.cend());
db.idxTranSet.push_back(idx);
@ -959,18 +952,18 @@ inline ZRCola::transet_db& operator<<(_Inout_ ZRCola::transet_db &db, _In_ const
inline ZRCola::transeq_db& operator<<(_Inout_ ZRCola::transeq_db &db, _In_ const ZRCola::DBSource::transeq &rec)
{
unsigned __int32 idx = db.data.size();
db.data.push_back((unsigned __int16)rec.seq);
db.data.push_back((unsigned __int16)rec.rank);
uint32_t idx = db.data.size();
db.data.push_back((uint16_t)rec.seq);
db.data.push_back((uint16_t)rec.rank);
std::wstring::size_type n = rec.name.length();
wxASSERT_MSG(n <= 0xffff, wxT("translation sequence name overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
n += rec.sets.size();
wxASSERT_MSG(n <= 0xffff, wxT("translation sequence sets overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.name.cbegin(), rec.name.cend());
for (auto s = rec.sets.cbegin(), s_end = rec.sets.cend(); s != s_end; ++s)
db.data.push_back((unsigned __int16)*s);
db.data.push_back((uint16_t)*s);
db.idxTranSeq.push_back(idx);
db.idxRank .push_back(idx);
@ -980,13 +973,13 @@ inline ZRCola::transeq_db& operator<<(_Inout_ ZRCola::transeq_db &db, _In_ const
inline ZRCola::keyseq_db& operator<<(_Inout_ ZRCola::keyseq_db &db, _In_ const ZRCola::DBSource::keyseq &rec)
{
unsigned __int32 idx = db.data.size();
uint32_t idx = db.data.size();
std::wstring::size_type n = rec.chr.length();
wxASSERT_MSG(n <= 0xffff, wxT("character overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
n += rec.seq.size() * sizeof(ZRCola::keyseq_db::keyseq::key_t) / sizeof(wchar_t);
wxASSERT_MSG(n <= 0xffff, wxT("key sequence overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.chr.cbegin(), rec.chr.cend());
for (auto kc = rec.seq.cbegin(), kc_end = rec.seq.cend(); kc != kc_end; ++kc) {
db.data.push_back(kc->key);
@ -1004,11 +997,11 @@ inline ZRCola::keyseq_db& operator<<(_Inout_ ZRCola::keyseq_db &db, _In_ const Z
inline ZRCola::language_db& operator<<(_Inout_ ZRCola::language_db &db, _In_ const ZRCola::DBSource::language &rec)
{
unsigned __int32 idx = db.data.size();
db.data.insert(db.data.end(), reinterpret_cast<const unsigned __int16*>(&rec.lang), reinterpret_cast<const unsigned __int16*>(&rec.lang + 1));
uint32_t idx = db.data.size();
db.data.insert(db.data.end(), reinterpret_cast<const uint16_t*>(&rec.lang), reinterpret_cast<const uint16_t*>(&rec.lang + 1));
std::wstring::size_type n = rec.name.length();
wxASSERT_MSG(n <= 0xffff, wxT("language name overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.name.cbegin(), rec.name.cend());
db.idxLang.push_back(idx);
@ -1018,11 +1011,11 @@ inline ZRCola::language_db& operator<<(_Inout_ ZRCola::language_db &db, _In_ con
inline ZRCola::langchar_db& operator<<(_Inout_ ZRCola::langchar_db &db, _In_ const ZRCola::DBSource::langchar &rec)
{
unsigned __int32 idx = db.data.size();
db.data.insert(db.data.end(), reinterpret_cast<const unsigned __int16*>(&rec.lang), reinterpret_cast<const unsigned __int16*>(&rec.lang + 1));
uint32_t idx = db.data.size();
db.data.insert(db.data.end(), reinterpret_cast<const uint16_t*>(&rec.lang), reinterpret_cast<const uint16_t*>(&rec.lang + 1));
std::wstring::size_type n = rec.chr.length();
wxASSERT_MSG(n <= 0xffff, wxT("character overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.chr.cbegin(), rec.chr.cend());
db.idxChr .push_back(idx);
#ifdef ZRCOLA_LANGCHAR_LANG_IDX
@ -1035,15 +1028,15 @@ inline ZRCola::langchar_db& operator<<(_Inout_ ZRCola::langchar_db &db, _In_ con
inline ZRCola::chrgrp_db& operator<<(_Inout_ ZRCola::chrgrp_db &db, _In_ const ZRCola::DBSource::chrgrp &rec)
{
unsigned __int32 idx = db.data.size();
db.data.push_back((unsigned __int16)rec.grp);
db.data.push_back((unsigned __int16)rec.rank);
uint32_t idx = db.data.size();
db.data.push_back((uint16_t)rec.grp);
db.data.push_back((uint16_t)rec.rank);
std::wstring::size_type n = rec.name.length();
wxASSERT_MSG(n <= 0xffff, wxT("character group name overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
n += rec.chars.size();
wxASSERT_MSG(n <= 0xffff, wxT("character group characters overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.name .cbegin(), rec.name .cend());
db.data.insert(db.data.end(), rec.chars.cbegin(), rec.chars.cend());
db.data.insert(db.data.end(), rec.show .cbegin(), rec.show .cend());
@ -1055,17 +1048,17 @@ inline ZRCola::chrgrp_db& operator<<(_Inout_ ZRCola::chrgrp_db &db, _In_ const Z
inline ZRCola::character_db& operator<<(_Inout_ ZRCola::character_db &db, _In_ const ZRCola::DBSource::character &rec)
{
unsigned __int32 idx = db.data.size();
db.data.insert(db.data.end(), reinterpret_cast<const unsigned __int16*>(&rec.second.cat), reinterpret_cast<const unsigned __int16*>(&rec.second.cat + 1));
uint32_t idx = db.data.size();
db.data.insert(db.data.end(), reinterpret_cast<const uint16_t*>(&rec.second.cat), reinterpret_cast<const uint16_t*>(&rec.second.cat + 1));
std::wstring::size_type n = rec.first.length();
wxASSERT_MSG(n <= 0xffff, wxT("character overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
n += rec.second.desc.length();
wxASSERT_MSG(n <= 0xffff, wxT("character description overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
n += rec.second.rel.size();
wxASSERT_MSG(n <= 0xffff, wxT("related characters overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.first .cbegin(), rec.first .cend());
db.data.insert(db.data.end(), rec.second.desc.cbegin(), rec.second.desc.cend());
db.data.insert(db.data.end(), rec.second.rel .cbegin(), rec.second.rel .cend());
@ -1077,12 +1070,12 @@ inline ZRCola::character_db& operator<<(_Inout_ ZRCola::character_db &db, _In_ c
inline ZRCola::chrcat_db& operator<<(_Inout_ ZRCola::chrcat_db &db, _In_ const ZRCola::DBSource::chrcat &rec)
{
unsigned __int32 idx = db.data.size();
db.data.insert(db.data.end(), reinterpret_cast<const unsigned __int16*>(&rec.cat), reinterpret_cast<const unsigned __int16*>(&rec.cat + 1));
db.data.push_back((unsigned __int16)rec.rank);
uint32_t idx = db.data.size();
db.data.insert(db.data.end(), reinterpret_cast<const uint16_t*>(&rec.cat), reinterpret_cast<const uint16_t*>(&rec.cat + 1));
db.data.push_back((uint16_t)rec.rank);
std::wstring::size_type n = rec.name.length();
wxASSERT_MSG(n <= 0xffff, wxT("character category name overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.name.cbegin(), rec.name.cend());
db.idxChrCat.push_back(idx);
db.idxRank .push_back(idx);
@ -1093,11 +1086,11 @@ inline ZRCola::chrcat_db& operator<<(_Inout_ ZRCola::chrcat_db &db, _In_ const Z
inline ZRCola::chrtag_db& operator<<(_Inout_ ZRCola::chrtag_db &db, _In_ const ZRCola::DBSource::chrtag &rec)
{
unsigned __int32 idx = db.data.size();
db.data.push_back((unsigned __int16)rec.tag);
uint32_t idx = db.data.size();
db.data.push_back((uint16_t)rec.tag);
std::wstring::size_type n = rec.chr.length();
wxASSERT_MSG(n <= 0xffff, wxT("character overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.chr.cbegin(), rec.chr.cend());
db.idxChr.push_back(idx);
db.idxTag.push_back(idx);
@ -1110,13 +1103,13 @@ inline ZRCola::tagname_db& operator<<(_Inout_ ZRCola::tagname_db &db, _In_ const
{
for (auto ln = rec.names.cbegin(), ln_end = rec.names.cend(); ln != ln_end; ++ln) {
for (auto nm = ln->second.cbegin(), nm_end = ln->second.cend(); nm != nm_end; ++nm) {
unsigned __int32 idx = db.data.size();
db.data.push_back((unsigned __int16)rec.tag);
uint32_t idx = db.data.size();
db.data.push_back((uint16_t)rec.tag);
db.data.push_back(LOWORD(ln->first));
db.data.push_back(HIWORD(ln->first));
std::wstring::size_type n = nm->length();
wxASSERT_MSG(n <= 0xffff, wxT("tag name overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), nm->cbegin(), nm->cend());
db.idxName.push_back(idx);
db.idxTag .push_back(idx);
@ -1129,11 +1122,11 @@ inline ZRCola::tagname_db& operator<<(_Inout_ ZRCola::tagname_db &db, _In_ const
inline ZRCola::highlight_db& operator<<(_Inout_ ZRCola::highlight_db &db, _In_ const ZRCola::DBSource::highlight &rec)
{
unsigned __int32 idx = db.data.size();
db.data.push_back((unsigned __int16)rec.set);
uint32_t idx = db.data.size();
db.data.push_back((uint16_t)rec.set);
std::wstring::size_type n = rec.chr.length();
wxASSERT_MSG(n <= 0xffff, wxT("character overflow"));
db.data.push_back((unsigned __int16)n);
db.data.push_back((uint16_t)n);
db.data.insert(db.data.end(), rec.chr.cbegin(), rec.chr.cend());
db.idxChr.push_back(idx);
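The `operator<<` overloads above share one packing pattern: a record starts at the current end of a 16-bit data vector, stores a few fixed fields, then stores cumulative end offsets of its strings, and finally the string characters themselves. The sketch below illustrates that pattern with a made-up two-string record; `pack_record` and `read_second` are hypothetical helpers written for this illustration, not ZRCola functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record layout (an assumption for this sketch):
//   [id][end of first][end of second][first chars...][second chars...]
// Returns the index of the record start, as the overloads above do.
uint32_t pack_record(std::vector<uint16_t> &data, uint16_t id,
                     const std::u16string &first, const std::u16string &second)
{
    uint32_t idx = static_cast<uint32_t>(data.size());
    data.push_back(id);
    std::size_t n = first.size();
    assert(n <= 0xffff); // offsets must fit in 16 bits
    data.push_back(static_cast<uint16_t>(n));
    n += second.size();  // cumulative end offset, not the length
    assert(n <= 0xffff);
    data.push_back(static_cast<uint16_t>(n));
    data.insert(data.end(), first.cbegin(), first.cend());
    data.insert(data.end(), second.cbegin(), second.cend());
    return idx;
}

// Reads the second string back using the two stored end offsets.
std::u16string read_second(const std::vector<uint16_t> &data, uint32_t idx)
{
    const uint16_t first_end = data[idx + 1];
    const uint16_t second_end = data[idx + 2];
    const uint16_t *chars = data.data() + idx + 3;
    return std::u16string(chars + first_end, chars + second_end);
}
```

Storing cumulative end offsets lets a reader locate any string of the record with two lookups, without summing the preceding lengths.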

View File

@ -981,10 +981,10 @@ int _tmain(int argc, _TCHAR *argv[])
<< "\"Content-Transfer-Encoding: 8bit\\n\"" << endl
<< "\"X-Generator: ZRColaCompile\\n\"" << endl;
wstring_convert<codecvt_utf8<wchar_t>> conv;
charset_encoder<wchar_t, char> conv(stdex::wchar_t_charset, charset_id::utf8);
for (auto p = pot.cbegin(); p != pot.cend(); ++p) {
// Convert UTF-16 to UTF-8 and escape.
string t(conv.to_bytes(*p)), u;
string t(conv.convert(*p)), u;
for (size_t i = 0, n = t.size(); i < n; i++) {
char c = t[i];
switch (c) {
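The hunk above is cut off at the `switch`, so the exact escape set used by ZRColaCompile is not visible here. As a rough sketch of what escaping a UTF-8 string for a gettext `.pot` file typically involves, assuming the usual C-style escapes (the function name `po_escape` is made up for this illustration):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: escapes a UTF-8 string for use inside a quoted
// msgid/msgstr line. The escape set is an assumption (C-style escapes).
std::string po_escape(const std::string &utf8)
{
    std::string out;
    out.reserve(utf8.size());
    for (char c : utf8) {
        switch (c) {
        case '\\': out += "\\\\"; break;
        case '"':  out += "\\\""; break;
        case '\n': out += "\\n";  break;
        case '\r': out += "\\r";  break;
        case '\t': out += "\\t";  break;
        default:   out += c;      // UTF-8 bytes >= 0x80 pass through unchanged
        }
    }
    return out;
}
```

Non-ASCII bytes are copied verbatim, which matches the `Content-Transfer-Encoding: 8bit` header written above.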

View File

@ -20,7 +20,7 @@
#include <wx/intl.h>
#pragma warning(pop)
#include <stdex/idrec>
#include <zrcola/idrec.h>
#include <WinStd/Common.h>

3
ZRColaWS/.gitignore vendored Normal file

@ -0,0 +1,3 @@
/*.d
/*.o
/zrcolaws

91
ZRColaWS/Makefile Normal file

@ -0,0 +1,91 @@
CPPFLAGS := $(CPPFLAGS) -I../lib/libZRCola/include -I../lib/stdex/include -I../lib/oatpp-swagger/src -I../lib/oatpp/src
LDFLAGS := $(LDFLAGS) -L../lib/libZRCola/lib -L../lib/oatpp-swagger/build/src -L../lib/oatpp/build/src
LDLIBS := $(LDLIBS) -lZRCola -loatpp-swagger -loatpp -lstdc++
SRCS := zrcolaws.cpp
include ../include/props.mak
.PHONY: all
all: zrcolaws
zrcolaws: \
../lib/oatpp/build/src/liboatpp.a \
../lib/oatpp-swagger/build/src/liboatpp-swagger.a \
../lib/libZRCola/lib/libZRCola.a \
$(OBJS)
$(CC) $(LDFLAGS) $(OBJS) -o $@ $(LDLIBS)
../lib/oatpp/build/src/liboatpp.a: ../lib/oatpp/build/Makefile
$(MAKE) $(MFLAGS) -C ../lib/oatpp/build
../lib/oatpp/build/Makefile: ../lib/oatpp/CMakeLists.txt
cmake -D OATPP_INSTALL=OFF -D OATPP_BUILD_TESTS=OFF -D OATPP_LINK_TEST_LIBRARY=OFF -S ../lib/oatpp -B ../lib/oatpp/build
../lib/oatpp-swagger/build/src/liboatpp-swagger.a: ../lib/oatpp-swagger/build/Makefile
$(MAKE) $(MFLAGS) -C ../lib/oatpp-swagger/build
../lib/oatpp-swagger/build/Makefile: ../lib/oatpp-swagger/CMakeLists.txt
cmake -D OATPP_INSTALL=OFF -D OATPP_BUILD_TESTS=OFF -D OATPP_MODULES_LOCATION=CUSTOM -D OATPP_DIR_SRC=${CURDIR}/../lib/oatpp/src -D OATPP_DIR_LIB=${CURDIR}/../lib/oatpp/build/src -S ../lib/oatpp-swagger -B ../lib/oatpp-swagger/build
../lib/libZRCola/lib/libZRCola.a:
$(MAKE) $(MFLAGS) -C ../lib/libZRCola/build
.PHONY: install
install: zrcolaws ../output/data/ZRCola.zrcdb
install -d $(PREFIX)/bin/
install -m 755 zrcolaws $(PREFIX)/bin/
install -d $(PREFIX)/share/zrcola/
install -m 644 ../output/data/ZRCola.zrcdb $(PREFIX)/share/zrcola/
install -d $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/favicon-16x16.png $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/favicon-32x32.png $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/index.html $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/oauth2-redirect.html $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui-bundle.js $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui-bundle.js.map $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui-es-bundle-core.js $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui-es-bundle-core.js.map $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui-es-bundle.js $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui-es-bundle.js.map $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui-standalone-preset.js $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui-standalone-preset.js.map $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui.css $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui.css.map $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui.js $(PREFIX)/share/zrcola/res/
install -m 644 ../lib/oatpp-swagger/res/swagger-ui.js.map $(PREFIX)/share/zrcola/res/
install -d $(PREFIX)/etc/sysconfig/
{ echo '#!/bin/sh'; \
echo ''; \
echo '# Interface to listen on. Default: localhost'; \
echo '#HOST=0.0.0.0'; \
echo ''; \
echo '# Port to listen on. Default: 54591'; \
echo '#PORT=54591'; \
echo ''; \
echo '# Additional zrcolaws command line options'; \
echo '#OPTIONS=-4'; \
} > $(PREFIX)/etc/sysconfig/zrcolaws
install -d $(PREFIX)/lib/systemd/system/
{ echo '[Unit]'; \
echo 'Description=ZRCola Web Service'; \
echo 'After=network.target'; \
echo ''; \
echo '[Service]'; \
echo 'Environment="HOST=localhost" "PORT=54591"'; \
echo 'EnvironmentFile=-$(PREFIX)/etc/sysconfig/zrcolaws'; \
echo 'DynamicUser=yes'; \
echo 'ExecStart=$(PREFIX)/bin/zrcolaws --host $$HOST --port $$PORT $$OPTIONS'; \
echo 'Type=exec'; \
echo 'Restart=always'; \
echo ''; \
echo '[Install]'; \
echo 'WantedBy=multi-user.target'; \
echo 'RequiredBy=network.target'; \
} > $(PREFIX)/lib/systemd/system/zrcolaws.service
.PHONY: clean
clean:
-rm -f *.d *.o zrcolaws
include ../include/targets.mak
-include $(DEPS)

23
ZRColaWS/README.md Normal file

@ -0,0 +1,23 @@
# ZRCola Web Service
## Building and Installation
1. Install prerequisites: `sudo dnf install git cmake gcc gcc-c++ make libatomic`
2. Clone this Git repository: `git clone --recursive https://github.com/Amebis/ZRCola.git`
3. Build ZRColaWS: `make -C ZRCola/ZRColaWS`
4. Install ZRColaWS: `sudo make -C ZRCola/ZRColaWS install`
5. Configure ZRColaWS by editing _/usr/local/etc/sysconfig/zrcolaws_
6. Start ZRColaWS: `sudo systemctl start zrcolaws.service`
7. Configure ZRColaWS for auto-start: `sudo systemctl enable zrcolaws.service`
8. If the service must be reachable from other hosts, set `HOST` in the configuration file (the default is `localhost`) and open the web service port (54591 by default) in the firewall.
## Usage
Web service API documentation is available at _http://host:port/swagger/ui_.

73
ZRColaWS/appcomponent.hpp Normal file

@ -0,0 +1,73 @@
/*
SPDX-License-Identifier: GPL-3.0-or-later
Copyright © 2022 Amebis
*/
#pragma once
#include "../include/version.h"
#include "controller.hpp"
#include <oatpp-swagger/Model.hpp>
#include <oatpp-swagger/Resources.hpp>
#include <oatpp/core/base/CommandLineArguments.hpp>
#include <oatpp/core/macro/component.hpp>
#include <oatpp/network/Server.hpp>
#include <oatpp/network/tcp/server/ConnectionProvider.hpp>
#include <oatpp/parser/json/mapping/ObjectMapper.hpp>
#include <oatpp/web/server/HttpConnectionHandler.hpp>
class AppComponent
{
protected:
oatpp::network::Address m_address;
public:
AppComponent(const oatpp::network::Address& address) : m_address(address) {}
OATPP_CREATE_COMPONENT(std::shared_ptr<oatpp::network::ServerConnectionProvider>, serverConnectionProvider)([this] {
return oatpp::network::tcp::server::ConnectionProvider::createShared({m_address.host, m_address.port, m_address.family});
}());
OATPP_CREATE_COMPONENT(std::shared_ptr<oatpp::web::server::HttpRouter>, httpRouter)([] {
return oatpp::web::server::HttpRouter::createShared();
}());
OATPP_CREATE_COMPONENT(std::shared_ptr<oatpp::network::ConnectionHandler>, serverConnectionHandler)([] {
OATPP_COMPONENT(std::shared_ptr<oatpp::web::server::HttpRouter>, router);
return oatpp::web::server::HttpConnectionHandler::createShared(router);
}());
OATPP_CREATE_COMPONENT(std::shared_ptr<oatpp::data::mapping::ObjectMapper>, apiObjectMapper)([] {
auto serializerConfig = oatpp::parser::json::mapping::Serializer::Config::createShared();
serializerConfig->escapeFlags &= ~oatpp::parser::json::Utils::FLAG_ESCAPE_UTF8CHAR;
return oatpp::parser::json::mapping::ObjectMapper::createShared(
serializerConfig,
oatpp::parser::json::mapping::Deserializer::Config::createShared());
}());
OATPP_CREATE_COMPONENT(std::shared_ptr<oatpp::network::Server>, server)([] {
OATPP_COMPONENT(std::shared_ptr<oatpp::network::ServerConnectionProvider>, connectionProvider);
OATPP_COMPONENT(std::shared_ptr<oatpp::network::ConnectionHandler>, connectionHandler);
return oatpp::network::Server::createShared(connectionProvider, connectionHandler);
}());
OATPP_CREATE_COMPONENT(std::shared_ptr<oatpp::swagger::DocumentInfo>, swaggerDocumentInfo)([] {
oatpp::swagger::DocumentInfo::Builder builder;
builder
.setTitle("ZRCola Web Service")
.setDescription(
"ZRCola is an input system designed mainly, although not exclusively, for linguistic use. "
"It allows the user to combine basic letters with any diacritic marks and insert the resulting complex characters into the texts with ease.")
.setVersion(PRODUCT_VERSION_STR)
.setContactName("ZRCola")
.setContactUrl("https://zrcola.zrc-sazu.si/en/")
.setLicenseName("GNU General Public License, Version 3")
.setLicenseUrl("https://www.gnu.org/licenses/gpl-3.0.en.html");
return builder.build();
}());
OATPP_CREATE_COMPONENT(std::shared_ptr<oatpp::swagger::Resources>, swaggerResources)([] {
return oatpp::swagger::Resources::loadResources(PREFIX "/share/zrcola/res");
}());
};

243
ZRColaWS/controller.hpp Normal file

@ -0,0 +1,243 @@
/*
SPDX-License-Identifier: GPL-3.0-or-later
Copyright © 2022 Amebis
*/
#pragma once
#include "../include/version.h"
#include "dto.hpp"
#include "iconverter.hpp"
#include "zrcolaws.hpp"
#include <zrcola/translate.h>
#include <oatpp/core/macro/codegen.hpp>
#include <oatpp/core/macro/component.hpp>
#include <oatpp/web/server/api/ApiController.hpp>
#include OATPP_CODEGEN_BEGIN(ApiController)
class Controller : public oatpp::web::server::api::ApiController
{
public:
Controller(const std::shared_ptr<oatpp::data::mapping::ObjectMapper>& defaultObjectMapper, const oatpp::String &routerPrefix = nullptr) :
oatpp::web::server::api::ApiController(defaultObjectMapper, routerPrefix)
{}
ADD_CORS(getAbout)
ENDPOINT_INFO(getAbout) {
info->summary = "Returns service information";
info->addResponse<Object<dto::About>>(Status::CODE_200, "application/json");
}
ENDPOINT("GET", "/about", getAbout)
{
auto dto = dto::About::createShared();
dto->vendor = PRODUCT_CFG_VENDOR;
dto->application = PRODUCT_CFG_APPLICATION;
dto->version = PRODUCT_VERSION_STR;
return createDtoResponse(Status::CODE_200, dto);
}
ADD_CORS(getTranset)
ENDPOINT_INFO(getTranset) {
info->summary = "Lists supported translation sets";
info->description = "Each translation set describes a set of replacements that are performed to translate text from one script or encoding to another.";
info->addResponse<oatpp::Vector<oatpp::Object<dto::TranSet>>>(Status::CODE_200, "application/json");
}
ENDPOINT("GET", "/transet", getTranset)
{
try {
utf16toutf8 c;
auto result = oatpp::Vector<oatpp::Object<dto::TranSet>>::createShared();
auto dto = dto::TranSet::createShared();
dto->set = ZRCOLA_TRANSETID_DEFAULT;
dto->src = "ZRCola Decomposed";
dto->dst = "ZRCola Composed";
result->push_back(dto);
for (size_t i = 0, n = ts_db.idxTranSet.size(); i < n; i++) {
const auto &ts = ts_db.idxTranSet[i];
dto = dto::TranSet::createShared();
dto->set = ts.set;
dto->src = c.convert(ts.src(), ts.src_len());
dto->dst = c.convert(ts.dst(), ts.dst_len());
result->push_back(dto);
}
dto = dto::TranSet::createShared();
dto->set = ZRCOLA_TRANSETID_UNICODE;
dto->src = "ZRCola Decomposed";
dto->dst = "Unicode";
result->push_back(dto);
return createDtoResponse(Status::CODE_200, result);
} catch (std::exception &ex) {
OATPP_LOGE(__FUNCTION__, "%s: %s", typeid(ex).name(), ex.what());
return ResponseFactory::createResponse(Status::CODE_500, ex.what());
}
}
ADD_CORS(getLanguage)
ENDPOINT_INFO(getLanguage) {
info->summary = "Lists supported languages";
info->description = "Each language describes a set of special characters that are specific to that language (e.g. č, š, ž in Slovenian, or ä, ö, ü in German).";
info->addResponse<oatpp::Vector<oatpp::Object<dto::Language>>>(Status::CODE_200, "application/json");
}
ENDPOINT("GET", "/language", getLanguage)
{
try {
utf16toutf8 c;
auto result = oatpp::Vector<oatpp::Object<dto::Language>>::createShared();
for (size_t i = 0, n = lang_db.idxLang.size(); i < n; i++) {
const auto &lang = lang_db.idxLang[i];
auto dto = dto::Language::createShared();
dto->lang = std::string(&lang.lang.data[0], strnlen(lang.lang.data, std::size(lang.lang.data)));
dto->name = c.convert(lang.name(), lang.name_len());
result->push_back(dto);
}
return createDtoResponse(Status::CODE_200, result);
} catch (std::exception &ex) {
OATPP_LOGE(__FUNCTION__, "%s: %s", typeid(ex).name(), ex.what());
return ResponseFactory::createResponse(Status::CODE_500, ex.what());
}
}
ADD_CORS(postTranslate)
ENDPOINT_INFO(postTranslate) {
info->summary = "Translate text";
info->description =
"Performs any number of supported translations (see /transet) on a given input text in a sequence. "
"Together with the output text, it also returns character index mapping between input and output texts.";
auto transet = oatpp::Vector<UInt16>::createShared();
transet->push_back(ZRCOLA_TRANSETID_DEFAULT);
auto dto = dto::TranslateIn::createShared();
dto->transet = transet;
dto->text = "To je test.";
info->addConsumes<Object<dto::TranslateIn>>("application/json")
.addExample("Perform ZRCola composition", dto);
info->addResponse<Object<dto::TranslateOut>>(Status::CODE_200, "application/json");
}
ENDPOINT("POST", "/translate", postTranslate, BODY_DTO(Object<dto::TranslateIn>, input))
{
try {
utf8toutf16 cIn;
std::u16string dst, dst2;
if (input->text)
dst = cIn.convert(*input->text);
size_t src_len = dst.size();
std::vector<ZRCola::mapping_vector> mapping;
if (input->transet) {
ZRCola::mapping_vector map;
const auto ts_end = input->transet->cend();
for (auto ts = input->transet->cbegin(); ts != ts_end; ++ts) {
switch (*ts) {
case ZRCOLA_TRANSETID_DEFAULT:
case ZRCOLA_TRANSETID_UNICODE:
// Decompose first, then re-compose.
t_db.TranslateInv(*ts, dst.data(), dst.size(), dst2, &map);
mapping.push_back(std::move(map));
map.clear();
t_db.Translate(*ts, dst2.data(), dst2.size(), dst, &map);
mapping.push_back(std::move(map));
map.clear();
break;
default:
t_db.Translate(*ts, dst.data(), dst.size(), dst2, &map);
mapping.push_back(std::move(map));
map.clear();
dst = std::move(dst2);
dst2.clear();
}
}
}
utf16toutf8 cOut;
auto dto = dto::TranslateOut::createShared();
dto->text = cOut.convert(dst);
auto map = oatpp::Vector<oatpp::UInt32>::createShared();
auto m_end = mapping.cend();
for (size_t i = 0; i < src_len; ++i) {
auto j = i;
for (auto m = mapping.cbegin(); m != m_end; ++m)
j = m->to_dst(j);
map->push_back(j);
}
dto->map = map;
return createDtoResponse(Status::CODE_200, dto);
} catch (std::exception &ex) {
OATPP_LOGE(__FUNCTION__, "%s: %s", typeid(ex).name(), ex.what());
return ResponseFactory::createResponse(Status::CODE_500, ex.what());
}
}
ADD_CORS(postTranslateInv)
ENDPOINT_INFO(postTranslateInv) {
info->summary = "Inverse translate text";
info->description =
"Performs any number of supported translations (see /transet) on a given input text in a sequence in reverse. "
"Together with the output text, it also returns character index mapping between input and output texts.";
auto transet = oatpp::Vector<UInt16>::createShared();
transet->push_back(ZRCOLA_TRANSETID_DEFAULT);
auto dto = dto::TranslateIn::createShared();
dto->transet = transet;
dto->text = "T  ťᵉⓢṭ.";
dto->lang = "slv";
info->addConsumes<Object<dto::TranslateIn>>("application/json")
.addExample("Perform ZRCola decomposition", dto);
info->addResponse<Object<dto::TranslateOut>>(Status::CODE_200, "application/json");
}
ENDPOINT("POST", "/translateInv", postTranslateInv, BODY_DTO(Object<dto::TranslateIn>, input))
{
try {
utf8toutf16 cIn;
std::u16string dst, dst2;
ZRCola::langid_t lang;
if (input->text)
dst = cIn.convert(*input->text);
lang = input->lang->c_str();
size_t src_len = dst.size();
std::vector<ZRCola::mapping_vector> mapping;
if (input->transet) {
ZRCola::mapping_vector map;
const auto ts_begin = input->transet->cbegin();
for (auto ts = input->transet->cend(); (ts--) != ts_begin; ) {
switch (*ts) {
case ZRCOLA_TRANSETID_DEFAULT:
case ZRCOLA_TRANSETID_UNICODE:
t_db.TranslateInv(*ts, dst.data(), dst.size(), &lc_db, lang, dst2, &map);
dst = std::move(dst2);
dst2.clear();
map.invert();
mapping.push_back(std::move(map));
map.clear();
break;
default:
t_db.TranslateInv(*ts, dst.data(), dst.size(), dst2, &map);
dst = std::move(dst2);
dst2.clear();
map.invert();
mapping.push_back(std::move(map));
map.clear();
}
}
}
utf16toutf8 cOut;
auto dto = dto::TranslateOut::createShared();
dto->text = cOut.convert(dst);
auto map = oatpp::Vector<oatpp::UInt32>::createShared();
auto m_end = mapping.crend();
for (size_t i = 0; i < src_len; ++i) {
auto j = i;
for (auto m = mapping.crbegin(); m != m_end; ++m)
j = m->to_src(j);
map->push_back(j);
}
dto->map = map;
return createDtoResponse(Status::CODE_200, dto);
} catch (std::exception &ex) {
OATPP_LOGE(__FUNCTION__, "%s: %s", typeid(ex).name(), ex.what());
return ResponseFactory::createResponse(Status::CODE_500, ex.what());
}
}
};
#include OATPP_CODEGEN_END(ApiController)
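Both translate endpoints above build the returned `map` by pushing every input index through the whole chain of per-stage mappings. The sketch below shows that composition in isolation. `SimpleMapping` is a deliberately simplified, hypothetical stand-in for `ZRCola::mapping_vector`; its `to_dst` rule (shift by the nearest preceding anchor) is an assumption made for the illustration and may differ from the library's behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical anchor: source index `src` corresponds to destination `dst`.
struct Anchor { std::size_t src, dst; };

// Hypothetical stand-in for a per-stage mapping (not the ZRCola class).
struct SimpleMapping
{
    std::vector<Anchor> anchors; // must be sorted by src

    // Maps a source index using the last anchor at or before it.
    std::size_t to_dst(std::size_t src) const
    {
        std::size_t result = src; // identity before the first anchor
        for (const Anchor &a : anchors) {
            if (a.src > src) break;
            result = a.dst + (src - a.src);
        }
        return result;
    }
};

// Composes all stages: map[i] is where input index i ends up after
// every stage has been applied in order.
std::vector<std::size_t> compose(const std::vector<SimpleMapping> &stages, std::size_t src_len)
{
    std::vector<std::size_t> map;
    map.reserve(src_len);
    for (std::size_t i = 0; i < src_len; ++i) {
        std::size_t j = i;
        for (const SimpleMapping &m : stages)
            j = m.to_dst(j);
        map.push_back(j);
    }
    return map;
}
```

Composing per index keeps each stage independent: a stage only needs to describe its own input-to-output correspondence.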

90
ZRColaWS/dto.hpp Normal file

@ -0,0 +1,90 @@
/*
SPDX-License-Identifier: GPL-3.0-or-later
Copyright © 2022 Amebis
*/
#pragma once
#include <oatpp/core/data/mapping/type/Object.hpp>
#include <oatpp/core/macro/codegen.hpp>
#include <oatpp/core/Types.hpp>
#include OATPP_CODEGEN_BEGIN(DTO)
namespace dto {
class About : public oatpp::DTO
{
DTO_INIT(About, DTO)
DTO_FIELD_INFO(vendor) { info->description = "Application vendor"; }
DTO_FIELD(String, vendor);
DTO_FIELD_INFO(application) { info->description = "Application name"; }
DTO_FIELD(String, application);
DTO_FIELD_INFO(version) { info->description = "Application version"; }
DTO_FIELD(String, version);
};
class TranSet : public oatpp::DTO
{
DTO_INIT(TranSet, DTO)
DTO_FIELD_INFO(set) { info->description = "Translation set ID"; }
DTO_FIELD(UInt16, set);
DTO_FIELD_INFO(src) { info->description = "Input transcript name in English"; }
DTO_FIELD(String, src);
DTO_FIELD_INFO(dst) { info->description = "Output transcript name in English"; }
DTO_FIELD(String, dst);
};
class Language : public oatpp::DTO
{
DTO_INIT(Language, DTO)
DTO_FIELD_INFO(lang) { info->description = "Language ID"; }
DTO_FIELD(String, lang);
DTO_FIELD_INFO(name) { info->description = "Language name in English"; }
DTO_FIELD(String, name);
};
class TranslateIn : public oatpp::DTO
{
DTO_INIT(TranslateIn, DTO)
DTO_FIELD_INFO(transet) {
info->description = "Array of one or more translation set IDs to apply to the text. When inverse translating, the translation sets are applied in the reverse of the order listed in this array. Use /transet to get the IDs of all supported translation sets.";
info->required = true;
}
DTO_FIELD(Vector<UInt16>, transet);
DTO_FIELD_INFO(text) {
info->description = "Text to be translated";
info->required = true;
}
DTO_FIELD(String, text);
DTO_FIELD_INFO(lang) {
info->description = "Language ID of the text to be translated. It is used when inverse translating to skip decomposing language-specific common characters (e.g. č, š, ž in Slovenian, or ä, ö, ü in German). Use /language to get the IDs of all supported languages.";
}
DTO_FIELD(String, lang) = "slv";
};
class TranslateOut : public oatpp::DTO
{
DTO_INIT(TranslateOut, DTO)
DTO_FIELD_INFO(text) { info->description = "Translated text"; }
DTO_FIELD(String, text);
DTO_FIELD_INFO(map) { info->description = "Character index mapping between input and translated text. The map[i] value is the index of the beginning of the character (in the translated text) that was translated from the character (in the input text) beginning at index i. All indexes are measured in UTF-16 code units: input indexes after the input string is converted to UTF-16, and output indexes before the output string is converted from UTF-16."; }
DTO_FIELD(Vector<UInt32>, map);
};
}
#include OATPP_CODEGEN_END(DTO)

75
ZRColaWS/iconverter.hpp Normal file

@ -0,0 +1,75 @@
/*
SPDX-License-Identifier: GPL-3.0-or-later
Copyright © 2022 Amebis
*/
#pragma once
#include <stdex/compat.hpp>
#include <iconv.h>
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>
inline static std::runtime_error errno_error(_In_z_ const char *file, _In_ int line, _In_z_ const char *func)
{
int _errno = errno;
return std::runtime_error(
std::string(file) + ":" + std::to_string(line) + " " +
std::string(func) + " error " + std::to_string(_errno) + ": " +
std::strerror(_errno));
}
template <typename T_from, typename T_to>
class iconverter
{
public:
iconverter(_In_z_ const char* from, _In_z_ const char* to)
{
m_handle = iconv_open(to, from);
if (m_handle == (iconv_t)-1)
throw errno_error(__FILE__, __LINE__, __FUNCTION__);
}
~iconverter()
{
iconv_close(m_handle);
}
std::basic_string<T_to> convert(_In_z_count_(count) const T_from* input, _In_ size_t count) const
{
T_to buf[0x100];
std::basic_string<T_to> result;
size_t inSize = sizeof(T_from) * count;
do {
T_to* output = &buf[0];
size_t outSize = sizeof(buf);
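errno = 0;
iconv(m_handle, (char**)&input, &inSize, (char**)&output, &outSize);
// E2BIG only means buf is full: flush it below and continue with the remaining input.
if (errno && errno != E2BIG)
throw errno_error(__FILE__, __LINE__, __FUNCTION__);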
result.insert(result.end(), buf, (T_to*)((char*)buf + sizeof(buf) - outSize));
} while (inSize);
return result;
}
std::basic_string<T_to> convert(_In_ const std::basic_string<T_from>& input) const
{
return convert(input.c_str(), input.length());
}
protected:
iconv_t m_handle;
};
class utf16toutf8 : public iconverter<char16_t, char>
{
public:
utf16toutf8() : iconverter("UTF-16LE", "UTF-8") {}
};
class utf8toutf16 : public iconverter<char, char16_t>
{
public:
utf8toutf16() : iconverter("UTF-8", "UTF-16LE") {}
};
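
The chunked conversion loop in `iconverter::convert` can be illustrated in isolation. The following is a sketch only, assuming a POSIX `iconv` that knows the encoding names `UTF-8` and `UTF-16LE`; the function name `utf8_to_utf16` and the deliberately tiny buffer are made up for the example. It shows `E2BIG` (output buffer full) being treated as "flush and continue" rather than as a failure.

```cpp
#include <cerrno>
#include <cstddef>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Illustrative helper, not part of iconverter.hpp.
inline std::u16string utf8_to_utf16(const std::string& in)
{
    iconv_t cd = iconv_open("UTF-16LE", "UTF-8"); // argument order is (to, from)
    if (cd == (iconv_t)-1)
        throw std::runtime_error("iconv_open failed");
    std::u16string result;
    char16_t buf[8]; // small on purpose, so longer inputs need several passes
    char* src = const_cast<char*>(in.data());
    std::size_t src_left = in.size();
    while (src_left) {
        char* dst = reinterpret_cast<char*>(buf);
        std::size_t dst_left = sizeof(buf);
        std::size_t r = iconv(cd, &src, &src_left, &dst, &dst_left);
        if (r == (std::size_t)-1 && errno != E2BIG) {
            // Invalid or truncated input: give up.
            iconv_close(cd);
            throw std::runtime_error("iconv failed");
        }
        // Append whatever was produced in this pass.
        result.append(buf, (sizeof(buf) - dst_left) / sizeof(char16_t));
    }
    iconv_close(cd);
    return result;
}
```

Each pass either consumes all remaining input or fills the buffer, so the loop always makes progress as long as the buffer can hold at least one converted character.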

ZRColaWS/stdlogger.hpp Normal file

@ -0,0 +1,27 @@
/*
SPDX-License-Identifier: GPL-3.0-or-later
Copyright © 2022 Amebis
*/
#pragma once
#include <oatpp/core/base/Environment.hpp>
#include <iostream>
#include <mutex>
class StdLogger : public oatpp::base::Logger
{
private:
std::mutex m_lock;
public:
StdLogger() {}
void log(v_uint32 priority, const std::string& tag, const std::string& message) override
{
if (!isLogPriorityEnabled(priority))
return;
std::lock_guard<std::mutex> guard(m_lock);
(priority < oatpp::base::Logger::PRIORITY_W ? std::cout : std::cerr) << tag << ": " << message << std::endl;
}
};

ZRColaWS/zrcolaws.cpp Normal file

@ -0,0 +1,176 @@
/*
SPDX-License-Identifier: GPL-3.0-or-later
Copyright © 2022 Amebis
*/
#include "appcomponent.hpp"
#include "controller.hpp"
#include "stdlogger.hpp"
#include "zrcolaws.hpp"
#include <oatpp-swagger/Controller.hpp>
#include <oatpp/core/base/CommandLineArguments.hpp>
#include <oatpp/network/Server.hpp>
#include <signal.h>
#include <fstream>
#include <iostream>
using namespace std;
using namespace ZRCola;
translation_db t_db;
transet_db ts_db;
// transeq_db tsq_db;
langchar_db lc_db;
language_db lang_db;
// character_db chr_db;
// chrcat_db cc_db;
// chrtag_db ct_db;
// tagname_db tn_db;
// highlight_db h_db;
static void load_database()
{
fstream dat(PREFIX "/share/zrcola/ZRCola.zrcdb", ios_base::in | ios_base::binary);
if (!dat.good())
throw runtime_error(PREFIX "/share/zrcola/ZRCola.zrcdb not found or cannot be opened.");
if (!stdex::idrec::find<recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN>(dat, ZRCOLA_DB_ID, sizeof(recordid_t)))
throw runtime_error(PREFIX "/share/zrcola/ZRCola.zrcdb is not a valid ZRCola database.");
recordsize_t size;
dat.read((char*)&size, sizeof(recordsize_t));
if (dat.good()) {
bool has_translation_data = false;
for (;;) {
recordid_t id;
if (!stdex::idrec::read_id(dat, id, size)) break;
if (id == translation_rec::id()) {
dat >> translation_rec(t_db);
if (dat.good()) {
has_translation_data = true;
} else {
OATPP_LOGE(__FUNCTION__, "Error reading translation data from " PREFIX "/share/zrcola/ZRCola.zrcdb.");
t_db.clear();
}
} else if (id == transet_rec::id()) {
dat >> transet_rec(ts_db);
if (!dat.good()) {
OATPP_LOGE(__FUNCTION__, "Error reading translation set data from " PREFIX "/share/zrcola/ZRCola.zrcdb.");
ts_db.clear();
}
// } else if (id == transeq_rec::id()) {
// dat >> transeq_rec(tsq_db);
// if (!dat.good()) {
// OATPP_LOGE(__FUNCTION__, "Error reading translation sequence data from " PREFIX "/share/zrcola/ZRCola.zrcdb.");
// tsq_db.clear();
// }
} else if (id == langchar_rec::id()) {
dat >> langchar_rec(lc_db);
if (!dat.good()) {
OATPP_LOGE(__FUNCTION__, "Error reading language character data from " PREFIX "/share/zrcola/ZRCola.zrcdb.");
lc_db.clear();
}
} else if (id == language_rec::id()) {
dat >> language_rec(lang_db);
if (!dat.good()) {
OATPP_LOGE(__FUNCTION__, "Error reading language data from " PREFIX "/share/zrcola/ZRCola.zrcdb.");
lang_db.clear();
}
// } else if (id == character_rec::id()) {
// dat >> character_rec(chr_db);
// if (!dat.good()) {
// OATPP_LOGE(__FUNCTION__, "Error reading character data from " PREFIX "/share/zrcola/ZRCola.zrcdb.");
// chr_db.clear();
// }
// } else if (id == chrcat_rec::id()) {
// dat >> chrcat_rec(cc_db);
// if (!dat.good()) {
// OATPP_LOGE(__FUNCTION__, "Error reading character category data from " PREFIX "/share/zrcola/ZRCola.zrcdb.");
// cc_db.clear();
// }
// } else if (id == chrtag_rec::id()) {
// dat >> chrtag_rec(ct_db);
// if (!dat.good()) {
// OATPP_LOGE(__FUNCTION__, "Error reading character tag data from " PREFIX "/share/zrcola/ZRCola.zrcdb.");
// ct_db.clear();
// }
// } else if (id == tagname_rec::id()) {
// dat >> tagname_rec(tn_db);
// if (!dat.good()) {
// OATPP_LOGE(__FUNCTION__, "Error reading tag name data from " PREFIX "/share/zrcola/ZRCola.zrcdb.");
// tn_db.clear();
// }
// } else if (id == highlight_rec::id()) {
// dat >> highlight_rec(h_db);
// if (!dat.good()) {
// OATPP_LOGE(__FUNCTION__, "Error reading highlight data from " PREFIX "/share/zrcola/ZRCola.zrcdb.");
// h_db.clear();
// }
} else
stdex::idrec::ignore<recordsize_t, ZRCOLA_RECORD_ALIGN>(dat);
}
if (!has_translation_data)
throw runtime_error(PREFIX "/share/zrcola/ZRCola.zrcdb has no translation data.");
}
}
static void sig_handler(int s)
{
OATPP_LOGD(__FUNCTION__, "Caught signal %d", s);
OATPP_COMPONENT(std::shared_ptr<oatpp::network::Server>, server);
server->stop();
}
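
`sig_handler` logs and stops the server directly from the signal handler. A more conservative pattern, shown here only as a sketch under the assumption of a POSIX system, is to set a flag in the handler and act on it from normal code. The names `g_stop_requested`, `on_sigint`, `install_sigint_flag_handler` and `stop_requested` are hypothetical and not part of ZRColaWS.

```cpp
#include <signal.h>

// Flag written by the handler and polled by normal code.
static volatile sig_atomic_t g_stop_requested = 0;

// The handler does nothing except record that the signal arrived.
static void on_sigint(int)
{
    g_stop_requested = 1;
}

// Installs the handler for SIGINT; returns true on success.
inline bool install_sigint_flag_handler()
{
    struct sigaction sa = {};
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, nullptr) == 0;
}

inline bool stop_requested()
{
    return g_stop_requested != 0;
}
```

A separate thread or a periodic check would then call the server's stop routine once `stop_requested()` returns true.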
int main(int argc, const char* argv[])
{
auto logger = std::make_shared<StdLogger>();
oatpp::base::Environment::init(logger);
try {
{
oatpp::base::CommandLineArguments cmdArgs(argc, argv);
if (cmdArgs.hasArgument("-?") || cmdArgs.hasArgument("--help")) {
cerr << "ZRColaWS " << PRODUCT_VERSION_STR << " Copyright © 2022 Amebis" << endl;
cerr << endl;
cerr << argv[0] << " [--host <interface name>] [--port <port number>] [-4|-6]" << endl;
return 1;
}
load_database();
struct sigaction sigIntHandler;
sigIntHandler.sa_handler = sig_handler;
sigemptyset(&sigIntHandler.sa_mask);
sigIntHandler.sa_flags = 0;
sigaction(SIGINT, &sigIntHandler, NULL);
oatpp::String host = cmdArgs.getNamedArgumentValue("--host", "localhost");
v_uint16 port = oatpp::utils::conversion::strToInt32(cmdArgs.getNamedArgumentValue("--port", "54591"));
oatpp::network::Address::Family family = oatpp::network::Address::UNSPEC;
if (cmdArgs.hasArgument("-4"))
family = oatpp::network::Address::IP_4;
else if (cmdArgs.hasArgument("-6"))
family = oatpp::network::Address::IP_6;
AppComponent components({host, port, family});
OATPP_COMPONENT(std::shared_ptr<oatpp::web::server::HttpRouter>, router);
OATPP_COMPONENT(std::shared_ptr<oatpp::data::mapping::ObjectMapper>, objectMapper);
auto controller = std::make_shared<Controller>(objectMapper);
router->addController(controller);
auto swaggerController = oatpp::swagger::Controller::createShared(controller->getEndpoints());
router->addController(swaggerController);
OATPP_COMPONENT(std::shared_ptr<oatpp::network::ServerConnectionProvider>, connectionProvider);
OATPP_LOGI(__FUNCTION__, "Server " PRODUCT_VERSION_STR " starting on %s:%s",
connectionProvider->getProperty("host").getData(), connectionProvider->getProperty("port").getData());
OATPP_COMPONENT(std::shared_ptr<oatpp::network::Server>, server);
server->run();
OATPP_LOGI(__FUNCTION__, "Server stopped");
}
oatpp::base::Environment::destroy();
} catch (exception &ex) {
OATPP_LOGE(__FUNCTION__, "%s: %s", typeid(ex).name(), ex.what());
return 1;
}
return 0;
}
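
In `main`, the `--port` value is converted with `strToInt32` and assigned to a 16-bit variable, so an out-of-range value would be silently truncated. A stricter parse could look like the sketch below; `parse_port` is a hypothetical helper and not part of oatpp or ZRColaWS.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Parses a decimal TCP port and rejects anything outside 1..65535,
// as well as empty strings and trailing garbage.
inline uint16_t parse_port(const std::string& text)
{
    errno = 0;
    char* end = nullptr;
    long value = std::strtol(text.c_str(), &end, 10);
    if (errno || end == text.c_str() || *end != '\0' || value < 1 || value > 65535)
        throw std::invalid_argument("invalid port: " + text);
    return static_cast<uint16_t>(value);
}
```

Since `main` already catches `std::exception`, a throwing parser of this kind would surface as a logged error and exit code 1.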

ZRColaWS/zrcolaws.hpp Normal file

@ -0,0 +1,24 @@
/*
SPDX-License-Identifier: GPL-3.0-or-later
Copyright © 2022 Amebis
*/
#pragma once
#include "../include/version.h"
#include <zrcola/idrec.h>
#ifndef PREFIX
#define PREFIX "/usr/local"
#endif
extern ZRCola::translation_db t_db;
extern ZRCola::transet_db ts_db;
// extern ZRCola::transeq_db tsq_db;
extern ZRCola::langchar_db lc_db;
extern ZRCola::language_db lang_db;
// extern ZRCola::character_db chr_db;
// extern ZRCola::chrcat_db cc_db;
// extern ZRCola::chrtag_db ct_db;
// extern ZRCola::tagname_db tn_db;
// extern ZRCola::highlight_db h_db;

include/.gitignore vendored

@ -1,2 +1 @@
/UpdaterKeypair.txt
/UpdaterKeyPrivate.bin

View File

@ -31,6 +31,7 @@
<DisableSpecificWarnings>4100;4505</DisableSpecificWarnings>
<SupportJustMyCode>false</SupportJustMyCode>
<EnablePREfast>true</EnablePREfast>
<LanguageStandard>stdcpp17</LanguageStandard>
</ClCompile>
<Link>
<OptimizeReferences>true</OptimizeReferences>

include/props.mak Normal file

@ -0,0 +1,17 @@
ifeq ($(PREFIX),)
PREFIX := /usr/local
endif
CPPFLAGS := $(CPPFLAGS) -MMD -MP -DPREFIX='"$(PREFIX)"'
ifeq ($(CFG),Debug)
CPPFLAGS := $(CPPFLAGS) -D_DEBUG
CFLAGS := $(CFLAGS) -Og -g
CXXFLAGS := $(CXXFLAGS) -Og -g
else
CPPFLAGS := $(CPPFLAGS) -DNDEBUG
CFLAGS := $(CFLAGS) -O3 -fdata-sections -ffunction-sections
CXXFLAGS := $(CXXFLAGS) -O3 -fdata-sections -ffunction-sections
LDFLAGS := $(LDFLAGS) -Wl,--gc-sections
endif
OBJS := $(SRCS:%=%.o)
DEPS := $(OBJS:.o=.d)

include/targets.mak Normal file

@ -0,0 +1,5 @@
%.h.gch: %.h
$(CXX) $(CPPFLAGS) $(CXXFLAGS) -x c++-header -o $@ -c $<
%.cpp.o: %.cpp
$(CXX) $(CPPFLAGS) $(CXXFLAGS) -o $@ -c $<

View File

@ -9,7 +9,7 @@
// Product version as a single DWORD
// Note: Used for version comparison within C/C++ code.
//
#define PRODUCT_VERSION 0x02060000
#define PRODUCT_VERSION 0x02070100
//
// Product version by components
@ -18,27 +18,27 @@
// separately.
//
#define PRODUCT_VERSION_MAJ 2
#define PRODUCT_VERSION_MIN 6
#define PRODUCT_VERSION_REV 0
#define PRODUCT_VERSION_MIN 7
#define PRODUCT_VERSION_REV 1
#define PRODUCT_VERSION_BUILD 0
//
// Human readable product version and build year for UI
//
#define PRODUCT_VERSION_STR "2.6"
#define PRODUCT_BUILD_YEAR_STR "2022"
#define PRODUCT_VERSION_STR "2.7.1"
#define PRODUCT_BUILD_YEAR_STR "2024"
//
// Numerical version presentation for ProductVersion property in
// MSI packages (syntax: N.N[.N[.N]])
//
#define PRODUCT_VERSION_INST "2.6"
#define PRODUCT_VERSION_INST "2.7.1"
//
// The product code for ProductCode property in MSI packages
// Replace with new on every version change, regardless how minor it is.
//
#define PRODUCT_VERSION_GUID "{B7743708-2694-4BA7-8FC4-0797C071C4F8}"
#define PRODUCT_VERSION_GUID "{9423BEC3-3159-4130-8C3E-48D5DE24D48A}"
//
// The product vendor and application name for configuration keeping.

@ -1 +1 @@
Subproject commit 328646b2d9d7100afe9d2d0a25e2c656241bb25b
Subproject commit 6dead076a2e48e7f561c6e72e027c48ffcbb07be

View File

@ -0,0 +1,33 @@
CPPFLAGS := $(CPPFLAGS) -I../../stdex/include
SRCS := \
../src/character.cpp \
../src/common.cpp \
../src/highlight.cpp \
../src/language.cpp \
../src/mapping.cpp \
../src/pch.cpp \
../src/tag.cpp \
../src/translate.cpp
include ../../../include/props.mak
.PHONY: all
all: ../lib/libZRCola.a
../lib/libZRCola.a: ../src/pch.h.gch $(OBJS)
$(AR) $(ARFLAGS) $@ $(OBJS)
.PHONY: test
test: ../test/test
../test/test
../test/test: ../lib/libZRCola.a
$(CXX) $(CPPFLAGS) -I../../stdex/include -I../include $(CXXFLAGS) -L../lib -o $@ ../test/test.cpp -lstdc++ -lZRCola
.PHONY: clean
clean:
-rm -r ../src/*.d ../src/*.gch ../src/*.o ../lib/libZRCola.a ../test/*.d ../test/test
include ../../../include/targets.mak
-include $(DEPS)
-include ../test/test.d

View File

@ -66,7 +66,6 @@
<ClCompile Include="..\src\common.cpp" />
<ClCompile Include="..\src\highlight.cpp" />
<ClCompile Include="..\src\language.cpp" />
<ClCompile Include="..\src\mapping.cpp" />
<ClCompile Include="..\src\pch.cpp">
<PrecompiledHeader>Create</PrecompiledHeader>
</ClCompile>

View File

@ -14,9 +14,6 @@
<ClCompile Include="..\src\pch.cpp">
<Filter>Source Files</Filter>
</ClCompile>
<ClCompile Include="..\src\mapping.cpp">
<Filter>Source Files</Filter>
</ClCompile>
<ClCompile Include="..\src\translate.cpp">
<Filter>Source Files</Filter>
</ClCompile>

View File

@ -7,15 +7,15 @@
#include "common.h"
#include <stdex/idrec>
#include <assert.h>
#include <algorithm>
#include <istream>
#include <locale>
#include <map>
#include <ostream>
#include <vector>
#include <set>
#include <string>
#include <vector>
#pragma warning(push)
#pragma warning(disable: 4200)
@ -29,11 +29,16 @@ namespace ZRCola {
///
typedef double charrank_t;
inline bool ispua(_In_ wchar_t c)
inline bool ispua(_In_ char_t c)
{
return L'\ue000' <= c && c <= L'\uf8ff';
return u'\ue000' <= c && c <= u'\uf8ff';
}
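
The switch from `L'\ue000'` to `u'\ue000'` matters because `wchar_t` is typically 32 bits wide outside Windows, while the database stores UTF-16 code units. A small sketch of the same range check on `char16_t` follows; `is_pua` and `count_pua` are illustrative names and not library functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// True if c lies in the Basic Multilingual Plane Private Use Area (U+E000..U+F8FF).
inline bool is_pua(char16_t c)
{
    return u'\ue000' <= c && c <= u'\uf8ff';
}

// Counts the Private Use Area code units in a UTF-16 string.
inline std::size_t count_pua(const std::u16string& s)
{
    return static_cast<std::size_t>(std::count_if(s.begin(), s.end(), is_pua));
}
```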
#ifndef _WIN32
size_t wcslen(_In_z_ const char_t* str);
size_t wcsnlen(_In_z_count_(count) const char_t* str, _In_ size_t count);
#endif
#pragma pack(push)
#pragma pack(2)
///
@ -177,13 +182,13 @@ namespace ZRCola {
///
struct character {
public:
chrcatid_t cat; ///> Character category ID
chrcatid_t cat; ///> Character category ID
protected:
unsigned __int16 chr_to; ///< Character end in \c data
unsigned __int16 desc_to; ///< Character description end in \c data
unsigned __int16 rel_to; ///< Related characters end in \c data
wchar_t data[]; ///< Character, character description
uint16_t chr_to; ///< Character end in \c data
uint16_t desc_to; ///< Character description end in \c data
uint16_t rel_to; ///< Related characters end in \c data
char_t data[]; ///< Character, character description
private:
inline character(_In_ const character &other);
@ -202,47 +207,47 @@ namespace ZRCola {
/// \param[in] rel_len Number of UTF-16 characters in \p rel (including zero delimiters)
///
inline character(
_In_opt_z_count_(chr_len) const wchar_t *chr = NULL,
_In_opt_z_count_(chr_len) const char_t *chr = NULL,
_In_opt_ size_t chr_len = 0,
_In_opt_ chrcatid_t cat = chrcatid_t::blank,
_In_opt_z_count_(desc_len) const wchar_t *desc = NULL,
_In_opt_z_count_(desc_len) const char_t *desc = NULL,
_In_opt_ size_t desc_len = 0,
_In_opt_z_count_(rel_len) const wchar_t *rel = NULL,
_In_opt_z_count_(rel_len) const char_t *rel = NULL,
_In_opt_ size_t rel_len = 0)
{
this->cat = cat;
this->chr_to = static_cast<unsigned __int16>(chr_len);
if (chr && chr_len) memcpy(this->data, chr, sizeof(wchar_t)*chr_len);
this->desc_to = static_cast<unsigned __int16>(this->chr_to + desc_len);
if (desc && desc_len) memcpy(this->data + this->chr_to, desc, sizeof(wchar_t)*desc_len);
this->rel_to = static_cast<unsigned __int16>(this->desc_to + rel_len);
if (rel && rel_len) memcpy(this->data + this->desc_to, rel, sizeof(wchar_t)*rel_len);
this->chr_to = static_cast<uint16_t>(chr_len);
if (chr && chr_len) memcpy(this->data, chr, sizeof(char_t)*chr_len);
this->desc_to = static_cast<uint16_t>(this->chr_to + desc_len);
if (desc && desc_len) memcpy(this->data + this->chr_to, desc, sizeof(char_t)*desc_len);
this->rel_to = static_cast<uint16_t>(this->desc_to + rel_len);
if (rel && rel_len) memcpy(this->data + this->desc_to, rel, sizeof(char_t)*rel_len);
}
inline const wchar_t* chr () const { return data; };
inline wchar_t* chr () { return data; };
inline const wchar_t* chr_end() const { return data + chr_to; };
inline wchar_t* chr_end() { return data + chr_to; };
inline unsigned __int16 chr_len() const { return chr_to; };
inline const char_t* chr () const { return data; };
inline char_t* chr () { return data; };
inline const char_t* chr_end() const { return data + chr_to; };
inline char_t* chr_end() { return data + chr_to; };
inline uint16_t chr_len() const { return chr_to; };
inline const wchar_t* desc () const { return data + chr_to; };
inline wchar_t* desc () { return data + chr_to; };
inline const wchar_t* desc_end() const { return data + desc_to; };
inline wchar_t* desc_end() { return data + desc_to; };
inline unsigned __int16 desc_len() const { return desc_to - chr_to; };
inline const char_t* desc () const { return data + chr_to; };
inline char_t* desc () { return data + chr_to; };
inline const char_t* desc_end() const { return data + desc_to; };
inline char_t* desc_end() { return data + desc_to; };
inline uint16_t desc_len() const { return desc_to - chr_to; };
inline const wchar_t* rel () const { return data + desc_to; };
inline wchar_t* rel () { return data + desc_to; };
inline const wchar_t* rel_end() const { return data + rel_to; };
inline wchar_t* rel_end() { return data + rel_to; };
inline unsigned __int16 rel_len() const { return rel_to - desc_to; };
inline const char_t* rel () const { return data + desc_to; };
inline char_t* rel () { return data + desc_to; };
inline const char_t* rel_end() const { return data + rel_to; };
inline char_t* rel_end() { return data + rel_to; };
inline uint16_t rel_len() const { return rel_to - desc_to; };
};
#pragma pack(pop)
///
/// Character index
///
class indexChr : public index<unsigned __int16, unsigned __int32, character>
class indexChr : public index<uint16_t, uint32_t, character>
{
public:
///
@ -250,7 +255,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexChr(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, character>(h) {}
indexChr(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, character>(h) {}
///
/// Compares two characters by ID (for searching)
@ -272,9 +277,9 @@ namespace ZRCola {
}
} idxChr; ///< Character index
textindex<wchar_t, wchar_t, unsigned __int32> idxDsc; ///< Description index
textindex<wchar_t, wchar_t, unsigned __int32> idxDscSub; ///< Description index (sub-terms)
std::vector<unsigned __int16> data; ///< Character data
textindex<char_t, char_t, uint32_t> idxDsc; ///< Description index
textindex<char_t, char_t, uint32_t> idxDscSub; ///< Description index (sub-terms)
std::vector<uint16_t> data; ///< Character data
public:
///
@ -303,7 +308,7 @@ namespace ZRCola {
/// \param[in ] fn_abort Pointer to function to periodically test for search cancellation
/// \param[in ] cookie Cookie for \p fn_abort call
///
bool Search(_In_z_ const wchar_t *str, _In_ const std::set<chrcatid_t> &cats, _Inout_ std::map<std::wstring, charrank_t> &hits, _Inout_ std::map<std::wstring, charrank_t> &hits_sub, _In_opt_ bool (__cdecl *fn_abort)(void *cookie) = NULL, _In_opt_ void *cookie = NULL) const;
bool Search(_In_z_ const char_t *str, _In_ const std::set<chrcatid_t> &cats, _Inout_ std::map<string_t, charrank_t> &hits, _Inout_ std::map<string_t, charrank_t> &hits_sub, _In_opt_ bool (__cdecl *fn_abort)(void *cookie) = NULL, _In_opt_ void *cookie = NULL) const;
///
/// Get character category
@ -315,20 +320,99 @@ namespace ZRCola {
/// - Character category if character found
/// - `ZRCola::chrcatid_t::blank` otherwise
///
inline chrcatid_t GetCharCat(_In_z_count_(len) const wchar_t *chr, _In_ const size_t len) const
inline chrcatid_t GetCharCat(_In_z_count_(len) const char_t *chr, _In_ const size_t len) const
{
assert(len <= 0xffff);
std::unique_ptr<character> c((character*)new char[sizeof(character) + sizeof(wchar_t)*len]);
c->character::character(chr, len);
std::unique_ptr<character> c((character*)new char[sizeof(character) + sizeof(char_t)*len]);
new (c.get()) character(chr, len);
indexChr::size_type start;
return idxChr.find(*c, start) ? idxChr[start].cat : chrcatid_t::blank;
}
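
The change above replaces an explicit constructor call (`c->character::character(...)`, accepted by MSVC) with placement new, which is the portable way to construct an object in memory that was allocated separately. The sketch below shows the same technique on a made-up type with trailing data. `packed_text`, `raw_deleter` and `make_packed_text` are hypothetical; unlike the code in the diff, the sketch pairs `::operator new` with a custom deleter so that allocation and deallocation match.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>

// Header followed directly in memory by len UTF-16 code units.
struct packed_text {
    uint16_t len;
    explicit packed_text(uint16_t n) : len(n) {}
    char16_t* data() { return reinterpret_cast<char16_t*>(this + 1); }
    const char16_t* data() const { return reinterpret_cast<const char16_t*>(this + 1); }
};

// Destroys the object and releases the raw block it was constructed in.
struct raw_deleter {
    void operator()(packed_text* p) const
    {
        p->~packed_text();
        ::operator delete(p);
    }
};

inline std::unique_ptr<packed_text, raw_deleter> make_packed_text(const char16_t* s, uint16_t n)
{
    // One allocation for header plus trailing characters.
    void* mem = ::operator new(sizeof(packed_text) + sizeof(char16_t) * n);
    packed_text* p = new (mem) packed_text(n); // placement new: construct in mem
    if (n) std::memcpy(p->data(), s, sizeof(char16_t) * n);
    return std::unique_ptr<packed_text, raw_deleter>(p);
}
```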
///
/// Writes character database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Character database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const character_db &db)
{
// Write character index.
if (stream.fail()) return stream;
stream << db.idxChr;
// Write description index.
if (!stream.good()) return stream;
stream << db.idxDsc;
// Write sub-term description index.
if (!stream.good()) return stream;
stream << db.idxDscSub;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads character database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Character database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ character_db& db)
{
// Read character index.
stream >> db.idxChr;
if (!stream.good()) return stream;
// Read description index.
stream >> db.idxDsc;
if (!stream.good()) return stream;
// Read sub-term description index.
stream >> db.idxDscSub;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<character_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> character_rec;
///
/// Character category database
///
@ -341,12 +425,12 @@ namespace ZRCola {
///
struct chrcat {
public:
chrcatid_t cat; ///< Character category ID
unsigned __int16 rank; ///< Character category rank
chrcatid_t cat; ///< Character category ID
uint16_t rank; ///< Character category rank
protected:
unsigned __int16 name_to; ///< Character category name end in \c data
wchar_t data[]; ///< Character category name
uint16_t name_to; ///< Character category name end in \c data
char_t data[]; ///< Character category name
private:
inline chrcat(_In_ const chrcat &other);
@ -362,29 +446,29 @@ namespace ZRCola {
/// \param[in] name_len Number of UTF-16 characters in \p name
///
inline chrcat(
_In_opt_ chrcatid_t cat = chrcatid_t::blank,
_In_opt_ unsigned __int16 rank = 0,
_In_opt_z_count_(name_len) const wchar_t *name = NULL,
_In_opt_ size_t name_len = 0)
_In_opt_ chrcatid_t cat = chrcatid_t::blank,
_In_opt_ uint16_t rank = 0,
_In_opt_z_count_(name_len) const char_t *name = NULL,
_In_opt_ size_t name_len = 0)
{
this->cat = cat;
this->rank = rank;
this->name_to = static_cast<unsigned __int16>(name_len);
if (name && name_len) memcpy(this->data, name, sizeof(wchar_t)*name_len);
this->name_to = static_cast<uint16_t>(name_len);
if (name && name_len) memcpy(this->data, name, sizeof(char_t)*name_len);
}
inline const wchar_t* name () const { return data; };
inline wchar_t* name () { return data; };
inline const wchar_t* name_end() const { return data + name_to; };
inline wchar_t* name_end() { return data + name_to; };
inline unsigned __int16 name_len() const { return name_to; };
inline const char_t* name () const { return data; };
inline char_t* name () { return data; };
inline const char_t* name_end() const { return data + name_to; };
inline char_t* name_end() { return data + name_to; };
inline uint16_t name_len() const { return name_to; };
};
#pragma pack(pop)
///
/// Character category index
///
class indexChrCat : public index<unsigned __int16, unsigned __int32, chrcat>
class indexChrCat : public index<uint16_t, uint32_t, chrcat>
{
public:
///
@ -392,7 +476,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexChrCat(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, chrcat>(h) {}
indexChrCat(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, chrcat>(h) {}
///
/// Compares two character categories by ID (for searching)
@ -407,8 +491,8 @@ namespace ZRCola {
///
virtual int compare(_In_ const chrcat &a, _In_ const chrcat &b) const
{
if (a.cat < b.cat) return -1;
else if (a.cat > b.cat) return 1;
if (a.cat < b.cat) return -1;
if (a.cat > b.cat) return 1;
return 0;
}
@ -417,7 +501,7 @@ namespace ZRCola {
///
/// Rank index
///
class indexRank : public index<unsigned __int16, unsigned __int32, chrcat>
class indexRank : public index<uint16_t, uint32_t, chrcat>
{
public:
///
@ -425,7 +509,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexRank(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, chrcat>(h) {}
indexRank(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, chrcat>(h) {}
///
/// Compares two character categories by rank (for searching)
@ -440,8 +524,8 @@ namespace ZRCola {
///
virtual int compare(_In_ const chrcat &a, _In_ const chrcat &b) const
{
if (a.rank < b.rank) return -1;
else if (a.rank > b.rank) return +1;
if (a.rank < b.rank) return -1;
if (a.rank > b.rank) return +1;
return 0;
}
@ -462,19 +546,12 @@ namespace ZRCola {
if (a.rank < b.rank) return -1;
else if (a.rank > b.rank) return +1;
unsigned __int16
a_name_len = a.name_len(),
b_name_len = b.name_len();
int r = _wcsncoll(a.name(), b.name(), std::min<unsigned __int16>(a_name_len, b_name_len));
if (r != 0) return r;
if (a_name_len < b_name_len) return -1;
else if (a_name_len > b_name_len) return +1;
return 0;
auto &coll = std::use_facet<std::collate<char_t>>(std::locale());
return coll.compare(a.name(), a.name_end(), b.name(), b.name_end());
}
} idxRank; ///< Rank index
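
The sort comparison now uses the `std::collate` facet instead of `_wcsncoll`. Its `compare` member takes two character ranges and returns -1, 0 or 1, so no separate length comparison is needed. The sketch below uses `wchar_t` and the classic locale; this is an assumption made for the example, because the C++ standard only requires `std::collate<char>` and `std::collate<wchar_t>`, and whether a `char16_t` facet is available depends on the standard library. `collate_compare` is a hypothetical name.

```cpp
#include <locale>
#include <string>

// Compares two wide strings using the collation rules of loc.
inline int collate_compare(const std::wstring& a, const std::wstring& b,
    const std::locale& loc = std::locale::classic())
{
    const std::collate<wchar_t>& coll = std::use_facet<std::collate<wchar_t>>(loc);
    return coll.compare(a.data(), a.data() + a.size(), b.data(), b.data() + b.size());
}
```

Passing a named locale instead of the classic one would give language-aware ordering, provided that locale is installed on the system.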
std::vector<unsigned __int16> data; ///< Character category data
std::vector<uint16_t> data; ///< Character category data
public:
///
@ -491,168 +568,81 @@ namespace ZRCola {
idxRank .clear();
data .clear();
}
///
/// Writes character category database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Character category database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const chrcat_db& db)
{
// Write character category index.
if (stream.fail()) return stream;
stream << db.idxChrCat;
// Write rank index.
if (stream.fail()) return stream;
stream << db.idxRank;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads character category database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Character category database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ chrcat_db& db)
{
// Read character category index.
stream >> db.idxChrCat;
if (!stream.good()) return stream;
// Read rank index.
stream >> db.idxRank;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<chrcat_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> chrcat_rec;
};
const ZRCola::recordid_t ZRCola::character_rec::id = *(ZRCola::recordid_t*)"CHR";
const ZRCola::recordid_t ZRCola::chrcat_rec ::id = *(ZRCola::recordid_t*)"CCT";
///
/// Reads character database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Character database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::character_db &db)
{
// Read character index.
stream >> db.idxChr;
if (!stream.good()) return stream;
// Read description index.
stream >> db.idxDsc;
if (!stream.good()) return stream;
// Read sub-term description index.
stream >> db.idxDscSub;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
///
/// Writes character database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Character database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::character_db &db)
{
// Write character index.
if (stream.fail()) return stream;
stream << db.idxChr;
// Write description index.
if (!stream.good()) return stream;
stream << db.idxDsc;
// Write sub-term description index.
if (!stream.good()) return stream;
stream << db.idxDscSub;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Writes character category database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Character category database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::chrcat_db &db)
{
// Write character category index.
if (stream.fail()) return stream;
stream << db.idxChrCat;
// Write rank index.
if (stream.fail()) return stream;
stream << db.idxRank;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads character category database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Character category database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::chrcat_db &db)
{
// Read character category index.
stream >> db.idxChrCat;
if (!stream.good()) return stream;
// Read rank index.
stream >> db.idxRank;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
#pragma warning(pop)

View File

@ -9,8 +9,14 @@
#define _WINSOCKAPI_ // Prevent inclusion of winsock.h in windows.h.
#include <Windows.h>
#endif
#include <sal.h>
#include <stdex/compat.hpp>
#include <stdex/mapping.hpp>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#include <istream>
#include <memory>
#include <ostream>
#include <utility>
#include <vector>
@ -29,12 +35,39 @@
///
/// Database IDs
///
#define ZRCOLA_DB_ID (*(ZRCola::recordid_t*)"ZRC")
#define ZRCOLA_DB_ID 0x43525a // "ZRC"
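
The literal `0x43525a` is the three-character tag "ZRC" plus its terminating zero, read as a little-endian 32-bit integer; spelling it out avoids the unaligned, type-punned read in the old macro. The arithmetic can be checked with the sketch below, where `make_record_id` is a hypothetical helper and not part of the ZRCola headers.

```cpp
#include <cstdint>

// Packs a 3-character tag (plus its terminating zero) into a 32-bit value,
// first character in the least significant byte.
constexpr uint32_t make_record_id(const char (&tag)[4])
{
    return static_cast<uint32_t>(static_cast<unsigned char>(tag[0]))
        | (static_cast<uint32_t>(static_cast<unsigned char>(tag[1])) << 8)
        | (static_cast<uint32_t>(static_cast<unsigned char>(tag[2])) << 16)
        | (static_cast<uint32_t>(static_cast<unsigned char>(tag[3])) << 24);
}

// 'Z' = 0x5a, 'R' = 0x52, 'C' = 0x43
static_assert(make_record_id("ZRC") == 0x43525a, "ZRC must pack to 0x43525a");
```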
#ifdef __GNUC__
#ifdef __i386__
#define __cdecl __attribute__((__cdecl__))
#else
#define __cdecl
#endif
#endif
namespace ZRCola {
typedef unsigned __int32 recordid_t;
typedef unsigned __int32 recordsize_t;
typedef uint32_t recordid_t;
typedef uint32_t recordsize_t;
///
/// ZRCola database character type
///
#ifdef _WIN32
typedef wchar_t char_t;
#else
typedef char16_t char_t;
#endif
///
/// ZRCola database string type
///
#ifdef _WIN32
typedef std::wstring string_t;
#else
typedef std::u16string string_t;
#endif
#pragma pack(push)
@ -60,7 +93,7 @@ namespace ZRCola {
struct langid_t {
char data[4];
inline langid_t& operator=(const langid_t &src)
inline langid_t& operator=(_In_ const langid_t &src)
{
data[0] = src.data[0];
data[1] = src.data[1];
@ -69,7 +102,7 @@ namespace ZRCola {
return *this;
}
inline langid_t& operator=(const char *src)
inline langid_t& operator=(_In_z_ const char *src)
{
data[3] = (
data[2] = (
@ -210,9 +243,14 @@ namespace ZRCola {
///
/// Memory index
///
template <class T_data, class T_idx = unsigned __int32, class T_el = T_data>
template <class T_data, class T_idx = uint32_t, class T_el = T_data>
class index : public std::vector<T_idx>
{
typedef std::vector<T_idx> base_t;
public:
typedef size_t size_type;
protected:
std::vector<T_data> &host; ///< Reference to host data
@ -234,7 +272,7 @@ namespace ZRCola {
///
inline const T_el& at(size_type pos) const
{
return *reinterpret_cast<const T_el*>(&host[std::vector<T_idx>::at(pos)]);
return *reinterpret_cast<const T_el*>(&host[base_t::at(pos)]);
}
@ -247,7 +285,7 @@ namespace ZRCola {
///
inline T_el& at(size_type pos)
{
return *reinterpret_cast<T_el*>(&host[std::vector<T_idx>::at(pos)]);
return *reinterpret_cast<T_el*>(&host[base_t::at(pos)]);
}
@ -260,7 +298,7 @@ namespace ZRCola {
///
inline const T_el& operator[](size_type pos) const
{
return *reinterpret_cast<const T_el*>(&host[std::vector<T_idx>::operator[](pos)]);
return *reinterpret_cast<const T_el*>(&host[base_t::operator[](pos)]);
}
@ -273,7 +311,7 @@ namespace ZRCola {
///
inline T_el& operator[](size_type pos)
{
return *reinterpret_cast<T_el*>(&host[std::vector<T_idx>::operator[](pos)]);
return *reinterpret_cast<T_el*>(&host[base_t::operator[](pos)]);
}
@ -282,7 +320,7 @@ namespace ZRCola {
///
inline void sort()
{
qsort_s(data(), size(), sizeof(T_idx), compare_s, this);
qsort_s(base_t::data(), base_t::size(), sizeof(T_idx), compare_s, this);
}
@ -333,21 +371,21 @@ namespace ZRCola {
bool find(_In_ const T_el &el, _Out_ size_type &start, _Out_ size_type &end) const
{
// Start with the full search area.
for (start = 0, end = size(); start < end; ) {
size_type m = (start + end) / 2;
for (start = 0, end = base_t::size(); start < end; ) {
auto m = (start + end) / 2;
int r = compare(el, at(m));
if (r < 0) end = m;
else if (r > 0) start = m + 1;
else {
// Narrow the search area on the left to start at the first element in the run.
for (size_type end2 = m; start < end2;) {
size_type m2 = (start + end2) / 2;
for (auto end2 = m; start < end2;) {
auto m2 = (start + end2) / 2;
if (compare(el, at(m2)) <= 0) end2 = m2; else start = m2 + 1;
}
// Narrow the search area on the right to end at the first element not in the run.
for (size_type start2 = m + 1; start2 < end;) {
size_type m2 = (start2 + end) / 2;
for (auto start2 = m + 1; start2 < end;) {
auto m2 = (start2 + end) / 2;
if (0 <= compare(el, at(m2))) start2 = m2 + 1; else end = m2;
}
@ -372,14 +410,14 @@ namespace ZRCola {
{
// Start with the full search area.
size_t end;
for (start = 0, end = size(); start < end; ) {
size_type m = (start + end) / 2;
for (start = 0, end = base_t::size(); start < end; ) {
auto m = (start + end) / 2;
int r = compare(el, at(m));
if (r < 0) end = m;
else if (r > 0) start = m + 1;
else {
// Narrow the search area on the left to start at the first element in the run.
for (size_type end2 = m; start < end2;) {
for (auto end2 = m; start < end2;) {
m = (start + end2) / 2;
if (compare(el, at(m)) <= 0) end2 = m; else start = m + 1;
}
@ -391,6 +429,68 @@ namespace ZRCola {
return false;
}
///
/// Writes index to a stream
///
/// \param[in] stream Output stream
/// \param[in] idx Index
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const index& idx)
{
// Write index count.
auto idx_count = idx.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (idx_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)idx_count;
stream.write((const char*)&count, sizeof(count));
// Write index data.
if (stream.fail()) return stream;
stream.write((const char*)idx.data(), sizeof(T_idx) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads index from a stream
///
/// \param[in] stream Input stream
/// \param[out] idx Index
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ index& idx)
{
uint32_t count;
// Read index count.
stream.read((char*)&count, sizeof(count));
if (!stream.good()) {
idx.clear();
return stream;
}
if (count) {
// Read index data.
idx.resize(count);
stream.read((char*)idx.data(), sizeof(T_idx) * static_cast<std::streamsize>(count));
}
else
idx.clear();
return stream;
}
private:
static int __cdecl compare_s(void *p, const void *a, const void *b)
{
@ -406,11 +506,13 @@ namespace ZRCola {
///
/// Memory text index
///
template <class T_key, class T_val, class T_idx = unsigned __int32>
template <class T_key, class T_val, class T_idx = uint32_t>
class textindex : public std::vector< mappair_t<T_idx> >
{
public:
typedef std::vector< mappair_t<T_idx> > base_t;
public:
typedef size_t size_type;
std::vector<T_key> keys; ///< Key data
std::vector<T_val> values; ///< Index values
@ -446,15 +548,15 @@ namespace ZRCola {
///
_Success_(return) bool find(_In_count_(key_len) const T_key *key, _In_ size_t key_len, _Out_ const T_val **val, _Out_ size_t *val_len) const
{
for (size_type start = 0, end = size(); start < end; ) {
size_type m = (start + end) / 2;
for (size_type start = 0, end = base_t::size(); start < end; ) {
auto m = (start + end) / 2;
int r = compare(key, key_len, m);
if (r < 0) end = m;
else if (r > 0) start = m + 1;
else {
// Get values at position m.
start = base_t::at(m ).idx_val;
*val_len = (m < size() ? base_t::at(m + 1).idx_val : values.size()) - start;
start = base_t::at(m ).idx_val;
*val_len = (m + 1 < base_t::size() ? base_t::at(m + 1).idx_val : values.size()) - start;
*val = &values.at(start);
return true;
}
@ -463,15 +565,145 @@ namespace ZRCola {
return false;
}
///
/// Writes text index to a stream
///
/// \param[in] stream Output stream
/// \param[in] idx Text index
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const textindex& idx)
{
uint32_t count;
// Write index count.
auto idx_count = idx.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (idx_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
count = (uint32_t)idx_count;
stream.write((const char*)&count, sizeof(count));
// Write index data.
if (stream.fail()) return stream;
auto idx_data = idx.data();
stream.write((const char*)idx_data, sizeof(*idx_data) * static_cast<std::streamsize>(count));
// Write key count.
auto key_count = idx.keys.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (key_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
count = (uint32_t)key_count;
stream.write((const char*)&count, sizeof(count));
// Write key data.
if (stream.fail()) return stream;
auto idx_keys_data = idx.keys.data();
stream.write((const char*)idx_keys_data, sizeof(*idx_keys_data) * static_cast<std::streamsize>(count));
// Write value count.
auto value_count = idx.values.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (value_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
count = (uint32_t)value_count;
stream.write((const char*)&count, sizeof(count));
// Write value data.
if (stream.fail()) return stream;
auto idx_values_data = idx.values.data();
stream.write((const char*)idx_values_data, sizeof(*idx_values_data) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads text index from a stream
///
/// \param[in] stream Input stream
/// \param[out] idx Text index
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ textindex& idx)
{
uint32_t count;
// Read text index count.
stream.read((char*)&count, sizeof(count));
if (!stream.good()) {
idx.clear();
return stream;
}
if (count) {
// Read text index.
idx.resize(count);
auto p = idx.data();
stream.read((char*)p, sizeof(*p) * static_cast<std::streamsize>(count));
if (!stream.good()) return stream;
}
else
idx.clear();
// Read keys count.
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read keys.
idx.keys.resize(count);
auto p = idx.keys.data();
stream.read((char*)p, sizeof(*p) * static_cast<std::streamsize>(count));
if (!stream.good()) return stream;
}
else
idx.keys.clear();
// Read value count.
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read values.
idx.values.resize(count);
auto p = idx.values.data();
stream.read((char*)p, sizeof(*p) * static_cast<std::streamsize>(count));
}
else
idx.values.clear();
return stream;
}
protected:
inline int compare(_In_count_(key_len) const T_key *key, _In_ size_t key_len, size_type pos) const
{
// Get key at position pos.
size_type pos_next = pos + 1;
auto pos_next = pos + 1;
size_t
start = base_t::at(pos ).idx_key,
key2_len = (pos_next < size() ? base_t::at(pos_next).idx_key : keys.size()) - start;
std::vector<T_key>::const_pointer key2 = &keys.at(start);
start = base_t::at(pos ).idx_key,
key2_len = (pos_next < base_t::size() ? base_t::at(pos_next).idx_key : keys.size()) - start;
auto key2 = &keys.at(start);
// Compare keys.
int r = memcmp(key, key2, sizeof(T_key)*std::min<size_t>(key_len, key2_len));
@ -483,57 +715,15 @@ namespace ZRCola {
}
};
///
/// Source-destination index transformation mapping
///
class __declspec(novtable) mapping {
public:
size_t src; ///< Character index in source string
size_t dst; ///< Character index in destination string
inline mapping() : src(0), dst(0) {};
inline mapping(_In_ size_t s, _In_ size_t d) : src(s), dst(d) {}
///
/// Reverses source and destination indexes
///
inline void invert() { size_t tmp = src; src = dst; dst = tmp; }
};
using mapping = stdex::mapping<size_t>;
///
/// A vector for destination-source index transformation mapping
///
class mapping_vector : public std::vector<mapping> {
public:
///
/// Transforms character index of destination to source
///
/// \param[in] decmp Character index in destination string
///
/// \returns Character index in source string
///
size_t to_src(_In_ size_t dst) const;
///
/// Transforms source index to destination index
///
/// \param[in] cmp Character index in source string
///
/// \returns Character index in destination string
///
size_t to_dst(_In_ size_t src) const;
///
/// Reverses source and destination indexes
///
inline void invert()
{
for (iterator i = begin(), iEnd = end(); i != iEnd; ++i)
i->invert();
}
};
using mapping_vector = std::vector<mapping>;
///
/// Binary compares two strings
@ -552,16 +742,7 @@ namespace ZRCola {
/// The function does not treat \\0 characters as terminators for performance reasons.
/// Therefore \p count_a and \p count_b must represent exact string lengths.
///
inline int CompareString(_In_ const wchar_t *str_a, _In_ size_t count_a, _In_ const wchar_t *str_b, _In_ size_t count_b)
{
for (size_t i = 0; ; i++) {
if (i >= count_a && i >= count_b) return 0;
else if (i >= count_a && i < count_b) return -1;
else if (i < count_a && i >= count_b) return +1;
else if (str_a[i] < str_b[i]) return -1;
else if (str_a[i] > str_b[i]) return +1;
}
}
int CompareString(_In_ const char_t* str_a, _In_ size_t count_a, _In_ const char_t* str_b, _In_ size_t count_b);
///
/// Generates and returns Unicode representation of the string using hexadecimal codes.
@ -570,21 +751,7 @@ namespace ZRCola {
/// \param[in] count Number of characters in string \p str
/// \param[in] sep Separator
///
inline std::string GetUnicodeDumpA(_In_ const wchar_t *str, _In_ size_t count, _In_z_ const char *sep = "+")
{
std::string out;
size_t dump_len_max = strlen(sep) + 4 + 1;
char *dump;
std::unique_ptr<char> dump_obj(dump = new char[dump_len_max]);
if (count && str[0]) {
size_t i = 0;
out.insert(out.end(), dump, dump + _snprintf(dump, dump_len_max, "%04X", str[i++]));
while (i < count && str[i])
out.insert(out.end(), dump, dump + _snprintf(dump, dump_len_max, "%s%04X", sep, str[i++]));
}
return out;
}
std::string GetUnicodeDumpA(_In_z_count_(count) const char_t* str, _In_ size_t count, _In_z_ const char* sep = "+");
///
/// Generates and returns Unicode representation of the string using hexadecimal codes.
@ -593,21 +760,7 @@ namespace ZRCola {
/// \param[in] count Number of characters in string \p str
/// \param[in] sep Separator
///
inline std::wstring GetUnicodeDumpW(_In_ const wchar_t *str, _In_ size_t count, _In_z_ const wchar_t *sep = L"+")
{
std::wstring out;
size_t dump_len_max = wcslen(sep) + 4 + 1;
wchar_t *dump;
std::unique_ptr<wchar_t> dump_obj(dump = new wchar_t[dump_len_max]);
if (count && str[0]) {
size_t i = 0;
out.insert(out.end(), dump, dump + _snwprintf(dump, dump_len_max, L"%04X", str[i++]));
while (i < count && str[i])
out.insert(out.end(), dump, dump + _snwprintf(dump, dump_len_max, L"%s%04X", sep, str[i++]));
}
return out;
}
std::wstring GetUnicodeDumpW(_In_z_count_(count) const char_t* str, _In_ size_t count, _In_z_ const wchar_t* sep = L"+");
#ifdef _UNICODE
#define GetUnicodeDump GetUnicodeDumpW
@ -616,190 +769,4 @@ namespace ZRCola {
#endif
};
///
/// Writes index to a stream
///
/// \param[in] stream Output stream
/// \param[in] idx Index
///
/// \returns The stream \p stream
///
template <class T_data, class T_idx, class T_el>
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::index<T_data, T_idx, T_el> &idx)
{
// Write index count.
auto idx_count = idx.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (idx_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)idx_count;
stream.write((const char*)&count, sizeof(count));
// Write index data.
if (stream.fail()) return stream;
stream.write((const char*)idx.data(), sizeof(T_idx)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads index from a stream
///
/// \param[in] stream Input stream
/// \param[out] idx Index
///
/// \returns The stream \p stream
///
template <class T_data, class T_idx, class T_el>
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::index<T_data, T_idx, T_el> &idx)
{
unsigned __int32 count;
// Read index count.
stream.read((char*)&count, sizeof(count));
if (!stream.good()) {
idx.clear();
return stream;
}
if (count) {
// Read index data.
idx.resize(count);
stream.read((char*)idx.data(), sizeof(T_idx)*static_cast<std::streamsize>(count));
} else
idx.clear();
return stream;
}
///
/// Writes text index to a stream
///
/// \param[in] stream Output stream
/// \param[in] idx Text index
///
/// \returns The stream \p stream
///
template <class T_key, class T_val, class T_idx>
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::textindex<T_key, T_val, T_idx> &idx)
{
unsigned __int32 count;
// Write index count.
auto idx_count = idx.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (idx_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
count = (unsigned __int32)idx_count;
stream.write((const char*)&count, sizeof(count));
// Write index data.
if (stream.fail()) return stream;
stream.write((const char*)idx.data(), sizeof(ZRCola::textindex<T_key, T_val, T_idx>::value_type)*static_cast<std::streamsize>(count));
// Write key count.
auto key_count = idx.keys.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (idx_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
count = (unsigned __int32)key_count;
stream.write((const char*)&count, sizeof(count));
// Write key data.
if (stream.fail()) return stream;
stream.write((const char*)idx.keys.data(), sizeof(std::vector<T_key>::value_type)*static_cast<std::streamsize>(count));
// Write value count.
auto value_count = idx.values.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (idx_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
count = (unsigned __int32)value_count;
stream.write((const char*)&count, sizeof(count));
// Write value data.
if (stream.fail()) return stream;
stream.write((const char*)idx.values.data(), sizeof(std::vector<T_val>::value_type)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads text index from a stream
///
/// \param[in] stream Input stream
/// \param[out] idx Text index
///
/// \returns The stream \p stream
///
template <class T_key, class T_val, class T_idx>
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::textindex<T_key, T_val, T_idx> &idx)
{
unsigned __int32 count;
// Read text index count.
stream.read((char*)&count, sizeof(count));
if (!stream.good()) {
idx.clear();
return stream;
}
if (count) {
// Read text index.
idx.resize(count);
stream.read((char*)idx.data(), sizeof(ZRCola::textindex<T_key, T_val, T_idx>::value_type)*static_cast<std::streamsize>(count));
if (!stream.good()) return stream;
} else
idx.clear();
// Read keys count.
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read keys.
idx.keys.resize(count);
stream.read((char*)idx.keys.data(), sizeof(std::vector<T_key>::value_type)*static_cast<std::streamsize>(count));
if (!stream.good()) return stream;
} else
idx.keys.clear();
// Read value count.
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read values.
idx.values.resize(count);
stream.read((char*)idx.values.data(), sizeof(std::vector<T_val>::value_type)*static_cast<std::streamsize>(count));
} else
idx.values.clear();
return stream;
}
#pragma warning(pop)
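
Note on `index::find` in this file: it first bisects to any matching element, then narrows both ends to cover the whole run of equal elements. Below is a stand-alone sketch of the same idea on a sorted `std::vector<int>`. The name `find_run` and the use of plain `<`/`>` in place of the virtual `compare` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone version of the run search.
// On success, [start, end) is the range of elements equal to el.
inline bool find_run(const std::vector<int>& v, int el, std::size_t& start, std::size_t& end)
{
    // Start with the full search area.
    for (start = 0, end = v.size(); start < end; ) {
        std::size_t m = (start + end) / 2;
        if (el < v[m]) end = m;
        else if (el > v[m]) start = m + 1;
        else {
            // Narrow the left edge to the first element in the run.
            for (std::size_t end2 = m; start < end2; ) {
                std::size_t m2 = (start + end2) / 2;
                if (el <= v[m2]) end2 = m2; else start = m2 + 1;
            }
            // Narrow the right edge to the first element past the run.
            for (std::size_t start2 = m + 1; start2 < end; ) {
                std::size_t m2 = (start2 + end) / 2;
                if (v[m2] <= el) start2 = m2 + 1; else end = m2;
            }
            return true;
        }
    }
    return false;
}
```

On a hit the result should agree with `std::equal_range` over the same sorted data, which gives a convenient cross-check when testing.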

@ -7,8 +7,6 @@
#include "common.h"
#include <stdex/idrec>
#include <functional>
#pragma warning(push)
@ -29,7 +27,7 @@ namespace ZRCola {
///
/// Highlight set ID
///
typedef unsigned __int16 hlghtsetid_t;
typedef uint16_t hlghtsetid_t;
///
/// Highlight database
@ -43,11 +41,11 @@ namespace ZRCola {
///
struct highlight {
public:
hlghtsetid_t set; ///< Highlight set ID
hlghtsetid_t set; ///< Highlight set ID
protected:
unsigned __int16 chr_to; ///< Character end in \c data
wchar_t data[]; ///< Character
uint16_t chr_to; ///< Character end in \c data
char_t data[]; ///< Character
private:
inline highlight(_In_ const highlight &other);
@ -63,21 +61,21 @@ namespace ZRCola {
///
inline highlight(
_In_opt_ hlghtsetid_t set = 0,
_In_opt_z_count_(chr_len) const wchar_t *chr = NULL,
_In_opt_z_count_(chr_len) const char_t *chr = NULL,
_In_opt_ size_t chr_len = 0)
{
this->set = set;
this->chr_to = static_cast<unsigned __int16>(chr_len);
if (chr && chr_len) memcpy(this->data, chr, sizeof(wchar_t)*chr_len);
this->chr_to = static_cast<uint16_t>(chr_len);
if (chr && chr_len) memcpy(this->data, chr, sizeof(char_t)*chr_len);
}
inline const wchar_t* chr () const { return data; };
inline wchar_t* chr () { return data; };
inline const wchar_t* chr_end() const { return data + chr_to; };
inline wchar_t* chr_end() { return data + chr_to; };
inline unsigned __int16 chr_len() const { return chr_to; };
inline const char_t* chr () const { return data; };
inline char_t* chr () { return data; };
inline const char_t* chr_end() const { return data + chr_to; };
inline char_t* chr_end() { return data + chr_to; };
inline uint16_t chr_len() const { return chr_to; };
inline wchar_t chr_at(_In_ size_t i) const
inline char_t chr_at(_In_ size_t i) const
{
return i < chr_to ? data[i] : 0;
}
@ -87,7 +85,7 @@ namespace ZRCola {
///
/// Highlight index
///
class indexChr : public index<unsigned __int16, unsigned __int32, highlight>
class indexChr : public index<uint16_t, uint32_t, highlight>
{
public:
///
@ -95,7 +93,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexChr(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, highlight>(h) {}
indexChr(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, highlight>(h) {}
///
/// Compares two highlights by string (for searching)
@ -113,8 +111,8 @@ namespace ZRCola {
int r = ZRCola::CompareString(a.chr(), a.chr_len(), b.chr(), b.chr_len());
if (r != 0) return r;
if (a.set < b.set) return -1;
else if (a.set > b.set) return +1;
if (a.set < b.set) return -1;
if (a.set > b.set) return +1;
return 0;
}
@ -138,7 +136,7 @@ namespace ZRCola {
} idxChr; ///< Highlight index
std::vector<unsigned __int16> data; ///< Highlight data
std::vector<uint16_t> data; ///< Highlight data
public:
///
@ -162,79 +160,73 @@ namespace ZRCola {
/// \param[in] inputMax Length of the input string in characters. Can be (size_t)-1 if \p input is zero terminated.
/// \param[in] callback Function to be called on highlight switch
///
void Highlight(_In_z_count_(inputMax) const wchar_t* input, _In_ size_t inputMax, _In_ std::function<void (hlghtsetid_t set, size_t start, size_t end)> callback) const;
void Highlight(_In_z_count_(inputMax) const char_t* input, _In_ size_t inputMax, _In_ std::function<void (hlghtsetid_t set, size_t start, size_t end)> callback) const;
///
/// Writes highlight database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Highlight database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const highlight_db& db)
{
// Write highlight index.
if (stream.fail()) return stream;
stream << db.idxChr;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads highlight database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Highlight database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ highlight_db& db)
{
// Read highlight index.
stream >> db.idxChr;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<highlight_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> highlight_rec;
};
const ZRCola::recordid_t ZRCola::highlight_rec::id = *(ZRCola::recordid_t*)"HGH";
///
/// Writes highlight database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Highlight database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::highlight_db &db)
{
// Write highlight index.
if (stream.fail()) return stream;
stream << db.idxChr;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads highlight database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Highlight database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::highlight_db &db)
{
// Read highlight index.
stream >> db.idxChr;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
#pragma warning(pop)

@ -0,0 +1,27 @@
/*
SPDX-License-Identifier: GPL-3.0-or-later
Copyright © 2022 Amebis
*/
#pragma once
#include "character.h"
#include "highlight.h"
#include "language.h"
#include "tag.h"
#include "translate.h"
#include <stdex/idrec.hpp>
namespace ZRCola {
typedef stdex::idrec::record<character_db, recordid_t, 0x524843 /*"CHR"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> character_rec;
typedef stdex::idrec::record<chrcat_db, recordid_t, 0x544343 /*"CCT"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> chrcat_rec;
typedef stdex::idrec::record<highlight_db, recordid_t, 0x484748 /*"HGH"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> highlight_rec;
typedef stdex::idrec::record<langchar_db, recordid_t, 0x432d4c /*"L-C"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> langchar_rec;
typedef stdex::idrec::record<language_db, recordid_t, 0x474e4c /*"LNG"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> language_rec;
typedef stdex::idrec::record<chrtag_db, recordid_t, 0x542d43 /*"C-T"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> chrtag_rec;
typedef stdex::idrec::record<tagname_db, recordid_t, 0x4e4754 /*"TGN"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> tagname_rec;
typedef stdex::idrec::record<translation_db, recordid_t, 0x4e5254 /*"TRN"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> translation_rec;
typedef stdex::idrec::record<transet_db, recordid_t, 0x455354 /*"TSE"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> transet_rec;
typedef stdex::idrec::record<transeq_db, recordid_t, 0x515354 /*"TSQ"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> transeq_rec;
}
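
Note on the numeric record IDs: each constant is the three-character tag packed with its first character in the lowest byte. That is the value the old `*(recordid_t*)"ZRC"` cast produced on a little-endian machine. The sketch below checks this at compile time; `make_id` is a hypothetical helper, not part of ZRCola or stdex.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs a three-character tag into a 32-bit ID,
// first character in the lowest byte, upper byte left at zero.
constexpr uint32_t make_id(const char (&tag)[4])
{
    return  static_cast<uint32_t>(static_cast<unsigned char>(tag[0]))
         | (static_cast<uint32_t>(static_cast<unsigned char>(tag[1])) <<  8)
         | (static_cast<uint32_t>(static_cast<unsigned char>(tag[2])) << 16);
}

// The constants below are copied from the diff.
static_assert(make_id("ZRC") == 0x43525a, "ZRCOLA_DB_ID");
static_assert(make_id("CHR") == 0x524843, "character_rec");
static_assert(make_id("CCT") == 0x544343, "chrcat_rec");
static_assert(make_id("HGH") == 0x484748, "highlight_rec");
static_assert(make_id("L-C") == 0x432d4c, "langchar_rec");
static_assert(make_id("LNG") == 0x474e4c, "language_rec");
static_assert(make_id("C-T") == 0x542d43, "chrtag_rec");
static_assert(make_id("TGN") == 0x4e4754, "tagname_rec");
static_assert(make_id("TRN") == 0x4e5254, "translation_rec");
static_assert(make_id("TSE") == 0x455354, "transet_rec");
static_assert(make_id("TSQ") == 0x515354, "transeq_rec");
```

Unlike the pointer cast, a literal gives the same value in source code whatever the byte order or alignment rules of the build target, and it can be used as a template argument.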

@ -7,7 +7,6 @@
#include "common.h"
#include <stdex/idrec>
#include <istream>
#include <ostream>
#include <vector>
@ -32,11 +31,11 @@ namespace ZRCola {
///
struct langchar {
public:
langid_t lang; ///< Language ID
langid_t lang; ///< Language ID
protected:
unsigned __int16 chr_to; ///< Character end in \c data
wchar_t data[]; ///< Character
uint16_t chr_to; ///< Character end in \c data
char_t data[]; ///< Character
private:
inline langchar(_In_ const langchar &other);
@ -52,26 +51,26 @@ namespace ZRCola {
///
inline langchar(
_In_opt_ langid_t lang = langid_t::blank,
_In_opt_z_count_(chr_len) const wchar_t *chr = NULL,
_In_opt_z_count_(chr_len) const char_t *chr = NULL,
_In_opt_ size_t chr_len = 0)
{
this->lang = lang;
this->chr_to = static_cast<unsigned __int16>(chr_len);
if (chr && chr_len) memcpy(this->data, chr, sizeof(wchar_t)*chr_len);
this->chr_to = static_cast<uint16_t>(chr_len);
if (chr && chr_len) memcpy(this->data, chr, sizeof(char_t)*chr_len);
}
inline const wchar_t* chr () const { return data; };
inline wchar_t* chr () { return data; };
inline const wchar_t* chr_end() const { return data + chr_to; };
inline wchar_t* chr_end() { return data + chr_to; };
inline unsigned __int16 chr_len() const { return chr_to; };
inline const char_t* chr () const { return data; };
inline char_t* chr () { return data; };
inline const char_t* chr_end() const { return data + chr_to; };
inline char_t* chr_end() { return data + chr_to; };
inline uint16_t chr_len() const { return chr_to; };
};
#pragma pack(pop)
///
/// Character index
///
class indexChr : public index<unsigned __int16, unsigned __int32, langchar>
class indexChr : public index<uint16_t, uint32_t, langchar>
{
public:
///
@ -79,7 +78,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexChr(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, langchar>(h) {}
indexChr(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, langchar>(h) {}
///
/// Compares two characters by ID (for searching)
@ -97,8 +96,8 @@ namespace ZRCola {
int r = ZRCola::CompareString(a.chr(), a.chr_len(), b.chr(), b.chr_len());
if (r != 0) return r;
if (a.lang < b.lang) return -1;
else if (a.lang > b.lang) return 1;
if (a.lang < b.lang) return -1;
if (a.lang > b.lang) return 1;
return 0;
}
@ -109,7 +108,7 @@ namespace ZRCola {
///
/// Language Index
///
class indexLang : public index<unsigned __int16, unsigned __int32, langchar>
class indexLang : public index<uint16_t, uint32_t, langchar>
{
public:
///
@ -117,7 +116,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexLang(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, langchar>(h) {}
indexLang(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, langchar>(h) {}
///
/// Compares two languages by ID (for searching)
@ -132,8 +131,8 @@ namespace ZRCola {
///
virtual int compare(_In_ const langchar &a, _In_ const langchar &b) const
{
if (a.lang < b.lang) return -1;
else if (a.lang > b.lang) return 1;
if (a.lang < b.lang) return -1;
if (a.lang > b.lang) return 1;
int r = ZRCola::CompareString(a.chr(), a.chr_len(), b.chr(), b.chr_len());
if (r != 0) return r;
@ -143,7 +142,7 @@ namespace ZRCola {
} idxLang; ///< Language index
#endif
std::vector<unsigned __int16> data; ///< Character data
std::vector<uint16_t> data; ///< Character data
public:
///
@ -177,13 +176,88 @@ namespace ZRCola {
/// \returns
/// - \c true when character is used in language
/// - \c false otherwise
bool IsLocalCharacter(_In_ const wchar_t *chr, _In_ const wchar_t *chr_end, _In_ langid_t lang) const;
bool IsLocalCharacter(_In_ const char_t *chr, _In_ const char_t *chr_end, _In_ langid_t lang) const;
///
/// Writes language character database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Language character database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const langchar_db& db)
{
// Write character index.
if (stream.fail()) return stream;
stream << db.idxChr;
#ifdef ZRCOLA_LANGCHAR_LANG_IDX
// Write language index.
if (stream.fail()) return stream;
stream << db.idxLang;
#endif
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads language character database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Language character database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ langchar_db& db)
{
// Read character index.
stream >> db.idxChr;
if (!stream.good()) return stream;
#ifdef ZRCOLA_LANGCHAR_LANG_IDX
// Read language index.
stream >> db.idxLang;
if (!stream.good()) return stream;
#endif
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<langchar_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> langchar_rec;
///
/// Language database
///
@ -196,11 +270,11 @@ namespace ZRCola {
///
struct language {
public:
langid_t lang; ///< Language ID
langid_t lang; ///< Language ID
protected:
unsigned __int16 name_to; ///< Language name end in \c data
wchar_t data[]; ///< Language name
uint16_t name_to; ///< Language name end in \c data
char_t data[]; ///< Language name
private:
inline language(_In_ const language &other);
@ -216,26 +290,26 @@ namespace ZRCola {
///
inline language(
_In_opt_ langid_t lang = langid_t::blank,
_In_opt_z_count_(name_len) const wchar_t *name = NULL,
_In_opt_z_count_(name_len) const char_t *name = NULL,
_In_opt_ size_t name_len = 0)
{
this->lang = lang;
this->name_to = static_cast<unsigned __int16>(name_len);
if (name && name_len) memcpy(this->data, name, sizeof(wchar_t)*name_len);
this->name_to = static_cast<uint16_t>(name_len);
if (name && name_len) memcpy(this->data, name, sizeof(char_t)*name_len);
}
inline const wchar_t* name () const { return data; };
inline wchar_t* name () { return data; };
inline const wchar_t* name_end() const { return data + name_to; };
inline wchar_t* name_end() { return data + name_to; };
inline unsigned __int16 name_len() const { return name_to; };
inline const char_t* name () const { return data; };
inline char_t* name () { return data; };
inline const char_t* name_end() const { return data + name_to; };
inline char_t* name_end() { return data + name_to; };
inline uint16_t name_len() const { return name_to; };
};
#pragma pack(pop)
///
/// Language index
///
class indexLang : public index<unsigned __int16, unsigned __int32, language>
class indexLang : public index<uint16_t, uint32_t, language>
{
public:
///
@ -243,7 +317,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexLang(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, language>(h) {}
indexLang(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, language>(h) {}
///
/// Compares two languages by ID (for searching)
@ -258,14 +332,14 @@ namespace ZRCola {
///
virtual int compare(_In_ const language &a, _In_ const language &b) const
{
if (a.lang < b.lang) return -1;
else if (a.lang > b.lang) return 1;
if (a.lang < b.lang) return -1;
if (a.lang > b.lang) return 1;
return 0;
}
} idxLang; ///< Language index
std::vector<unsigned __int16> data; ///< Language data
std::vector<uint16_t> data; ///< Language data
public:
///
@ -281,156 +355,73 @@ namespace ZRCola {
idxLang.clear();
data .clear();
}
///
/// Writes language database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Language database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const language_db& db)
{
// Write language index.
if (stream.fail()) return stream;
stream << db.idxLang;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads language database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Language database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ language_db& db)
{
// Read language index.
stream >> db.idxLang;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<language_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> language_rec;
};
const ZRCola::recordid_t ZRCola::langchar_rec::id = *(ZRCola::recordid_t*)"L-C";
const ZRCola::recordid_t ZRCola::language_rec::id = *(ZRCola::recordid_t*)"LNG";
///
/// Writes language character database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Language character database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::langchar_db &db)
{
// Write character index.
if (stream.fail()) return stream;
stream << db.idxChr;
#ifdef ZRCOLA_LANGCHAR_LANG_IDX
// Write language index.
if (stream.fail()) return stream;
stream << db.idxLang;
#endif
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads language character database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Language character database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::langchar_db &db)
{
// Read character index.
stream >> db.idxChr;
if (!stream.good()) return stream;
#ifdef ZRCOLA_LANGCHAR_LANG_IDX
// Read language index.
stream >> db.idxLang;
if (!stream.good()) return stream;
#endif
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
///
/// Writes language database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Language database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::language_db &db)
{
// Write language index.
if (stream.fail()) return stream;
stream << db.idxLang;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads language database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Language database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::language_db &db)
{
// Read language index.
stream >> db.idxLang;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
#pragma warning(pop)
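
Note on the hunks above: every `operator <<` / `operator >>` pair moved into the classes follows the same on-disk pattern, namely a 32-bit element count followed by the raw 16-bit elements. A minimal standalone sketch of that pattern is shown below. The names `write_blob`, `read_blob` and `round_trip` are hypothetical helpers for illustration and are not part of ZRCola; the sketch omits the index serialization and, like the original, writes in host byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical helper (not ZRCola API): writes a uint32_t element count
// followed by the raw uint16_t elements, in host byte order.
inline std::ostream& write_blob(std::ostream& stream, const std::vector<uint16_t>& data)
{
    if (stream.fail()) return stream;
    // The count is stored in 32 bits, so larger vectors cannot be represented.
    if (data.size() > std::numeric_limits<uint32_t>::max()) {
        stream.setstate(std::ios_base::failbit);
        return stream;
    }
    const uint32_t count = static_cast<uint32_t>(data.size());
    stream.write(reinterpret_cast<const char*>(&count), sizeof(count));
    if (stream.fail()) return stream;
    stream.write(reinterpret_cast<const char*>(data.data()),
        static_cast<std::streamsize>(sizeof(uint16_t)) * static_cast<std::streamsize>(count));
    return stream;
}

// Hypothetical helper (not ZRCola API): reads back what write_blob produced.
inline std::istream& read_blob(std::istream& stream, std::vector<uint16_t>& data)
{
    uint32_t count = 0;
    stream.read(reinterpret_cast<char*>(&count), sizeof(count));
    if (!stream.good()) return stream;
    data.resize(count);
    if (count)
        stream.read(reinterpret_cast<char*>(data.data()),
            static_cast<std::streamsize>(sizeof(uint16_t)) * static_cast<std::streamsize>(count));
    return stream;
}

// Round-trips a vector through an in-memory binary stream.
inline bool round_trip(const std::vector<uint16_t>& in)
{
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    write_blob(ss, in);
    std::vector<uint16_t> out;
    read_blob(ss, out);
    return !ss.fail() && out == in;
}
```

A caller could check the pattern with `assert(round_trip({0x010C, 0x0161}));`. Because the count and the elements are written in host byte order, a file produced this way is only portable between machines of the same endianness.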

@ -7,7 +7,6 @@
#include "common.h"
#include <stdex/idrec>
#include <assert.h>
#include <istream>
#include <ostream>
@ -21,7 +20,7 @@
namespace ZRCola {
typedef unsigned __int16 tagid_t;
typedef uint16_t tagid_t;
///
/// Character Tag Database
@ -35,11 +34,11 @@ namespace ZRCola {
///
struct chrtag {
public:
tagid_t tag; ///< Tag ID
tagid_t tag; ///< Tag ID
protected:
unsigned __int16 chr_to; ///< Character end in \c data
wchar_t data[]; ///< Character
uint16_t chr_to; ///< Character end in \c data
char_t data[]; ///< Character
private:
inline chrtag(_In_ const chrtag &other);
@ -54,27 +53,27 @@ namespace ZRCola {
/// \param[in] tag Tag
///
inline chrtag(
_In_opt_z_count_(chr_len) const wchar_t *chr = NULL,
_In_opt_z_count_(chr_len) const char_t *chr = NULL,
_In_opt_ size_t chr_len = 0,
_In_opt_ tagid_t tag = 0)
{
this->tag = tag;
this->chr_to = static_cast<unsigned __int16>(chr_len);
if (chr && chr_len) memcpy(this->data, chr, sizeof(wchar_t)*chr_len);
this->chr_to = static_cast<uint16_t>(chr_len);
if (chr && chr_len) memcpy(this->data, chr, sizeof(char_t)*chr_len);
}
inline const wchar_t* chr () const { return data; };
inline wchar_t* chr () { return data; };
inline const wchar_t* chr_end() const { return data + chr_to; };
inline wchar_t* chr_end() { return data + chr_to; };
inline unsigned __int16 chr_len() const { return chr_to; };
inline const char_t* chr () const { return data; };
inline char_t* chr () { return data; };
inline const char_t* chr_end() const { return data + chr_to; };
inline char_t* chr_end() { return data + chr_to; };
inline uint16_t chr_len() const { return chr_to; };
};
#pragma pack(pop)
///
/// Character Index
///
class indexChr : public index<unsigned __int16, unsigned __int32, chrtag>
class indexChr : public index<uint16_t, uint32_t, chrtag>
{
public:
///
@ -82,7 +81,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexChr(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, chrtag>(h) {}
indexChr(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, chrtag>(h) {}
///
/// Compares two character tags by character (for searching)
@ -130,7 +129,7 @@ namespace ZRCola {
///
/// Tag Index
///
class indexTag : public index<unsigned __int16, unsigned __int32, chrtag>
class indexTag : public index<uint16_t, uint32_t, chrtag>
{
public:
///
@ -138,7 +137,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexTag(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, chrtag>(h) {}
indexTag(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, chrtag>(h) {}
///
/// Compares two character tags by tag (for searching)
@ -153,8 +152,8 @@ namespace ZRCola {
///
virtual int compare(_In_ const chrtag &a, _In_ const chrtag &b) const
{
if (a.tag < b.tag) return -1;
else if (a.tag > b.tag) return 1;
if (a.tag < b.tag) return -1;
if (a.tag > b.tag) return 1;
return 0;
}
@ -182,7 +181,7 @@ namespace ZRCola {
}
} idxTag; ///< Tag index
std::vector<unsigned __int16> data; ///< Character tags data
std::vector<uint16_t> data; ///< Character tags data
public:
///
@ -210,13 +209,84 @@ namespace ZRCola {
/// \param[in ] fn_abort Pointer to function to periodically test for search cancellation
/// \param[in ] cookie Cookie for \p fn_abort call
///
bool Search(_In_ const std::map<tagid_t, unsigned __int16> &tags, _In_ const character_db &ch_db, _In_ const std::set<chrcatid_t> &cats, _Inout_ std::map<std::wstring, charrank_t> &hits, _In_opt_ bool (__cdecl *fn_abort)(void *cookie) = NULL, _In_opt_ void *cookie = NULL) const;
bool Search(_In_ const std::map<tagid_t, uint16_t> &tags, _In_ const character_db &ch_db, _In_ const std::set<chrcatid_t> &cats, _Inout_ std::map<string_t, charrank_t> &hits, _In_opt_ bool (__cdecl *fn_abort)(void *cookie) = NULL, _In_opt_ void *cookie = NULL) const;
///
/// Writes character tag database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Character tag database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const chrtag_db& db)
{
// Write character index.
if (stream.fail()) return stream;
stream << db.idxChr;
// Write tag index.
if (stream.fail()) return stream;
stream << db.idxTag;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads character tag database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Character tag database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ chrtag_db& db)
{
// Read character index.
stream >> db.idxChr;
if (!stream.good()) return stream;
// Read tag index.
stream >> db.idxTag;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<chrtag_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> chrtag_rec;
///
/// Tag name database
///
@ -229,12 +299,12 @@ namespace ZRCola {
///
struct tagname {
public:
tagid_t tag; ///< Tag ID
LCID locale; ///< Locale ID
tagid_t tag; ///< Tag ID
uint32_t locale; ///< Locale ID
protected:
unsigned __int16 name_to; ///< Tag name end in \c data
wchar_t data[]; ///< Tag name
uint16_t name_to; ///< Tag name end in \c data
char_t data[]; ///< Tag name
private:
inline tagname(_In_ const tagname &other);
@ -250,22 +320,22 @@ namespace ZRCola {
/// \param[in] name_len Number of UTF-16 characters in \p name
///
inline tagname(
_In_opt_ tagid_t tag = 0,
_In_opt_ LCID locale = MAKELCID(MAKELANGID(LANG_NEUTRAL, SUBLANG_NEUTRAL), SORT_DEFAULT),
_In_opt_z_count_(name_len) const wchar_t *name = NULL,
_In_opt_ size_t name_len = 0)
_In_opt_ tagid_t tag = 0,
_In_opt_ uint32_t locale = 0,
_In_opt_z_count_(name_len) const char_t *name = NULL,
_In_opt_ size_t name_len = 0)
{
this->tag = tag;
this->locale = locale;
this->name_to = static_cast<unsigned __int16>(name_len);
if (name && name_len) memcpy(this->data, name, sizeof(wchar_t)*name_len);
this->name_to = static_cast<uint16_t>(name_len);
if (name && name_len) memcpy(this->data, name, sizeof(char_t)*name_len);
}
inline const wchar_t* name () const { return data; };
inline wchar_t* name () { return data; };
inline const wchar_t* name_end() const { return data + name_to; };
inline wchar_t* name_end() { return data + name_to; };
inline unsigned __int16 name_len() const { return name_to; };
inline const char_t* name () const { return data; };
inline char_t* name () { return data; };
inline const char_t* name_end() const { return data + name_to; };
inline char_t* name_end() { return data + name_to; };
inline uint16_t name_len() const { return name_to; };
///
/// Compares two names
@ -285,14 +355,26 @@ namespace ZRCola {
/// The function does not treat \\0 characters as terminators for performance reasons.
/// Therefore \p count_a and \p count_b must represent exact string lengths.
///
static inline int CompareName(LCID locale, const wchar_t *str_a, unsigned __int16 count_a, const wchar_t *str_b, unsigned __int16 count_b)
static inline int CompareName(_In_ uint32_t locale, _In_z_count_(count_a) const char_t *str_a, _In_ uint16_t count_a, _In_z_count_(count_b) const char_t *str_b, _In_ uint16_t count_b)
{
#ifdef _WIN32
switch (::CompareString(locale, SORT_STRINGSORT | NORM_IGNORECASE, str_a, count_a, str_b, count_b)) {
case CSTR_LESS_THAN : return -1;
case CSTR_EQUAL : return 0;
case CSTR_GREATER_THAN: return 1;
default : assert(0); return -1;
}
#else
assert(0); // TODO: 1. Should honour locale. 2. Should use ICU for lowercase conversion. 3. Should be UTF-16-aware.
string_t
a(str_a, count_a),
b(str_b, count_b);
auto tolower = [](char_t c){ return std::towlower(c); };
std::transform(a.begin(), a.end(), a.begin(), tolower);
std::transform(b.begin(), b.end(), b.begin(), tolower);
auto &coll = std::use_facet<std::collate<char_t>>(std::locale());
return coll.compare(a.data(), a.data() + a.size(), b.data(), b.data() + b.size());
#endif
}
};
#pragma pack(pop)
@ -300,7 +382,7 @@ namespace ZRCola {
///
/// Name index
///
class indexName : public index<unsigned __int16, unsigned __int32, tagname>
class indexName : public index<uint16_t, uint32_t, tagname>
{
public:
///
@ -309,7 +391,7 @@ namespace ZRCola {
/// \param[in] h Reference to vector holding the data
/// \param[in] locale Locale used to perform tag name comparison
///
indexName(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, tagname>(h) {}
indexName(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, tagname>(h) {}
///
/// Compares two tag names by locale and name (for searching)
@ -324,8 +406,8 @@ namespace ZRCola {
///
virtual int compare(_In_ const tagname &a, _In_ const tagname &b) const
{
if (a.locale < b.locale) return -1;
else if (a.locale > b.locale) return 1;
if (a.locale < b.locale) return -1;
if (a.locale > b.locale) return 1;
int r = tagname::CompareName(a.locale, a.name(), a.name_len(), b.name(), b.name_len());
if (r != 0) return r;
@ -362,7 +444,7 @@ namespace ZRCola {
///
/// Tag index
///
class indexTag : public index<unsigned __int16, unsigned __int32, tagname>
class indexTag : public index<uint16_t, uint32_t, tagname>
{
public:
///
@ -371,7 +453,7 @@ namespace ZRCola {
/// \param[in] h Reference to vector holding the data
/// \param[in] locale Locale used to perform tag name comparison
///
indexTag(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, tagname>(h) {}
indexTag(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, tagname>(h) {}
///
/// Compares two tag names by tag (for searching)
@ -386,17 +468,17 @@ namespace ZRCola {
///
virtual int compare(_In_ const tagname &a, _In_ const tagname &b) const
{
if (a.locale < b.locale) return -1;
else if (a.locale > b.locale) return 1;
if (a.locale < b.locale) return -1;
if (a.locale > b.locale) return 1;
if (a.tag < b.tag) return -1;
else if (a.tag > b.tag) return 1;
if (a.tag < b.tag) return -1;
if (a.tag > b.tag) return 1;
return 0;
}
} idxTag; ///< Tag index
std::vector<unsigned __int16> data; ///< Tag data
std::vector<uint16_t> data; ///< Tag data
public:
///
@ -423,161 +505,82 @@ namespace ZRCola {
/// \param[in ] fn_abort Pointer to function to periodically test for search cancellation
/// \param[in ] cookie Cookie for \p fn_abort call
///
bool Search(_In_z_ const wchar_t *str, _In_ LCID locale, _Inout_ std::map<tagid_t, unsigned __int16> &hits, _In_opt_ bool (__cdecl *fn_abort)(void *cookie) = NULL, _In_opt_ void *cookie = NULL) const;
bool Search(_In_z_ const char_t *str, _In_ uint32_t locale, _Inout_ std::map<tagid_t, uint16_t> &hits, _In_opt_ bool (__cdecl *fn_abort)(void *cookie) = NULL, _In_opt_ void *cookie = NULL) const;
///
/// Writes tag database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Tag database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const tagname_db& db)
{
// Write name index.
if (stream.fail()) return stream;
stream << db.idxName;
// Write tag index.
if (stream.fail()) return stream;
stream << db.idxTag;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads tag database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Tag database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ tagname_db& db)
{
// Read name index.
stream >> db.idxName;
if (!stream.good()) return stream;
// Read tag index.
stream >> db.idxTag;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<tagname_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> tagname_rec;
};
const ZRCola::recordid_t ZRCola::chrtag_rec ::id = *(ZRCola::recordid_t*)"C-T";
const ZRCola::recordid_t ZRCola::tagname_rec::id = *(ZRCola::recordid_t*)"TGN";
///
/// Writes character tag database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Character tag database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::chrtag_db &db)
{
// Write character index.
if (stream.fail()) return stream;
stream << db.idxChr;
// Write tag index.
if (stream.fail()) return stream;
stream << db.idxTag;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads character tag database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Character tag database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::chrtag_db &db)
{
// Read character index.
stream >> db.idxChr;
if (!stream.good()) return stream;
// Read tag index.
stream >> db.idxTag;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
///
/// Writes tag database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Tag database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::tagname_db &db)
{
// Write name index.
if (stream.fail()) return stream;
stream << db.idxName;
// Write tag index.
if (stream.fail()) return stream;
stream << db.idxTag;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads tag database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Tag database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::tagname_db &db)
{
// Read name index.
stream >> db.idxName;
if (!stream.good()) return stream;
// Read tag index.
stream >> db.idxTag;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
#pragma warning(pop)
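
Note on the non-Windows branch of `CompareName` above: it lowercases both strings and then compares them with the `std::collate` facet. The sketch below shows that approach in isolation. It makes several assumptions that differ from the diff: it uses `wchar_t`/`std::wstring` instead of `char_t`/`string_t`, it uses the classic "C" locale instead of `std::locale()` so the result is deterministic, and the function name `compare_nocase` is hypothetical. As the TODO in the diff already states, this neither honours the requested locale nor handles UTF-16 surrogate pairs.

```cpp
#include <algorithm>
#include <cwctype>
#include <locale>
#include <string>

// Hypothetical helper (not ZRCola API): case-insensitive comparison that
// lowercases copies of both strings and compares them with std::collate.
// Returns a negative value, zero, or a positive value.
inline int compare_nocase(const std::wstring& str_a, const std::wstring& str_b)
{
    std::wstring a(str_a), b(str_b);
    // std::towlower follows the current C locale; in the default "C" locale
    // only ASCII letters are converted.
    auto lower = [](wchar_t c) {
        return static_cast<wchar_t>(std::towlower(static_cast<std::wint_t>(c)));
    };
    std::transform(a.begin(), a.end(), a.begin(), lower);
    std::transform(b.begin(), b.end(), b.begin(), lower);
    // Assumption: classic locale for reproducible ordering.
    const auto& coll = std::use_facet<std::collate<wchar_t>>(std::locale::classic());
    // Pass pointer ranges without dereferencing end iterators.
    return coll.compare(a.data(), a.data() + a.size(), b.data(), b.data() + b.size());
}
```

For example, `compare_nocase(L"Alpha", L"alpha")` yields zero, while `compare_nocase(L"alpha", L"Beta")` yields a negative value.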

@ -8,7 +8,10 @@
#include "common.h"
#include "language.h"
#include <stdex/idrec>
namespace ZRCola {
class translation_db;
}
#include <algorithm>
#include <istream>
#include <ostream>
@ -45,12 +48,12 @@ namespace ZRCola {
///
/// Translation set ID
///
typedef unsigned __int16 transetid_t;
typedef uint16_t transetid_t;
///
/// Translation sequence ID
///
typedef unsigned __int16 transeqid_t;
typedef uint16_t transeqid_t;
///
/// Translation database
@ -64,14 +67,14 @@ namespace ZRCola {
///
struct translation {
public:
transetid_t set; ///< Translation set ID
unsigned __int16 dst_rank; ///< Destination character rank
unsigned __int16 src_rank; ///< Source character rank
transetid_t set; ///< Translation set ID
uint16_t dst_rank; ///< Destination character rank
uint16_t src_rank; ///< Source character rank
protected:
unsigned __int16 dst_to; ///< Destination character end in \c data
unsigned __int16 src_to; ///< Source string end in \c data
wchar_t data[]; ///< Destination string and source character
uint16_t dst_to; ///< Destination character end in \c data
uint16_t src_to; ///< Source string end in \c data
char_t data[]; ///< Destination string and source character
private:
inline translation(_In_ const translation &other);
@ -90,41 +93,41 @@ namespace ZRCola {
/// \param[in] src_len Number of UTF-16 characters in \p src
///
inline translation(
_In_opt_ transetid_t set = 0,
_In_opt_ unsigned __int16 dst_rank = 0,
_In_opt_z_count_(dst_len) const wchar_t *dst = NULL,
_In_opt_ size_t dst_len = 0,
_In_opt_ unsigned __int16 src_rank = 0,
_In_opt_z_count_(src_len) const wchar_t *src = NULL,
_In_opt_ size_t src_len = 0)
_In_opt_ transetid_t set = 0,
_In_opt_ uint16_t dst_rank = 0,
_In_opt_z_count_(dst_len) const char_t *dst = NULL,
_In_opt_ size_t dst_len = 0,
_In_opt_ uint16_t src_rank = 0,
_In_opt_z_count_(src_len) const char_t *src = NULL,
_In_opt_ size_t src_len = 0)
{
this->set = set;
this->dst_rank = dst_rank;
this->src_rank = src_rank;
this->dst_to = static_cast<unsigned __int16>(dst_len);
if (dst && dst_len) memcpy(this->data, dst, sizeof(wchar_t)*dst_len);
this->src_to = static_cast<unsigned __int16>(this->dst_to + src_len);
if (src && src_len) memcpy(this->data + this->dst_to, src, sizeof(wchar_t)*src_len);
this->dst_to = static_cast<uint16_t>(dst_len);
if (dst && dst_len) memcpy(this->data, dst, sizeof(char_t)*dst_len);
this->src_to = static_cast<uint16_t>(this->dst_to + src_len);
if (src && src_len) memcpy(this->data + this->dst_to, src, sizeof(char_t)*src_len);
}
inline const wchar_t* dst () const { return data; };
inline wchar_t* dst () { return data; };
inline const wchar_t* dst_end() const { return data + dst_to; };
inline wchar_t* dst_end() { return data + dst_to; };
inline unsigned __int16 dst_len() const { return dst_to; };
inline const char_t* dst () const { return data; };
inline char_t* dst () { return data; };
inline const char_t* dst_end() const { return data + dst_to; };
inline char_t* dst_end() { return data + dst_to; };
inline uint16_t dst_len() const { return dst_to; };
inline wchar_t dst_at(_In_ size_t i) const
inline char_t dst_at(_In_ size_t i) const
{
return i < dst_to ? data[i] : 0;
}
inline const wchar_t* src () const { return data + dst_to; };
inline wchar_t* src () { return data + dst_to; };
inline const wchar_t* src_end() const { return data + src_to; };
inline wchar_t* src_end() { return data + src_to; };
inline unsigned __int16 src_len() const { return src_to - dst_to; };
inline const char_t* src () const { return data + dst_to; };
inline char_t* src () { return data + dst_to; };
inline const char_t* src_end() const { return data + src_to; };
inline char_t* src_end() { return data + src_to; };
inline uint16_t src_len() const { return src_to - dst_to; };
inline wchar_t src_at(_In_ size_t i) const
inline char_t src_at(_In_ size_t i) const
{
size_t ii = i + dst_to; // absolute index
return ii < src_to ? data[ii] : 0;
@ -135,7 +138,7 @@ namespace ZRCola {
///
/// Translation index
///
class indexSrc : public index<unsigned __int16, unsigned __int32, translation>
class indexSrc : public index<uint16_t, uint32_t, translation>
{
public:
///
@ -143,7 +146,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexSrc(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, translation>(h) {}
indexSrc(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, translation>(h) {}
///
/// Compares two transformations by string (for searching)
@ -158,8 +161,8 @@ namespace ZRCola {
///
virtual int compare(_In_ const translation &a, _In_ const translation &b) const
{
if (a.set < b.set) return -1;
else if (a.set > b.set) return +1;
if (a.set < b.set) return -1;
if (a.set > b.set) return +1;
int r = ZRCola::CompareString(a.src(), a.src_len(), b.src(), b.src_len());
if (r != 0) return r;
@ -200,7 +203,7 @@ namespace ZRCola {
///
/// Inverse translation index
///
class indexDst : public index<unsigned __int16, unsigned __int32, translation>
class indexDst : public index<uint16_t, uint32_t, translation>
{
public:
///
@ -208,7 +211,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexDst(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, translation>(h) {}
indexDst(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, translation>(h) {}
///
/// Compares two transformations by character (for searching)
@ -223,8 +226,8 @@ namespace ZRCola {
///
virtual int compare(_In_ const translation &a, _In_ const translation &b) const
{
if (a.set < b.set) return -1;
else if (a.set > b.set) return +1;
if (a.set < b.set) return -1;
if (a.set > b.set) return +1;
int r = ZRCola::CompareString(a.dst(), a.dst_len(), b.dst(), b.dst_len());
if (r != 0) return r;
@ -262,7 +265,7 @@ namespace ZRCola {
} idxDst; ///< Inverse translation index
std::vector<unsigned __int16> data; ///< Transformation data
std::vector<uint16_t> data; ///< Transformation data
public:
///
@ -289,7 +292,7 @@ namespace ZRCola {
/// \param[out] output Output string (UTF-16)
/// \param[out] map The vector of source to destination index mappings (optional)
///
void Translate(_In_ transetid_t set, _In_z_count_(inputMax) const wchar_t* input, _In_ size_t inputMax, _Out_ std::wstring &output, _Out_opt_ std::vector<mapping>* map = NULL) const;
void Translate(_In_ transetid_t set, _In_z_count_(inputMax) const char_t* input, _In_ size_t inputMax, _Out_ string_t &output, _Out_opt_ std::vector<mapping>* map = NULL) const;
///
/// Inverse translates string
@ -300,7 +303,7 @@ namespace ZRCola {
/// \param[out] output Output string (UTF-16)
/// \param[out] map The vector of source to destination index mappings (optional)
///
inline void TranslateInv(_In_ transetid_t set, _In_z_count_(inputMax) const wchar_t* input, _In_ size_t inputMax, _Out_ std::wstring &output, _Out_opt_ std::vector<mapping>* map = NULL) const
inline void TranslateInv(_In_ transetid_t set, _In_z_count_(inputMax) const char_t* input, _In_ size_t inputMax, _Out_ string_t &output, _Out_opt_ std::vector<mapping>* map = NULL) const
{
TranslateInv(set, input, inputMax, NULL, langid_t::blank, output, map);
}
@ -316,13 +319,84 @@ namespace ZRCola {
/// \param[out] output Output string (UTF-16)
/// \param[out] map The vector of source to destination index mappings (optional)
///
void TranslateInv(_In_ transetid_t set, _In_z_count_(inputMax) const wchar_t* input, _In_ size_t inputMax, _In_opt_ const langchar_db *lc_db, _In_opt_ langid_t lang, _Out_ std::wstring &output, _Out_opt_ std::vector<mapping>* map = NULL) const;
void TranslateInv(_In_ transetid_t set, _In_z_count_(inputMax) const char_t* input, _In_ size_t inputMax, _In_opt_ const langchar_db *lc_db, _In_opt_ langid_t lang, _Out_ string_t &output, _Out_opt_ std::vector<mapping>* map = NULL) const;
///
/// Writes translation database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Translation database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::translation_db& db)
{
// Write translation index.
if (stream.fail()) return stream;
stream << db.idxSrc;
// Write inverse translation index.
if (stream.fail()) return stream;
stream << db.idxDst;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads translation database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Translation database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::translation_db& db)
{
// Read translation index.
stream >> db.idxSrc;
if (!stream.good()) return stream;
// Read inverse translation index.
stream >> db.idxDst;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<translation_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> translation_rec;
///
/// Translation set database
///
@ -335,12 +409,12 @@ namespace ZRCola {
///
struct transet {
public:
transetid_t set; ///< Translation set ID
transetid_t set; ///< Translation set ID
protected:
unsigned __int16 src_to; ///< Source name end in \c data
unsigned __int16 dst_to; ///< Sestination name end in \c data
wchar_t data[]; ///< Source and destination names
uint16_t src_to; ///< Source name end in \c data
uint16_t dst_to; ///< Destination name end in \c data
char_t data[]; ///< Source and destination names
private:
inline transet(_In_ const transet &other);
@ -358,36 +432,36 @@ namespace ZRCola {
///
inline transet(
_In_opt_ transetid_t set = 0,
_In_opt_z_count_(src_len) const wchar_t *src = NULL,
_In_opt_z_count_(src_len) const char_t *src = NULL,
_In_opt_ size_t src_len = 0,
_In_opt_z_count_(dst_len) const wchar_t *dst = NULL,
_In_opt_z_count_(dst_len) const char_t *dst = NULL,
_In_opt_ size_t dst_len = 0)
{
this->set = set;
this->src_to = static_cast<unsigned __int16>(src_len);
if (src && src_len) memcpy(this->data, src, sizeof(wchar_t)*src_len);
this->dst_to = static_cast<unsigned __int16>(this->src_to + dst_len);
if (dst && dst_len) memcpy(this->data + this->src_to, dst, sizeof(wchar_t)*dst_len);
this->src_to = static_cast<uint16_t>(src_len);
if (src && src_len) memcpy(this->data, src, sizeof(char_t)*src_len);
this->dst_to = static_cast<uint16_t>(this->src_to + dst_len);
if (dst && dst_len) memcpy(this->data + this->src_to, dst, sizeof(char_t)*dst_len);
}
inline const wchar_t* src () const { return data; };
inline wchar_t* src () { return data; };
inline const wchar_t* src_end() const { return data + src_to; };
inline wchar_t* src_end() { return data + src_to; };
inline unsigned __int16 src_len() const { return src_to; };
inline const char_t* src () const { return data; };
inline char_t* src () { return data; };
inline const char_t* src_end() const { return data + src_to; };
inline char_t* src_end() { return data + src_to; };
inline uint16_t src_len() const { return src_to; };
inline const wchar_t* dst () const { return data + src_to; };
inline wchar_t* dst () { return data + src_to; };
inline const wchar_t* dst_end() const { return data + dst_to; };
inline wchar_t* dst_end() { return data + dst_to; };
inline unsigned __int16 dst_len() const { return dst_to - src_to; };
inline const char_t* dst () const { return data + src_to; };
inline char_t* dst () { return data + src_to; };
inline const char_t* dst_end() const { return data + dst_to; };
inline char_t* dst_end() { return data + dst_to; };
inline uint16_t dst_len() const { return dst_to - src_to; };
};
#pragma pack(pop)
///
/// Translation set index
///
class indexTranSet : public index<unsigned __int16, unsigned __int32, transet>
class indexTranSet : public index<uint16_t, uint32_t, transet>
{
public:
///
@ -395,7 +469,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexTranSet(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, transet>(h) {}
indexTranSet(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, transet>(h) {}
///
/// Compares two translation sets by ID (for searching)
@ -410,14 +484,14 @@ namespace ZRCola {
///
virtual int compare(_In_ const transet &a, _In_ const transet &b) const
{
if (a.set < b.set) return -1;
else if (a.set > b.set) return 1;
if (a.set < b.set) return -1;
if (a.set > b.set) return 1;
return 0;
}
} idxTranSet; ///< Translation set index
std::vector<unsigned __int16> data; ///< Translation set data
std::vector<uint16_t> data; ///< Translation set data
public:
///
@ -433,12 +507,75 @@ namespace ZRCola {
idxTranSet.clear();
data .clear();
}
///
/// Writes translation set database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Translation set database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::transet_db& db)
{
// Write translation set index.
if (stream.fail()) return stream;
stream << db.idxTranSet;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads translation set database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Translation set database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::transet_db& db)
{
// Read translation set index.
stream >> db.idxTranSet;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<transet_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> transet_rec;
///
/// Translation sequence database
///
@ -451,13 +588,13 @@ namespace ZRCola {
///
struct transeq {
public:
transeqid_t seq; ///< Translation sequence ID
unsigned __int16 rank; ///< Translation sequence rank
transeqid_t seq; ///< Translation sequence ID
uint16_t rank; ///< Translation sequence rank
protected:
unsigned __int16 name_to; ///< Translation sequence name end in \c data
unsigned __int16 sets_to; ///< Translation sequence sets end in \c data
wchar_t data[]; ///< Translation sequence name and sets
uint16_t name_to; ///< Translation sequence name end in \c data
uint16_t sets_to; ///< Translation sequence sets end in \c data
char_t data[]; ///< Translation sequence name and sets
private:
inline transeq(_In_ const transeq &other);
@ -471,43 +608,43 @@ namespace ZRCola {
/// \param[in] rank Translation sequence rank
/// \param[in] name Translation sequence source
/// \param[in] name_len Number of UTF-16 characters in \p src
/// \param[in] sets Translation sequence destination
/// \param[in] sets_len Number of UTF-16 characters in \p sets
/// \param[in] sets Translation sequence destination
/// \param[in] sets_len Number of UTF-16 characters in \p sets
///
inline transeq(
_In_opt_ transeqid_t seq = 0,
_In_opt_ unsigned __int16 rank = 0,
_In_opt_z_count_(name_len) const wchar_t *name = NULL,
_In_opt_ size_t name_len = 0,
_In_opt_count_ (sets_len) const transetid_t *sets = NULL,
_In_opt_ size_t sets_len = 0)
_In_opt_ transeqid_t seq = 0,
_In_opt_ uint16_t rank = 0,
_In_opt_z_count_(name_len) const char_t *name = NULL,
_In_opt_ size_t name_len = 0,
_In_opt_count_ (sets_len) const transetid_t *sets = NULL,
_In_opt_ size_t sets_len = 0)
{
this->seq = seq;
this->rank = rank;
this->name_to = static_cast<unsigned __int16>(name_len);
if (name && name_len) memcpy(this->data, name, sizeof(wchar_t)*name_len);
this->sets_to = static_cast<unsigned __int16>(this->name_to + sets_len);
this->name_to = static_cast<uint16_t>(name_len);
if (name && name_len) memcpy(this->data, name, sizeof(char_t)*name_len);
this->sets_to = static_cast<uint16_t>(this->name_to + sets_len);
if (sets && sets_len) memcpy(this->data + this->name_to, sets, sizeof(transetid_t)*sets_len);
}
inline const wchar_t* name () const { return data; };
inline wchar_t* name () { return data; };
inline const wchar_t* name_end() const { return data + name_to; };
inline wchar_t* name_end() { return data + name_to; };
inline unsigned __int16 name_len() const { return name_to; };
inline const char_t* name () const { return data; };
inline char_t* name () { return data; };
inline const char_t* name_end() const { return data + name_to; };
inline char_t* name_end() { return data + name_to; };
inline uint16_t name_len() const { return name_to; };
inline const transetid_t* sets () const { return reinterpret_cast<const transetid_t*>(data + name_to); };
inline transetid_t* sets () { return reinterpret_cast< transetid_t*>(data + name_to); };
inline const transetid_t* sets_end() const { return reinterpret_cast<const transetid_t*>(data + sets_to); };
inline transetid_t* sets_end() { return reinterpret_cast< transetid_t*>(data + sets_to); };
inline unsigned __int16 sets_len() const { return sets_to - name_to; };
inline const transetid_t* sets () const { return reinterpret_cast<const transetid_t*>(data + name_to); };
inline transetid_t* sets () { return reinterpret_cast< transetid_t*>(data + name_to); };
inline const transetid_t* sets_end() const { return reinterpret_cast<const transetid_t*>(data + sets_to); };
inline transetid_t* sets_end() { return reinterpret_cast< transetid_t*>(data + sets_to); };
inline uint16_t sets_len() const { return sets_to - name_to; };
};
#pragma pack(pop)
///
/// Translation sequence index
///
class indexTranSeq : public index<unsigned __int16, unsigned __int32, transeq>
class indexTranSeq : public index<uint16_t, uint32_t, transeq>
{
public:
///
@ -515,7 +652,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexTranSeq(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, transeq>(h) {}
indexTranSeq(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, transeq>(h) {}
///
/// Compares two translation sequences by ID (for searching)
@ -530,8 +667,8 @@ namespace ZRCola {
///
virtual int compare(_In_ const transeq &a, _In_ const transeq &b) const
{
if (a.seq < b.seq) return -1;
else if (a.seq > b.seq) return 1;
if (a.seq < b.seq) return -1;
if (a.seq > b.seq) return 1;
return 0;
}
@ -540,7 +677,7 @@ namespace ZRCola {
///
/// Rank index
///
class indexRank : public index<unsigned __int16, unsigned __int32, transeq>
class indexRank : public index<uint16_t, uint32_t, transeq>
{
public:
///
@ -548,7 +685,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexRank(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, transeq>(h) {}
indexRank(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, transeq>(h) {}
///
/// Compares two translation sets by rank (for searching)
@ -563,8 +700,8 @@ namespace ZRCola {
///
virtual int compare(_In_ const transeq &a, _In_ const transeq &b) const
{
if (a.rank < b.rank) return -1;
else if (a.rank > b.rank) return +1;
if (a.rank < b.rank) return -1;
if (a.rank > b.rank) return +1;
return 0;
}
@ -585,19 +722,12 @@ namespace ZRCola {
if (a.rank < b.rank) return -1;
else if (a.rank > b.rank) return +1;
unsigned __int16
a_name_len = a.name_len(),
b_name_len = b.name_len();
int r = _wcsncoll(a.name(), b.name(), std::min<unsigned __int16>(a_name_len, b_name_len));
if (r != 0) return r;
if (a_name_len < b_name_len) return -1;
else if (a_name_len > b_name_len) return +1;
return 0;
auto &coll = std::use_facet<std::collate<char_t>>(std::locale());
return coll.compare(a.name(), a.name_end(), b.name(), b.name_end());
}
} idxRank; ///< Rank index
std::vector<unsigned __int16> data; ///< Translation sequence data
std::vector<uint16_t> data; ///< Translation sequence data
public:
///
@ -614,226 +744,81 @@ namespace ZRCola {
idxRank .clear();
data .clear();
}
///
/// Writes translation sequence database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Translation sequence database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::transeq_db& db)
{
// Write translation sequence index.
if (stream.fail()) return stream;
stream << db.idxTranSeq;
// Write rank index.
if (stream.fail()) return stream;
stream << db.idxRank;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads translation sequence database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Translation sequence database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::transeq_db& db)
{
// Read translation sequence index.
stream >> db.idxTranSeq;
if (!stream.good()) return stream;
// Read rank index.
stream >> db.idxRank;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<transeq_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> transeq_rec;
};
const ZRCola::recordid_t ZRCola::translation_rec::id = *(ZRCola::recordid_t*)"TRN";
const ZRCola::recordid_t ZRCola::transet_rec ::id = *(ZRCola::recordid_t*)"TSE";
const ZRCola::recordid_t ZRCola::transeq_rec ::id = *(ZRCola::recordid_t*)"TSQ";
///
/// Writes translation database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Translation database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::translation_db &db)
{
// Write translation index.
if (stream.fail()) return stream;
stream << db.idxSrc;
// Write inverse translation index.
if (stream.fail()) return stream;
stream << db.idxDst;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads translation database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Translation database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::translation_db &db)
{
// Read translation index.
stream >> db.idxSrc;
if (!stream.good()) return stream;
// Read inverse translation index.
stream >> db.idxDst;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
///
/// Writes translation set database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Translation set database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::transet_db &db)
{
// Write translation set index.
if (stream.fail()) return stream;
stream << db.idxTranSet;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads translation set database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Translation set database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::transet_db &db)
{
// Read translation set index.
stream >> db.idxTranSet;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
///
/// Writes translation sequence database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Translation sequence database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::transeq_db &db)
{
// Write translation sequence index.
if (stream.fail()) return stream;
stream << db.idxTranSeq;
// Write rank index.
if (stream.fail()) return stream;
stream << db.idxRank;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads translation sequence database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Translation sequence database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::transeq_db &db)
{
// Read translation sequence index.
stream >> db.idxTranSeq;
if (!stream.good()) return stream;
// Read rank index.
stream >> db.idxRank;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
#pragma warning(pop)
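
Note on the hunks above: the stream operators that moved into the classes as friends all share one layout, a 32-bit element count followed by the raw 16-bit elements. The following is a minimal standalone sketch of that pattern only. `write_block`, `read_block` and `round_trip_ok` are hypothetical names, not part of libZRCola, the index serialization is left out, and the data is written in host byte order as in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical helper: writes a 32-bit count followed by the raw elements.
inline std::ostream& write_block(std::ostream& stream, const std::vector<uint16_t>& data)
{
    if (stream.fail()) return stream;
    // The count is stored in 32 bits, so a larger payload cannot be represented.
    if (data.size() > 0xffffffffu) {
        stream.setstate(std::ios_base::failbit);
        return stream;
    }
    uint32_t count = static_cast<uint32_t>(data.size());
    stream.write(reinterpret_cast<const char*>(&count), sizeof(count));
    if (stream.fail()) return stream;
    stream.write(reinterpret_cast<const char*>(data.data()),
        static_cast<std::streamsize>(sizeof(uint16_t) * count));
    return stream;
}

// Hypothetical helper: reads the count, then exactly that many elements.
inline std::istream& read_block(std::istream& stream, std::vector<uint16_t>& data)
{
    uint32_t count = 0;
    stream.read(reinterpret_cast<char*>(&count), sizeof(count));
    if (!stream.good()) return stream;
    data.resize(count);
    if (count)
        stream.read(reinterpret_cast<char*>(data.data()),
            static_cast<std::streamsize>(sizeof(uint16_t) * count));
    return stream;
}

// Usage: round-trips a payload through an in-memory binary stream.
inline bool round_trip_ok(const std::vector<uint16_t>& in)
{
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    std::vector<uint16_t> out;
    write_block(ss, in);
    assert(ss.good()); // Writing to memory is not expected to fail.
    read_block(ss, out);
    return !ss.fail() && out == in;
}
```

Storing the count as a fixed-width `uint32_t` rather than `size_t` keeps the on-disk layout the same on 32-bit and 64-bit builds, which is presumably why the diff keeps the 4G check.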

1
lib/libZRCola/lib/.gitignore vendored Normal file

@ -0,0 +1 @@
/libZRCola.a

3
lib/libZRCola/src/.gitignore vendored Normal file

@ -0,0 +1,3 @@
/*.d
/*.gch
/*.o

View File

@ -9,7 +9,29 @@
const ZRCola::chrcatid_t ZRCola::chrcatid_t::blank = {};
bool ZRCola::character_db::Search(_In_z_ const wchar_t *str, _In_ const std::set<chrcatid_t> &cats, _Inout_ std::map<std::wstring, charrank_t> &hits, _Inout_ std::map<std::wstring, charrank_t> &hits_sub, _In_opt_ bool (__cdecl *fn_abort)(void *cookie), _In_opt_ void *cookie) const
#ifndef _WIN32
_Use_decl_annotations_
size_t ZRCola::wcslen(const char_t *str)
{
for (size_t i = 0; ; ++i)
if (!str[i])
return i;
}
_Use_decl_annotations_
size_t ZRCola::wcsnlen(const char_t *str, size_t count)
{
for (size_t i = 0; ; ++i)
if (i >= count || !str[i])
return i;
}
#endif
_Use_decl_annotations_
bool ZRCola::character_db::Search(const char_t *str, const std::set<chrcatid_t> &cats, std::map<string_t, charrank_t> &hits, std::map<string_t, charrank_t> &hits_sub, bool (__cdecl *fn_abort)(void *cookie), void *cookie) const
{
assert(str);
@ -27,14 +49,14 @@ bool ZRCola::character_db::Search(_In_z_ const wchar_t *str, _In_ const std::set
}
// Get term.
std::wstring term;
if (*str == L'"') {
const wchar_t *str_end = ++str;
string_t term;
if (*str == u'"') {
const auto *str_end = ++str;
for (;;) {
if (*str_end == 0) {
term.assign(str, str_end);
break;
} else if (*str_end == L'"') {
} else if (*str_end == u'"') {
term.assign(str, str_end);
str_end++;
break;
@ -43,7 +65,7 @@ bool ZRCola::character_db::Search(_In_z_ const wchar_t *str, _In_ const std::set
}
str = str_end;
} else {
const wchar_t *str_end = str + 1;
const auto *str_end = str + 1;
for (; *str_end && !iswspace(*str_end); str_end++);
term.assign(str, str_end);
str = str_end;
@ -57,7 +79,7 @@ bool ZRCola::character_db::Search(_In_z_ const wchar_t *str, _In_ const std::set
if (fn_abort && fn_abort(cookie)) return false;
const wchar_t *val;
const char_t *val;
size_t val_len;
if (idxDsc.find(term.c_str(), term.size(), &val, &val_len)) {
@ -66,7 +88,7 @@ bool ZRCola::character_db::Search(_In_z_ const wchar_t *str, _In_ const std::set
if (fn_abort && fn_abort(cookie)) return false;
j = wcsnlen(val + i, val_len - i);
if (cats.find(GetCharCat(val + i, j)) != cats.end()) {
std::wstring c(val + i, j);
string_t c(val + i, j);
auto idx = hits.find(c);
if (idx == hits.end()) {
// New character.
@ -85,7 +107,7 @@ bool ZRCola::character_db::Search(_In_z_ const wchar_t *str, _In_ const std::set
if (fn_abort && fn_abort(cookie)) return false;
j = wcsnlen(val + i, val_len - i);
if (cats.find(GetCharCat(val + i, j)) != cats.end()) {
std::wstring c(val + i, j);
string_t c(val + i, j);
auto idx = hits_sub.find(c);
if (idx == hits_sub.end()) {
// New character.
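
Note on the term parsing in `character_db::Search` above: a term is either a double-quoted run or a run up to the next whitespace. A minimal sketch of that rule in isolation follows. `next_term` is a hypothetical helper, `char16_t` stands in for `char_t` (an assumption), and a small ASCII-only whitespace test replaces `iswspace`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: extracts the search term starting at str and advances
// str past it. Leading whitespace is assumed to have been skipped already.
inline std::u16string next_term(const char16_t*& str)
{
    assert(str);
    // ASCII-only stand-in for iswspace (assumption, for illustration).
    auto is_space = [](char16_t c) {
        return c == u' ' || c == u'\t' || c == u'\n' || c == u'\r';
    };

    std::u16string term;
    if (*str == u'"') {
        // Quoted term: runs to the closing quote or to the end of the string.
        const char16_t* end = ++str;
        while (*end && *end != u'"') ++end;
        term.assign(str, end);
        if (*end == u'"') ++end; // Skip the closing quote.
        str = end;
    } else {
        // Plain term: runs to the next whitespace character.
        const char16_t* end = str;
        while (*end && !is_space(*end)) ++end;
        term.assign(str, end);
        str = end;
    }
    return term;
}
```

An unterminated quote is treated as if the quote closed at the end of the input, which matches the `*str_end == 0` branch in the diff.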

View File

@ -7,3 +7,78 @@
const ZRCola::langid_t ZRCola::langid_t::blank = {};
_Use_decl_annotations_
int ZRCola::CompareString(const char_t* str_a, size_t count_a, const char_t* str_b, size_t count_b)
{
for (size_t i = 0; ; i++) {
if (i >= count_a && i >= count_b) return 0;
else if (i >= count_a && i < count_b) return -1;
else if (i < count_a && i >= count_b) return +1;
else if (str_a[i] < str_b[i]) return -1;
else if (str_a[i] > str_b[i]) return +1;
}
}
_Use_decl_annotations_
inline std::string ZRCola::GetUnicodeDumpA(const char_t* str, size_t count, const char* sep)
{
std::string out;
size_t sep_len = strlen(sep);
size_t dump_len_max = sep_len + 4 + 1;
char* dump;
std::unique_ptr<char[]> dump_obj(dump = new char[dump_len_max]);
if (count && str[0]) {
size_t i = 0;
static const char error[] = "????";
int n = snprintf(dump, dump_len_max, "%04X", str[i++]);
if (n >= 0)
out.insert(out.end(), dump, dump + n);
else
out.insert(out.end(), error, error + std::size(error) - 1);
while (i < count && str[i]) {
n = snprintf(dump, dump_len_max, "%s%04X", sep, str[i++]);
if (n >= 0)
out.insert(out.end(), dump, dump + n);
else {
out.insert(out.end(), sep, sep + sep_len);
out.insert(out.end(), error, error + std::size(error) - 1);
}
}
}
return out;
}
_Use_decl_annotations_
std::wstring ZRCola::GetUnicodeDumpW(const char_t* str, size_t count, const wchar_t* sep)
{
std::wstring out;
size_t sep_len = ::wcslen(sep);
size_t dump_len_max = sep_len + 4 + 1;
wchar_t* dump;
std::unique_ptr<wchar_t[]> dump_obj(dump = new wchar_t[dump_len_max]);
if (count && str[0]) {
size_t i = 0;
static const wchar_t error[] = L"????";
int n = swprintf(dump, dump_len_max, L"%04X", str[i++]);
if (n >= 0)
out.insert(out.end(), dump, dump + n);
else
out.insert(out.end(), error, error + std::size(error) - 1);
while (i < count && str[i]) {
n = swprintf(dump, dump_len_max, L"%s%04X", sep, str[i++]);
if (n >= 0)
out.insert(out.end(), dump, dump + n);
else {
out.insert(out.end(), sep, sep + sep_len);
out.insert(out.end(), error, error + std::size(error) - 1);
}
}
}
return out;
}
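
Note on the functions added above: `CompareString` orders two counted strings by code unit and then by length, and the `GetUnicodeDump*` helpers render each code unit as four hex digits joined by a separator. A compact standalone sketch of both ideas follows. `compare_units` and `dump_units` are hypothetical names, `char16_t` stands in for `char_t` (an assumption), and the dump is simplified to a fixed stack buffer instead of the heap buffer used in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Compares two counted UTF-16 strings code unit by code unit.
// A string that is a proper prefix of the other sorts first.
inline int compare_units(const char16_t* a, size_t count_a, const char16_t* b, size_t count_b)
{
    for (size_t i = 0; ; i++) {
        if (i >= count_a && i >= count_b) return 0;
        if (i >= count_a) return -1;
        if (i >= count_b) return +1;
        if (a[i] < b[i]) return -1;
        if (a[i] > b[i]) return +1;
    }
}

// Renders code units as 4-digit uppercase hex, joined by sep.
// Stops at count or at the first zero code unit, whichever comes first.
inline std::string dump_units(const char16_t* str, size_t count, const char* sep)
{
    assert(sep);
    std::string out;
    char buf[5]; // Four hex digits plus the terminating zero.
    for (size_t i = 0; i < count && str[i]; i++) {
        if (i) out += sep;
        std::snprintf(buf, sizeof(buf), "%04X", static_cast<unsigned>(str[i]));
        out += buf;
    }
    return out;
}
```

Unlike a locale-aware collation, this ordering is purely numeric, so it gives the same result on every platform; that is the property a binary-searched index needs.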

View File

@ -6,7 +6,7 @@
#include "pch.h"
_Use_decl_annotations_
void ZRCola::highlight_db::Highlight(const wchar_t* input, size_t inputMax, std::function<void (hlghtsetid_t set, size_t start, size_t end)> callback) const
void ZRCola::highlight_db::Highlight(const char_t* input, size_t inputMax, std::function<void (hlghtsetid_t set, size_t start, size_t end)> callback) const
{
size_t start = 0;
hlghtsetid_t set = ZRCOLA_HLGHTSETID_DEFAULT;
@ -15,7 +15,7 @@ void ZRCola::highlight_db::Highlight(const wchar_t* input, size_t inputMax, std:
// Find the longest matching highlight at i-th character.
size_t l_match = (size_t)-1;
for (size_t l = 0, r = idxChr.size(), ii = i, j = 0; ii < inputMax && l < r; ii++, j++) {
wchar_t c = input[ii];
auto c = input[ii];
while (l < r) {
// Test the highlight in the middle of the search area.
size_t m = (l + r) / 2;
@ -23,7 +23,7 @@ void ZRCola::highlight_db::Highlight(const wchar_t* input, size_t inputMax, std:
// Get the j-th character of the highlight.
// All highlights that get short on characters are lexically ordered before.
// Thus the j-th character is considered 0.
wchar_t s = idxChr[m].chr_at(j);
auto s = idxChr[m].chr_at(j);
// Do the bisection test.
if (c < s) r = m;

View File

@ -57,12 +57,12 @@ void ZRCola::LangConvert(_In_ LANGID lang_win, _Inout_ ZRCola::langid_t &lang)
#endif
bool ZRCola::langchar_db::IsLocalCharacter(_In_ const wchar_t *chr, _In_ const wchar_t *chr_end, _In_ ZRCola::langid_t lang) const
bool ZRCola::langchar_db::IsLocalCharacter(_In_ const char_t *chr, _In_ const char_t *chr_end, _In_ ZRCola::langid_t lang) const
{
size_t n = chr_end - chr;
assert(n <= 0xffff);
std::unique_ptr<langchar> lc((langchar*)new char[sizeof(langchar) + sizeof(wchar_t)*n]);
lc->langchar::langchar(lang, chr, n);
std::unique_ptr<langchar> lc((langchar*)new char[sizeof(langchar) + sizeof(char_t)*n]);
new (lc.get()) langchar(lang, chr, n);
indexChr::size_type start;
return idxChr.find(*lc, start);
}
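
Note on the hunk above: the explicit constructor call `lc->langchar::langchar(...)` is, as far as I know, accepted only by MSVC as an extension, while placement `new` is the portable way to construct an object in a raw buffer. A minimal sketch of the technique with a variable-length record follows. `record`, `record_deleter` and `make_record` are hypothetical stand-ins, not libZRCola types, and unlike the diff this sketch frees the buffer through a custom deleter so that the `new char[]` is matched by `delete[]`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>

// Hypothetical record: a fixed header followed by len UTF-16 code units.
struct record {
    uint16_t len;

    record(const char16_t* chr, size_t n) : len(static_cast<uint16_t>(n))
    {
        if (chr && n) std::memcpy(data(), chr, sizeof(char16_t) * n);
    }

    // The payload starts right after the header.
    char16_t* data() { return reinterpret_cast<char16_t*>(this + 1); }
    const char16_t* data() const { return reinterpret_cast<const char16_t*>(this + 1); }
};

// Destroys the record, then frees the raw buffer with the matching delete[].
struct record_deleter {
    void operator()(record* r) const
    {
        r->~record();
        delete[] reinterpret_cast<char*>(r);
    }
};

inline std::unique_ptr<record, record_deleter> make_record(const char16_t* chr, size_t n)
{
    assert(n <= 0xffff); // The length field is 16 bits wide.
    // Allocate the header and the payload as one block...
    char* raw = new char[sizeof(record) + sizeof(char16_t) * n];
    // ...and construct the record in place.
    return std::unique_ptr<record, record_deleter>(new (raw) record(chr, n));
}
```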

View File

@ -1,68 +0,0 @@
/*
SPDX-License-Identifier: GPL-3.0-or-later
Copyright © 2015-2022 Amebis
*/
#include "pch.h"
size_t ZRCola::mapping_vector::to_src(_In_ size_t dst) const
{
if (empty()) {
// One-to-one mapping.
return dst;
}
for (size_type l = 0, r = size();;) {
if (l < r) {
size_type m = (l + r) / 2;
const mapping &el = (*this)[m];
if ( dst < el.dst) r = m;
else if (el.dst < dst) l = m + 1;
else {
// An exact match found.
return el.src;
}
} else if (l) {
// We found a map interval.
const mapping &el = (*this)[l - 1];
return el.src + (dst - el.dst);
} else {
// The destination character index is left of the first transformation.
const mapping &el = (*this)[0];
return std::min<size_t>(dst, el.src);
}
}
}
size_t ZRCola::mapping_vector::to_dst(_In_ size_t src) const
{
if (empty()) {
// One-to-one mapping.
return src;
}
for (size_type l = 0, r = size();;) {
if (l < r) {
size_type m = (l + r) / 2;
const mapping &el = (*this)[m];
if ( src < el.src) r = m;
else if (el.src < src) l = m + 1;
else {
// An exact match found.
return el.dst;
}
} else if (l) {
// We found a map interval.
const mapping &el = (*this)[l - 1];
return el.dst + (src - el.src);
} else {
// The source character index is left of the first transformation.
const mapping &el = (*this)[0];
return std::min<size_t>(src, el.dst);
}
}
}
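
Note on the removed file above: `to_src` and `to_dst` find the last mapping entry at or left of the requested index and then offset from it. The same lookup can be sketched with `std::upper_bound` in place of the hand-written bisection. This is an illustrative restatement only: the free function `to_src` and the standalone `mapping` struct below are assumptions, and the vector is assumed to be sorted by `dst` with unique values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct mapping {
    size_t src; // Character index in the source string
    size_t dst; // Character index in the destination string
};

// Maps a destination index back to a source index.
inline size_t to_src(const std::vector<mapping>& map, size_t dst)
{
    if (map.empty()) return dst; // One-to-one mapping.

    assert(std::is_sorted(map.begin(), map.end(),
        [](const mapping& a, const mapping& b) { return a.dst < b.dst; }));

    // First entry whose dst lies strictly right of the requested index.
    auto it = std::upper_bound(map.begin(), map.end(), dst,
        [](size_t value, const mapping& el) { return value < el.dst; });

    if (it == map.begin()) {
        // The index is left of the first transformation.
        return std::min(dst, it->src);
    }

    // Offset from the closest entry at or left of the index.
    const mapping& el = *(it - 1);
    return el.src + (dst - el.dst);
}
```

`to_dst` would be the mirror image, searching by `src` and offsetting `dst`.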

View File

@ -3,7 +3,8 @@
Copyright © 2015-2022 Amebis
*/
#pragma once
#ifndef __PCH_H__
#define __PCH_H__
#include "../../../include/version.h"
@ -17,3 +18,5 @@
#include <algorithm>
#include <cwctype>
#endif

View File

@ -6,7 +6,7 @@
#include "pch.h"
bool ZRCola::chrtag_db::Search(_In_ const std::map<tagid_t, unsigned __int16> &tags, _In_ const character_db &ch_db, _In_ const std::set<chrcatid_t> &cats, _Inout_ std::map<std::wstring, charrank_t> &hits, _In_opt_ bool (__cdecl *fn_abort)(void *cookie), _In_opt_ void *cookie) const
bool ZRCola::chrtag_db::Search(_In_ const std::map<tagid_t, uint16_t> &tags, _In_ const character_db &ch_db, _In_ const std::set<chrcatid_t> &cats, _Inout_ std::map<string_t, charrank_t> &hits, _In_opt_ bool (__cdecl *fn_abort)(void *cookie), _In_opt_ void *cookie) const
{
for (auto tag = tags.cbegin(), tag_end = tags.cend(); tag != tag_end; ++tag) {
if (fn_abort && fn_abort(cookie)) return false;
@ -17,9 +17,9 @@ bool ZRCola::chrtag_db::Search(_In_ const std::map<tagid_t, unsigned __int16> &t
for (size_t i = start; i < end; i++) {
if (fn_abort && fn_abort(cookie)) return false;
const chrtag &ct = idxTag[i];
unsigned __int16 len = ct.chr_len();
uint16_t len = ct.chr_len();
if (cats.find(ch_db.GetCharCat(ct.chr(), len)) != cats.end()) {
std::wstring chr(ct.chr(), len);
string_t chr(ct.chr(), len);
auto idx = hits.find(chr);
if (idx == hits.end()) {
// New character.
@ -37,7 +37,7 @@ bool ZRCola::chrtag_db::Search(_In_ const std::map<tagid_t, unsigned __int16> &t
}
bool ZRCola::tagname_db::Search(_In_z_ const wchar_t *str, _In_ LCID locale, _Inout_ std::map<tagid_t, unsigned __int16> &hits, _In_opt_ bool (__cdecl *fn_abort)(void *cookie), _In_opt_ void *cookie) const
bool ZRCola::tagname_db::Search(_In_z_ const char_t *str, _In_ uint32_t locale, _Inout_ std::map<tagid_t, uint16_t> &hits, _In_opt_ bool (__cdecl *fn_abort)(void *cookie), _In_opt_ void *cookie) const
{
assert(str);
@ -55,14 +55,14 @@ bool ZRCola::tagname_db::Search(_In_z_ const wchar_t *str, _In_ LCID locale, _In
}
// Get name.
std::wstring name;
if (*str == L'"') {
const wchar_t *str_end = ++str;
string_t name;
if (*str == u'"') {
const auto *str_end = ++str;
for (;;) {
if (*str_end == 0) {
name.assign(str, str_end);
break;
} else if (*str_end == L'"') {
} else if (*str_end == u'"') {
name.assign(str, str_end);
str_end++;
break;
@ -71,7 +71,7 @@ bool ZRCola::tagname_db::Search(_In_z_ const wchar_t *str, _In_ LCID locale, _In
}
str = str_end;
} else {
const wchar_t *str_end = str + 1;
const auto *str_end = str + 1;
for (; *str_end && !iswspace(*str_end); str_end++);
name.assign(str, str_end);
str = str_end;
@ -81,8 +81,8 @@ bool ZRCola::tagname_db::Search(_In_z_ const wchar_t *str, _In_ LCID locale, _In
if (fn_abort && fn_abort(cookie)) return false;
// Find the name.
std::unique_ptr<tagname> tn(reinterpret_cast<tagname*>(new char[sizeof(tagname) + sizeof(wchar_t)*name.length()]));
tn->tagname::tagname(0, locale, name.data(), name.length());
std::unique_ptr<tagname> tn(reinterpret_cast<tagname*>(new char[sizeof(tagname) + sizeof(char_t)*name.length()]));
new (tn.get()) tagname(0, locale, name.data(), name.length());
size_t start, end;
if (idxName.find(*tn, start, end)) {
// The name was found.
@ -92,7 +92,7 @@ bool ZRCola::tagname_db::Search(_In_z_ const wchar_t *str, _In_ LCID locale, _In
auto idx = hits.find(val.tag);
if (idx == hits.end()) {
// New tag.
hits.insert(std::make_pair(val.tag, (unsigned __int16)1));
hits.insert(std::make_pair(val.tag, (uint16_t)1));
} else {
// Increase count for existing tag.
idx->second++;
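
Review note on the `tagname` construction above: the hunk replaces the explicit constructor call `tn->tagname::tagname(...)`, which as far as I know only MSVC accepts, with standard placement new. The following is a minimal sketch of that pattern, not the library's code. `record`, `record_deleter` and `make_record` are hypothetical names, and the payload is assumed to sit directly behind the header in one allocation. The custom deleter is an addition of this sketch so that allocation and deallocation are paired explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>

// Hypothetical stand-in for a header followed by `len` UTF-16 code units.
struct record {
    std::uint16_t len;

    record(const char16_t* s, std::uint16_t n) : len(n)
    {
        assert(s || n == 0);
        if (n) std::memcpy(chars(), s, sizeof(char16_t) * n);
    }

    // The payload lives directly behind the header in the same allocation.
    char16_t* chars() { return reinterpret_cast<char16_t*>(this + 1); }
    const char16_t* chars() const { return reinterpret_cast<const char16_t*>(this + 1); }
};

// Destroys the object and releases the raw storage it was built in.
struct record_deleter {
    void operator()(record* p) const noexcept
    {
        p->~record();
        ::operator delete(static_cast<void*>(p));
    }
};

using record_ptr = std::unique_ptr<record, record_deleter>;

inline record_ptr make_record(const char16_t* s, std::uint16_t n)
{
    // Reserve room for the header plus n trailing code units.
    void* raw = ::operator new(sizeof(record) + sizeof(char16_t) * n);
    // Placement new runs the constructor in the storage we just reserved.
    return record_ptr(new (raw) record(s, n));
}
```

Calling `make_record(u"tag", 3)` yields an owning pointer whose `chars()` range holds the three code units; the storage is released when the pointer goes out of scope.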

View File

@ -6,7 +6,7 @@
#include "pch.h"
void ZRCola::translation_db::Translate(_In_ transetid_t set, _In_z_count_(inputMax) const wchar_t* input, _In_ size_t inputMax, _Out_ std::wstring &output, _Out_opt_ std::vector<mapping>* map) const
void ZRCola::translation_db::Translate(_In_ transetid_t set, _In_z_count_(inputMax) const char_t* input, _In_ size_t inputMax, _Out_ string_t &output, _Out_opt_ std::vector<mapping>* map) const
{
assert(input || inputMax == 0);
@ -28,7 +28,7 @@ void ZRCola::translation_db::Translate(_In_ transetid_t set, _In_z_count_(inputM
// Find the longest matching translation at i-th character.
size_t l_match = (size_t)-1;
for (size_t l = l_set, r = r_set, ii = i, j = 0; ii < inputMax && l < r; ii++, j++) {
wchar_t c = input[ii];
auto c = input[ii];
while (l < r) {
// Test the translation in the middle of the search area.
size_t m = (l + r) / 2;
@ -36,7 +36,7 @@ void ZRCola::translation_db::Translate(_In_ transetid_t set, _In_z_count_(inputM
// Get the j-th character of the translation.
// All translations that get short on characters are lexically ordered before.
// Thus the j-th character is considered 0.
wchar_t s = idxSrc[m].src_at(j);
auto s = idxSrc[m].src_at(j);
// Do the bisection test.
if (c < s) r = m;
@ -84,7 +84,7 @@ void ZRCola::translation_db::Translate(_In_ transetid_t set, _In_z_count_(inputM
}
void ZRCola::translation_db::TranslateInv(_In_ transetid_t set, _In_z_count_(inputMax) const wchar_t* input, _In_ size_t inputMax, _In_opt_ const langchar_db *lc_db, _In_opt_ langid_t lang, _Out_ std::wstring &output, _Out_opt_ std::vector<mapping>* map) const
void ZRCola::translation_db::TranslateInv(_In_ transetid_t set, _In_z_count_(inputMax) const char_t* input, _In_ size_t inputMax, _In_opt_ const langchar_db *lc_db, _In_opt_ langid_t lang, _Out_ string_t &output, _Out_opt_ std::vector<mapping>* map) const
{
assert(input || inputMax == 0);
@ -106,7 +106,7 @@ void ZRCola::translation_db::TranslateInv(_In_ transetid_t set, _In_z_count_(inp
// Find the longest matching inverse translation at i-th character.
size_t l_match = (size_t)-1;
for (size_t l = l_set, r = r_set, ii = i, j = 0; ii < inputMax && l < r; ii++, j++) {
wchar_t c = input[ii];
auto c = input[ii];
while (l < r) {
// Test the inverse translation in the middle of the search area.
size_t m = (l + r) / 2;
@ -114,7 +114,7 @@ void ZRCola::translation_db::TranslateInv(_In_ transetid_t set, _In_z_count_(inp
// Get the j-th character of the inverse translation.
// All inverse translations that get short on characters are lexically ordered before.
// Thus the j-th character is considered 0.
wchar_t s = idxDst[m].dst_at(j);
auto s = idxDst[m].dst_at(j);
// Do the bisection test.
if (c < s) r = m;
@ -147,7 +147,7 @@ void ZRCola::translation_db::TranslateInv(_In_ transetid_t set, _In_z_count_(inp
if (l_match < r_set) {
// The saved inverse translation was an exact match.
const translation &trans = idxDst[l_match];
if (trans.src_len() && trans.src()[0] != L'#' && (!lc_db || !lc_db->IsLocalCharacter(trans.dst(), trans.dst_end(), lang))) {
if (trans.src_len() && trans.src()[0] != u'#' && (!lc_db || !lc_db->IsLocalCharacter(trans.dst(), trans.dst_end(), lang))) {
// Append source sequence.
output.append(trans.src(), trans.src_end());
i += trans.dst_len();
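
Review note on the matching loops above: both `Translate` and `TranslateInv` narrow a sorted range of translations one input character at a time and remember the last exact match. The sketch below illustrates that idea on a plain sorted `std::vector<std::u16string>`; it is a simplified stand-in, not the library's index. `longest_match` and `npos` are hypothetical names, and the keys are assumed to be unique, sorted by code unit, and free of embedded zero code units, as is the input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t npos = static_cast<std::size_t>(-1);

// Returns the index of the longest key that is a prefix of input starting
// at pos, or npos when no key matches. Keys must be sorted and unique.
inline std::size_t longest_match(const std::vector<std::u16string>& keys,
                                 const std::u16string& input, std::size_t pos)
{
    assert(pos <= input.size());
    std::size_t l = 0, r = keys.size(), best = npos;
    for (std::size_t j = 0; pos + j < input.size() && l < r; ++j) {
        const char16_t c = input[pos + j];
        // Keys that ran out of characters sort first, so their j-th
        // character is treated as 0.
        auto at = [&](std::size_t m) -> char16_t {
            return j < keys[m].size() ? keys[m][j] : char16_t(0);
        };
        // Lower bound: first key in [l, r) whose j-th character is >= c.
        std::size_t lo = l, hi = r;
        while (lo < hi) {
            std::size_t m = lo + (hi - lo) / 2;
            if (at(m) < c) lo = m + 1; else hi = m;
        }
        const std::size_t first = lo;
        // Upper bound: first key in [first, r) whose j-th character is > c.
        hi = r;
        while (lo < hi) {
            std::size_t m = lo + (hi - lo) / 2;
            if (at(m) <= c) lo = m + 1; else hi = m;
        }
        l = first;
        r = lo;
        // The shortest surviving key sorts first; if it ends here, it is an
        // exact match of length j + 1.
        if (l < r && keys[l].size() == j + 1) best = l;
    }
    return best;
}
```

With the keys `a`, `ab`, `abc`, `b`, `bd` and the input `abd`, the search from position 0 keeps `a`, then `ab`, and stops once `d` no longer matches any remaining key, so `ab` is reported as the longest match.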

2
lib/libZRCola/test/.gitignore vendored Normal file

@ -0,0 +1,2 @@
/*.d
/test

129
lib/libZRCola/test/test.cpp Normal file

@ -0,0 +1,129 @@
#include <zrcola/idrec.h>
#include <fstream>
#include <iostream>
#include <typeinfo>
using namespace std;
using namespace ZRCola;
translation_db t_db;
transet_db ts_db;
transeq_db tsq_db;
langchar_db lc_db;
language_db lang_db;
character_db chr_db;
chrcat_db cc_db;
chrtag_db ct_db;
tagname_db tn_db;
highlight_db h_db;
static void load_database()
{
fstream dat("../../../output/data/ZRCola.zrcdb", ios_base::in | ios_base::binary);
if (!dat.good())
throw runtime_error("ZRCola.zrcdb not found or cannot be opened.");
if (!stdex::idrec::find<recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN>(dat, ZRCOLA_DB_ID, sizeof(recordid_t)))
throw runtime_error("ZRCola.zrcdb is not a valid ZRCola database.");
recordsize_t size;
dat.read((char*)&size, sizeof(recordsize_t));
if (dat.good()) {
bool has_translation_data = false;
for (;;) {
recordid_t id;
if (!stdex::idrec::read_id(dat, id, size)) break;
if (id == translation_rec::id()) {
dat >> translation_rec(t_db);
if (dat.good()) {
has_translation_data = true;
} else {
cerr << "Error reading translation data from ZRCola.zrcdb.\n";
t_db.clear();
}
} else if (id == transet_rec::id()) {
dat >> transet_rec(ts_db);
if (!dat.good()) {
cerr << "Error reading translation set data from ZRCola.zrcdb.\n";
ts_db.clear();
}
} else if (id == transeq_rec::id()) {
dat >> transeq_rec(tsq_db);
if (!dat.good()) {
cerr << "Error reading translation sequence data from ZRCola.zrcdb.\n";
tsq_db.clear();
}
} else if (id == langchar_rec::id()) {
dat >> langchar_rec(lc_db);
if (!dat.good()) {
cerr << "Error reading language character data from ZRCola.zrcdb.\n";
lc_db.clear();
}
} else if (id == language_rec::id()) {
dat >> language_rec(lang_db);
if (!dat.good()) {
cerr << "Error reading language data from ZRCola.zrcdb.\n";
lang_db.clear();
}
} else if (id == character_rec::id()) {
dat >> character_rec(chr_db);
if (!dat.good()) {
cerr << "Error reading character data from ZRCola.zrcdb.\n";
chr_db.clear();
}
} else if (id == chrcat_rec::id()) {
dat >> chrcat_rec(cc_db);
if (!dat.good()) {
cerr << "Error reading character category data from ZRCola.zrcdb.\n";
cc_db.clear();
}
} else if (id == chrtag_rec::id()) {
dat >> chrtag_rec(ct_db);
if (!dat.good()) {
cerr << "Error reading character tag data from ZRCola.zrcdb.\n";
ct_db.clear();
}
} else if (id == tagname_rec::id()) {
dat >> tagname_rec(tn_db);
if (!dat.good()) {
cerr << "Error reading tag name data from ZRCola.zrcdb.\n";
tn_db.clear();
}
} else if (id == highlight_rec::id()) {
dat >> highlight_rec(h_db);
if (!dat.good()) {
cerr << "Error reading highlight data from ZRCola.zrcdb.\n";
h_db.clear();
}
} else
stdex::idrec::ignore<recordsize_t, ZRCOLA_RECORD_ALIGN>(dat);
}
if (!has_translation_data)
throw runtime_error("ZRCola.zrcdb has no translation data.");
}
}
int main()
{
try {
load_database();
u16string output;
vector<mapping> map;
t_db.Translate(ZRCOLA_TRANSETID_DEFAULT, u"", -1, output, &map);
if (!output.empty()) throw runtime_error("Empty string translated to nonempty output.");
if (!map.empty()) throw runtime_error("Empty string translation produced non-empty map.");
t_db.Translate(ZRCOLA_TRANSETID_DEFAULT, u"To je test.", -1, output, &map);
if (output != u"T  ťᵉⓢṭ.") throw runtime_error("Unexpected translation.");
cout << "Passed\n";
return 0;
} catch (exception &ex) {
cerr << typeid(ex).name() << ": " << ex.what() << endl;
return 1;
}
}

View File

@ -7,10 +7,11 @@
#include <zrcola/common.h>
#include <stdex/idrec>
#include <stdex/idrec.hpp>
#include <assert.h>
#include <algorithm>
#include <istream>
#include <locale>
#include <ostream>
#include <vector>
@ -24,7 +25,7 @@ namespace ZRCola {
///
/// Character group ID
///
typedef unsigned __int16 chrgrpid_t;
typedef uint16_t chrgrpid_t;
///
@ -39,13 +40,13 @@ namespace ZRCola {
///
struct chrgrp {
public:
chrgrpid_t grp; ///< Character group ID
unsigned __int16 rank; ///< Character group rank
chrgrpid_t grp; ///< Character group ID
uint16_t rank; ///< Character group rank
protected:
unsigned __int16 name_to; ///< Character group name end in \c data
unsigned __int16 chrlst_to; ///< Character list end in \c data
wchar_t data[]; ///< Character group name, character list, bit vector if particular character is displayed initially
uint16_t name_to; ///< Character group name end in \c data
uint16_t chrlst_to; ///< Character list end in \c data
char_t data[]; ///< Character group name, character list, bit vector if particular character is displayed initially
public:
///
@ -60,49 +61,49 @@ namespace ZRCola {
/// \param[in] chrshow Binary vector which particular character is displayed initially
///
inline chrgrp(
_In_opt_ chrgrpid_t grp = 0,
_In_opt_ unsigned __int16 rank = 0,
_In_opt_z_count_(name_len) const wchar_t *name = NULL,
_In_opt_ size_t name_len = 0,
_In_opt_z_count_(chrlst_len) const wchar_t *chrlst = NULL,
_In_opt_ size_t chrlst_len = 0,
_In_opt_count_x_((chrlst_len + 15)/16) const unsigned __int16 *chrshow = NULL)
_In_opt_ chrgrpid_t grp = 0,
_In_opt_ uint16_t rank = 0,
_In_opt_z_count_(name_len) const char_t *name = NULL,
_In_opt_ size_t name_len = 0,
_In_opt_z_count_(chrlst_len) const char_t *chrlst = NULL,
_In_opt_ size_t chrlst_len = 0,
_In_opt_count_x_((chrlst_len + 15)/16) const uint16_t *chrshow = NULL)
{
this->grp = grp;
this->rank = rank;
this->name_to = static_cast<unsigned __int16>(name_len);
if (name && name_len) memcpy(this->data, name, sizeof(wchar_t)*name_len);
this->chrlst_to = static_cast<unsigned __int16>(this->name_to + chrlst_len);
this->name_to = static_cast<uint16_t>(name_len);
if (name && name_len) memcpy(this->data, name, sizeof(char_t)*name_len);
this->chrlst_to = static_cast<uint16_t>(this->name_to + chrlst_len);
if (chrlst && chrshow && chrlst_len) {
memcpy(this->data + this->name_to, chrlst, sizeof(wchar_t)*chrlst_len);
memcpy(this->data + this->name_to, chrlst, sizeof(char_t)*chrlst_len);
memcpy(this->data + this->chrlst_to, chrshow, (chrlst_len + sizeof(*data)*8 - 1)/8);
}
}
inline const wchar_t* name () const { return data; };
inline wchar_t* name () { return data; };
inline const wchar_t* name_end() const { return data + name_to; };
inline wchar_t* name_end() { return data + name_to; };
inline unsigned __int16 name_len() const { return name_to; };
inline const char_t* name () const { return data; };
inline char_t* name () { return data; };
inline const char_t* name_end() const { return data + name_to; };
inline char_t* name_end() { return data + name_to; };
inline uint16_t name_len() const { return name_to; };
inline const wchar_t* chrlst () const { return data + name_to; };
inline wchar_t* chrlst () { return data + name_to; };
inline const wchar_t* chrlst_end() const { return data + chrlst_to; };
inline wchar_t* chrlst_end() { return data + chrlst_to; };
inline unsigned __int16 chrlst_len() const { return chrlst_to - name_to; };
inline const char_t* chrlst () const { return data + name_to; };
inline char_t* chrlst () { return data + name_to; };
inline const char_t* chrlst_end() const { return data + chrlst_to; };
inline char_t* chrlst_end() { return data + chrlst_to; };
inline uint16_t chrlst_len() const { return chrlst_to - name_to; };
inline const unsigned __int16* chrshow () const { return reinterpret_cast<const unsigned __int16*>(data + chrlst_to ); };
inline unsigned __int16* chrshow () { return reinterpret_cast< unsigned __int16*>(data + chrlst_to ); };
inline const unsigned __int16* chrshow_end() const { return reinterpret_cast<const unsigned __int16*>(data + chrlst_to + chrshow_len()); };
inline unsigned __int16* chrshow_end() { return reinterpret_cast< unsigned __int16*>(data + chrlst_to + chrshow_len()); };
inline unsigned __int16 chrshow_len() const { return (chrlst_len() + sizeof(*data)*8 - 1)/(sizeof(*data)*8); };
inline const uint16_t* chrshow () const { return reinterpret_cast<const uint16_t*>(data + chrlst_to ); };
inline uint16_t* chrshow () { return reinterpret_cast< uint16_t*>(data + chrlst_to ); };
inline const uint16_t* chrshow_end() const { return reinterpret_cast<const uint16_t*>(data + chrlst_to + chrshow_len()); };
inline uint16_t* chrshow_end() { return reinterpret_cast< uint16_t*>(data + chrlst_to + chrshow_len()); };
inline uint16_t chrshow_len() const { return (chrlst_len() + sizeof(*data)*8 - 1)/(sizeof(*data)*8); };
};
#pragma pack(pop)
///
/// Rank index
///
class indexRank : public index<unsigned __int16, unsigned __int32, chrgrp>
class indexRank : public index<uint16_t, uint32_t, chrgrp>
{
public:
///
@ -110,7 +111,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexRank(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, chrgrp>(h) {}
indexRank(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, chrgrp>(h) {}
///
/// Compares two character groups by rank (for searching)
@ -125,8 +126,8 @@ namespace ZRCola {
///
virtual int compare(_In_ const chrgrp &a, _In_ const chrgrp &b) const
{
if (a.rank < b.rank) return -1;
else if (a.rank > b.rank) return +1;
if (a.rank < b.rank) return -1;
if (a.rank > b.rank) return +1;
return 0;
}
@ -147,97 +148,88 @@ namespace ZRCola {
if (a.rank < b.rank) return -1;
else if (a.rank > b.rank) return +1;
unsigned __int16
a_name_len = a.name_len(),
b_name_len = b.name_len();
int r = _wcsncoll(a.name(), b.name(), std::min<unsigned __int16>(a_name_len, b_name_len));
if (r != 0) return r;
if (a_name_len < b_name_len) return -1;
else if (a_name_len > b_name_len) return +1;
return 0;
auto &coll = std::use_facet<std::collate<char_t>>(std::locale());
return coll.compare(a.name(), a.name_end(), b.name(), b.name_end());
}
} idxRank; ///< Rank index
std::vector<unsigned __int16> data; ///< Character groups data
std::vector<uint16_t> data; ///< Character groups data
public:
///
/// Constructs the database
///
inline chrgrp_db() : idxRank(data) {}
///
/// Writes character group database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Character group database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const chrgrp_db& db)
{
// Write rank index.
if (stream.fail()) return stream;
stream << db.idxRank;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads character group database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Character group database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ chrgrp_db& db)
{
// Read rank index.
stream >> db.idxRank;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<chrgrp_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> chrgrp_rec;
typedef stdex::idrec::record<chrgrp_db, recordid_t, 0x524743 /*"CGR"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> chrgrp_rec;
};
const ZRCola::recordid_t ZRCola::chrgrp_rec::id = *(ZRCola::recordid_t*)"CGR";
///
/// Writes character group database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Character group database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::chrgrp_db &db)
{
// Write rank index.
if (stream.fail()) return stream;
stream << db.idxRank;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads character group database from a stream
///
/// \param[in ] stream Input stream
/// \param[out] db Character group database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::chrgrp_db &db)
{
// Read rank index.
stream >> db.idxRank;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
#pragma warning(pop)
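
Review note on the stream operators moved into `chrgrp_db` above: after the index, the data is stored as a 32-bit element count followed by the raw `uint16_t` elements. A minimal sketch of that layout follows, assuming the host byte order is used as-is, which is what the `write` and `read` calls in the diff appear to do. `write_block`, `read_block` and `round_trip` are hypothetical helper names.

```cpp
#include <cstdint>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <vector>

// Writes a 32-bit count followed by the raw elements.
inline std::ostream& write_block(std::ostream& s, const std::vector<std::uint16_t>& data)
{
    if (s.fail()) return s;
    // Refuse data that does not fit the 32-bit count (the "4G check").
    if (data.size() > std::numeric_limits<std::uint32_t>::max()) {
        s.setstate(std::ios_base::failbit);
        return s;
    }
    const std::uint32_t count = static_cast<std::uint32_t>(data.size());
    s.write(reinterpret_cast<const char*>(&count), sizeof(count));
    if (s.fail()) return s;
    s.write(reinterpret_cast<const char*>(data.data()),
            static_cast<std::streamsize>(sizeof(std::uint16_t)) * count);
    return s;
}

// Reads the count, then that many elements; leaves data empty for count 0.
inline std::istream& read_block(std::istream& s, std::vector<std::uint16_t>& data)
{
    std::uint32_t count = 0;
    s.read(reinterpret_cast<char*>(&count), sizeof(count));
    if (!s.good()) return s;
    data.resize(count);
    if (count)
        s.read(reinterpret_cast<char*>(data.data()),
               static_cast<std::streamsize>(sizeof(std::uint16_t)) * count);
    return s;
}

// Serializes to an in-memory stream and reads the result back.
inline std::vector<std::uint16_t> round_trip(const std::vector<std::uint16_t>& in)
{
    std::stringstream buf(std::ios_base::in | std::ios_base::out | std::ios_base::binary);
    write_block(buf, in);
    std::vector<std::uint16_t> out;
    read_block(buf, out);
    return out;
}
```

Because the count and elements are written in host byte order, a file produced this way is only portable between machines of the same endianness.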

View File

@ -7,7 +7,7 @@
#include <zrcola/common.h>
#include <stdex/idrec>
#include <stdex/idrec.hpp>
#include <wxex/common.h>
#pragma warning(push)
@ -39,20 +39,20 @@ namespace ZRCola {
struct keyseq {
public:
enum modifiers_t {
SHIFT = 1<<0, ///< SHIFT key was pressed
CTRL = 1<<1, ///< CTRL key was pressed
ALT = 1<<2, ///< ALT key was pressed
SHIFT = 1<<0, ///< SHIFT key was pressed
CTRL = 1<<1, ///< CTRL key was pressed
ALT = 1<<2, ///< ALT key was pressed
};
struct key_t {
wchar_t key; ///< Key
unsigned __int16 modifiers; ///< Modifiers (bitwise combination of SHIFT, CTRL and ALT)
char_t key; ///< Key
uint16_t modifiers; ///< Modifiers (bitwise combination of SHIFT, CTRL and ALT)
};
protected:
unsigned __int16 chr_to; ///< Character end in \c data
unsigned __int16 seq_to; ///< Key sequence end in \c data
wchar_t data[]; ///< Character and key sequence
uint16_t chr_to; ///< Character end in \c data
uint16_t seq_to; ///< Key sequence end in \c data
char_t data[]; ///< Character and key sequence
public:
///
@ -64,28 +64,28 @@ namespace ZRCola {
/// \param[in] chr_len Number of UTF-16 characters in \p chr
///
inline keyseq(
_In_opt_count_(seq_count) const key_t *seq = NULL,
_In_opt_ size_t seq_count = 0,
_In_opt_z_count_(chr_len) const wchar_t *chr = NULL,
_In_opt_ size_t chr_len = 0)
_In_opt_count_(seq_count) const key_t *seq = NULL,
_In_opt_ size_t seq_count = 0,
_In_opt_z_count_(chr_len) const char_t *chr = NULL,
_In_opt_ size_t chr_len = 0)
{
this->chr_to = static_cast<unsigned __int16>(chr_len);
if (chr && chr_len) memcpy(this->data, chr, sizeof(wchar_t)*chr_len);
this->seq_to = static_cast<unsigned __int16>(this->chr_to + seq_count * sizeof(key_t) / sizeof(*data));
this->chr_to = static_cast<uint16_t>(chr_len);
if (chr && chr_len) memcpy(this->data, chr, sizeof(char_t)*chr_len);
this->seq_to = static_cast<uint16_t>(this->chr_to + seq_count * sizeof(key_t) / sizeof(*data));
if (seq && seq_count) memcpy(this->data + this->chr_to, seq, sizeof(key_t)*seq_count);
}
inline const wchar_t* chr () const { return data; };
inline wchar_t* chr () { return data; };
inline const wchar_t* chr_end() const { return data + chr_to; };
inline wchar_t* chr_end() { return data + chr_to; };
inline unsigned __int16 chr_len() const { return chr_to; };
inline const char_t* chr () const { return data; };
inline char_t* chr () { return data; };
inline const char_t* chr_end() const { return data + chr_to; };
inline char_t* chr_end() { return data + chr_to; };
inline uint16_t chr_len() const { return chr_to; };
inline const key_t* seq () const { return reinterpret_cast<const key_t*>(data + chr_to); };
inline key_t* seq () { return reinterpret_cast< key_t*>(data + chr_to); };
inline const key_t* seq_end() const { return reinterpret_cast<const key_t*>(data + seq_to); };
inline key_t* seq_end() { return reinterpret_cast< key_t*>(data + seq_to); };
inline unsigned __int16 seq_len() const { return (seq_to - chr_to) * sizeof(*data) / sizeof(key_t); };
inline const key_t* seq () const { return reinterpret_cast<const key_t*>(data + chr_to); };
inline key_t* seq () { return reinterpret_cast< key_t*>(data + chr_to); };
inline const key_t* seq_end() const { return reinterpret_cast<const key_t*>(data + seq_to); };
inline key_t* seq_end() { return reinterpret_cast< key_t*>(data + seq_to); };
inline uint16_t seq_len() const { return (seq_to - chr_to) * sizeof(*data) / sizeof(key_t); };
///
/// Compares two key sequences
@ -118,7 +118,7 @@ namespace ZRCola {
///
/// Character index
///
class indexChr : public index<unsigned __int16, unsigned __int32, keyseq>
class indexChr : public index<uint16_t, uint32_t, keyseq>
{
public:
///
@ -126,7 +126,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexChr(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, keyseq>(h) {}
indexChr(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, keyseq>(h) {}
///
/// Compares two key sequences by character (for searching)
@ -174,7 +174,7 @@ namespace ZRCola {
///
/// Key index
///
class indexKey : public index<unsigned __int16, unsigned __int32, keyseq>
class indexKey : public index<uint16_t, uint32_t, keyseq>
{
public:
///
@ -182,7 +182,7 @@ namespace ZRCola {
///
/// \param[in] h Reference to vector holding the data
///
indexKey(_In_ std::vector<unsigned __int16> &h) : index<unsigned __int16, unsigned __int32, keyseq>(h) {}
indexKey(_In_ std::vector<uint16_t> &h) : index<uint16_t, uint32_t, keyseq>(h) {}
///
/// Compares two key sequences by key (for searching)
@ -226,7 +226,7 @@ namespace ZRCola {
}
} idxKey; ///< Key index
std::vector<unsigned __int16> data; ///< Key sequences data
std::vector<uint16_t> data; ///< Key sequences data
public:
///
@ -270,86 +270,84 @@ namespace ZRCola {
wxString str;
return GetSequenceAsText(seq, seq_len, str) ? str : wxEmptyString;
}
///
/// Writes key sequence database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Key sequence database
///
/// \returns The stream \p stream
///
friend std::ostream& operator <<(_In_ std::ostream& stream, _In_ const keyseq_db& db)
{
// Write character index.
if (stream.fail()) return stream;
stream << db.idxChr;
// Write key index.
if (stream.fail()) return stream;
stream << db.idxKey;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
uint32_t count = (uint32_t)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads key sequence database from a stream
///
/// \param[in] stream Input stream
/// \param[out] db Key sequence database
///
/// \returns The stream \p stream
///
friend std::istream& operator >>(_In_ std::istream& stream, _Out_ keyseq_db& db)
{
// Read character index.
stream >> db.idxChr;
if (!stream.good()) return stream;
// Read key index.
stream >> db.idxKey;
if (!stream.good()) return stream;
// Read data count.
uint32_t count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(uint16_t) * static_cast<std::streamsize>(count));
}
else
db.data.clear();
return stream;
}
};
typedef stdex::idrec::record<keyseq_db, recordid_t, recordsize_t, ZRCOLA_RECORD_ALIGN> keyseq_rec;
typedef stdex::idrec::record<keyseq_db, recordid_t, 0x59454b /*"KEY"*/, recordsize_t, ZRCOLA_RECORD_ALIGN> keyseq_rec;
};
const ZRCola::recordid_t ZRCola::keyseq_rec::id = *(ZRCola::recordid_t*)"KEY";
///
/// Writes key sequence database to a stream
///
/// \param[in] stream Output stream
/// \param[in] db Key sequence database
///
/// \returns The stream \p stream
///
inline std::ostream& operator <<(_In_ std::ostream& stream, _In_ const ZRCola::keyseq_db &db)
{
// Write character index.
if (stream.fail()) return stream;
stream << db.idxChr;
// Write key index.
if (stream.fail()) return stream;
stream << db.idxKey;
// Write data count.
auto data_count = db.data.size();
#if defined(_WIN64) || defined(__x86_64__) || defined(__ppc64__)
// 4G check
if (data_count > 0xffffffff) {
stream.setstate(std::ios_base::failbit);
return stream;
}
#endif
if (stream.fail()) return stream;
unsigned __int32 count = (unsigned __int32)data_count;
stream.write((const char*)&count, sizeof(count));
// Write data.
if (stream.fail()) return stream;
stream.write((const char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
return stream;
}
///
/// Reads key sequence database from a stream
///
/// \param[in] stream Input stream
/// \param[out] db Key sequence database
///
/// \returns The stream \p stream
///
inline std::istream& operator >>(_In_ std::istream& stream, _Out_ ZRCola::keyseq_db &db)
{
// Read character index.
stream >> db.idxChr;
if (!stream.good()) return stream;
// Read key index.
stream >> db.idxKey;
if (!stream.good()) return stream;
// Read data count.
unsigned __int32 count;
stream.read((char*)&count, sizeof(count));
if (!stream.good()) return stream;
if (count) {
// Read data.
db.data.resize(count);
stream.read((char*)db.data.data(), sizeof(unsigned __int16)*static_cast<std::streamsize>(count));
} else
db.data.clear();
return stream;
}
#pragma warning(pop)
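
Review note on the record IDs above: the removed definitions derived the ID by reading the bytes of a string literal through a cast (`*(ZRCola::recordid_t*)"KEY"`), which depends on the host byte order, while the new typedefs pass numeric constants such as `0x59454b`. The sketch below shows how those constants correspond to the three-letter tags when the first letter is the least significant byte. `pack_id` is a hypothetical helper, and a 32-bit `recordid_t` is an assumption, since its definition is not part of this diff.

```cpp
#include <cstdint>

// Packs a three-letter tag into a 32-bit ID, first letter in the lowest byte.
constexpr std::uint32_t pack_id(const char (&tag)[4])
{
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(tag[0]))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(tag[1])) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(tag[2])) << 16);
}

// The constants from the diff match the tags named in their comments.
static_assert(pack_id("CGR") == 0x524743, "chrgrp record ID");
static_assert(pack_id("KEY") == 0x59454b, "keyseq record ID");
```

Spelling the ID as a compile-time constant keeps it identical on big-endian and little-endian hosts, which the cast-based definition could not guarantee.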

1
lib/oatpp Submodule

@ -0,0 +1 @@
Subproject commit 14ca5e55c8a7c8265b090e1704463c7ab42ca2ee

1
lib/oatpp-swagger Submodule

@ -0,0 +1 @@
Subproject commit ed5251c580e2e98beb50d818bcea8ddc91419d8c

@ -1 +1 @@
Subproject commit 1fbff95bd7fecf80f958c15ab7d0eecdbe35e4cb
Subproject commit c1616b032e9597b072de6fae634ef242a6a67b1d

@ -1 +1 @@
Subproject commit e3a59d1118053ed48dc15b83d17125da87c434dd
Subproject commit 79ec08365068ab6e03b06caef13de0ce6b06fcd5

Binary file not shown.