ICU stuff #115
Do we actually need that whole m4 script? Can we just ask
Agreed. Just do it like https://github.com/apertium/lexd/blob/master/configure.ac#L19-L20 And I see
Also, I don't think a normalization tool belongs in lttoolbox - that's something we probably want to adjust separately, so a repo of its own would be nice.
So far it looks like
@@ -113,20 +100,20 @@ class AttCompiler

   Alphabet alphabet;
   /** All non-multicharacter symbols. */
-  set<wchar_t> letters;
+  set<UChar> letters;
Future optimization: flat_set or sorted_vector
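A rough sketch of what the sorted-vector variant of that optimization could look like. `SortedVectorSet` is a hypothetical name, not anything in lttoolbox, and it assumes `UChar` is `char16_t` (true for recent ICU in C++, but worth checking against the ICU version in use). The idea is only that a small, mostly-read set of code units can live in one contiguous sorted array and be searched with binary search instead of walking tree nodes.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for set<UChar>; assumes UChar == char16_t.
class SortedVectorSet {
public:
  // Insert while keeping the vector sorted and duplicate-free.
  // Returns true if the value was newly added.
  bool insert(char16_t c) {
    auto it = std::lower_bound(items_.begin(), items_.end(), c);
    if (it != items_.end() && *it == c) {
      return false;  // already present
    }
    items_.insert(it, c);
    return true;
  }

  // Membership test by binary search over contiguous storage.
  bool contains(char16_t c) const {
    return std::binary_search(items_.begin(), items_.end(), c);
  }

  std::size_t size() const { return items_.size(); }

private:
  std::vector<char16_t> items_;
};
```

Insertion is O(n) here because elements shift, so this only pays off when the set is filled once and queried many times.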
LGTM
@@ -179,7 +179,7 @@ Compiler::procAlphabet()
       bool space = true;
       for(unsigned int i = 0; i < letters.length(); i++)
       {
-        if(!isspace(letters.at(i)))
+        if(!u_isspace(letters.at(i)))
Future work: None of our codebases should have .at().
(really this is an error state either way, but I think this is slightly more correct)
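For the `.at()` point, one possible shape is a range-based loop, which needs neither `.at()` nor an index. This is a sketch under assumptions: `UString` is written out as `std::basic_string<char16_t>` (assuming `UChar` is `char16_t`), `all_space` is a made-up helper name, and `is_space_stub` is a deliberately tiny stand-in for ICU's `u_isspace` so the snippet compiles without ICU headers.

```cpp
#include <string>

// Mirrors the UString alias from this PR, assuming UChar == char16_t.
using UString = std::basic_string<char16_t>;

// Hypothetical stand-in for ICU's u_isspace; covers only a few ASCII
// whitespace characters, unlike the real function.
inline bool is_space_stub(char16_t c) {
  return c == u' ' || c == u'\t' || c == u'\n' || c == u'\r';
}

// Returns true when every code unit is whitespace (also true for an
// empty string). No bounds-checked access is involved.
inline bool all_space(const UString& letters) {
  for (char16_t c : letters) {
    if (!is_space_stub(c)) {
      return false;
    }
  }
  return true;
}
```

In real code the stub would be swapped for `u_isspace` from ICU's `unicode/uchar.h`.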
ICU changes (closes #81)
- std::wstring with UString (= std::basic_string<UChar>)
- InputFile wrapper to handle UTF-8 streams with nulls

efficiency, readability, and code style changes
- Ltstr and string_to_wostream
- int32_t rather than int
- Transducer
- std::vector to std::list
- .clear() and .empty() to = "" and == ""
- regex_compiler iterate over the input string rather than modifying it
- Transducer::determinize()

helper function and dependency changes
- StringUtils here from apertium
- XMLParseUtil functions more specific to their typical use cases
- xml_walk_util.h for cleanly iterating over children of xmlNode*
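
To illustrate the "UTF-8 streams with nulls" item: the sketch below is not the `InputFile` implementation from this PR, just a minimal, self-contained decoder showing the general idea. `utf8_to_utf16` is a hypothetical name, and `UString` is again assumed to be `std::basic_string<char16_t>`. The length comes from the `std::string` rather than from a terminator, so a NUL byte is decoded like any other character.

```cpp
#include <cstddef>
#include <string>

using UString = std::basic_string<char16_t>;  // assumes UChar == char16_t

// Decode UTF-8 bytes into UTF-16 code units. Embedded NUL bytes are kept.
// Malformed or truncated sequences become U+FFFD. This sketch does not
// reject overlong encodings or encoded surrogates.
inline UString utf8_to_utf16(const std::string& bytes) {
  const char16_t kReplacement = static_cast<char16_t>(0xFFFD);
  UString out;
  std::size_t i = 0;
  while (i < bytes.size()) {
    unsigned char b = static_cast<unsigned char>(bytes[i]);
    char32_t cp = 0;
    std::size_t extra = 0;  // number of continuation bytes expected
    if (b < 0x80)              { cp = b;        extra = 0; }
    else if ((b >> 5) == 0x06) { cp = b & 0x1F; extra = 1; }
    else if ((b >> 4) == 0x0E) { cp = b & 0x0F; extra = 2; }
    else if ((b >> 3) == 0x1E) { cp = b & 0x07; extra = 3; }
    else { out.push_back(kReplacement); ++i; continue; }

    if (i + extra >= bytes.size()) {  // sequence runs past the end
      out.push_back(kReplacement);
      break;
    }
    bool ok = true;
    for (std::size_t k = 1; k <= extra; ++k) {
      unsigned char c = static_cast<unsigned char>(bytes[i + k]);
      if ((c & 0xC0) != 0x80) { ok = false; break; }
      cp = (cp << 6) | (c & 0x3F);
    }
    if (!ok || cp > 0x10FFFF) {
      out.push_back(kReplacement);
      ++i;  // resynchronize on the next byte
      continue;
    }
    i += extra + 1;

    if (cp >= 0x10000) {  // encode as a surrogate pair
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    } else {
      out.push_back(static_cast<char16_t>(cp));
    }
  }
  return out;
}
```

The input has to be built with an explicit length, e.g. `std::string("a\0b", 3)`, since constructing from a bare `const char*` would stop at the first NUL.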