What is tools/ for? #140
The file in unicodeTools.py seems to be from here: So this should be partially attributed to the Unicode Consortium. According to this file it seems to be DFSG-free as well: |
It is used to regenerate unicodeTools.py when a new Unicode version comes out; it does not need to be packaged. |
I guess you could do that indeed. |
I rewrote the loading part of the file: unicodeTools.py Please check whether it works as intended. Note that this is for using the files externally. Feel free to backport it if you want. |
The problem is that in Debian we need to strip duplicated files and prefer the ones already provided in the repository. This does not need to happen upstream (which is why I didn't file a pull request). However, if possible, could you separate the embedded Unicode texts into text files of their own? That way we can replace those files with symlinks to the copies provided by another package. |
How do you guarantee that the file provided by another package is the expected version of Unicode? |
Typically we use a package dependency to guarantee that.
However, if the upstream code expects a specific Unicode version, we have to do
extra work to upload another version of unicode-data.
|
The "UnicodeData.txt" file in This is the diff between the I don't know why it had to be tweaked, maybe @typesupply knows.
Let me know if I'm understanding this issue correctly. There's a UnicodeData.txt file in tools; is the problem that the file is there unused, or that it doesn't come with an appropriate license file? What do you mean by "DFSG free"? I'm not familiar with these things, so any help is welcome.
The unicodeTools.py module embeds the content of the "ftp://www.unicode.org/Public/9.0.0/ucd/Scripts.txt" file from the Unicode Consortium. You would prefer it to be a separate data file, because there's already one in the Debian repository as a separate package and you'd prefer to avoid duplicating it, correct?
By the way, this reminded me that there's a pending PR which updates it to Unicode 10, which I forgot to review. |
@medicalwei maybe you could send a pull request? |
Because some open characters have closed partner characters that aren't defined in UnicodeData.txt. For example,
In UnicodeData.txt, I'm open to moving the exceptions to the generator to make this more clear. |
The original issue here is that the Unicode data has its own license, which wasn't clearly marked here. DFSG is the Debian Free Software Guidelines. @medicalwei's comment was that code or content under the Unicode license is acceptable for inclusion in Debian. Debian has a policy that the same piece of code should not be duplicated in Debian if possible. Now, I believe the Unicode data isn't "code", but I thought it was worth asking whether the duplication is necessary here. Debian updated its version of unicode-data to 10.0.0 very quickly after it was released in June. |
Would it be enough to include the text of http://www.unicode.org/copyright.html#License in a file called "LICENSE" next to the Unicode data files?
I don't know. That data file is only used once a year, and I wouldn't like to complicate the setup too much. Would it be ok if we placed the Scripts.txt and Blocks.txt outside the unicodeTools.py module, and put them as separate text files like the UnicodeData.txt, and then from unicodeTools.py we would have a global variable e.g. |
This is fine with me. As long as the module continues to work as is, I have no opinion on where the source data is located. |
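A minimal sketch of the separate-data-file idea proposed above, assuming the data is kept in the standard UCD layout used by Scripts.txt and Blocks.txt (`code point or range ; value # comment`). The names `parse_ucd_ranges` and `load_ucd_ranges` are hypothetical and are not part of defcon; the sample text is only an illustration of the file format.

```python
from pathlib import Path


def parse_ucd_ranges(text):
    """Parse Scripts.txt / Blocks.txt style text into (start, end, value) tuples."""
    ranges = []
    for line in text.splitlines():
        # Drop trailing comments and skip blank or comment-only lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        code_range, value = fields[0], fields[1]
        if ".." in code_range:
            start, end = code_range.split("..")
        else:
            start = end = code_range
        ranges.append((int(start, 16), int(end, 16), value))
    return ranges


def load_ucd_ranges(path):
    """Hypothetical loader: read a data file that sits next to the module.

    A packager could replace that file with a symlink to a system copy.
    """
    return parse_ucd_ranges(Path(path).read_text(encoding="utf-8"))


# Illustrative sample in the Scripts.txt layout.
SAMPLE = """\
# Scripts-style sample
0000..001F    ; Common # Cc  [32] <control>..<control>
0020          ; Common # Zs       SPACE
0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
"""

script_ranges = parse_ucd_ranges(SAMPLE)
```

Because the parser only depends on the file layout, the same function would handle a Blocks.txt style line such as `0000..007F; Basic Latin`.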
I think upstream can simply move them to dedicated text files, and packagers can replace those files with symbolic links. No need for a global variable. (@jbicha, correct me if the policy doesn't allow this.) But as you stated, the open-close data differs from what the generator script produces. I propose doing this with a patch ( |
This is to address part of the problems in robotools#140, namely that some open parentheses do not match the closed ones. This leaves only the special exceptions and the ornate parentheses (which are reversed in the data) to be appended.
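To illustrate the idea of a generated open/close table with hand-maintained exceptions appended, here is a rough sketch. It is not the generator in tools/: the pairing heuristic (take the first `Pe` character within two code points after each `Ps` character) and the contents of `HYPOTHETICAL_EXCEPTIONS` are assumptions made for the example.

```python
import unicodedata


def derive_open_close(max_gap=2):
    """Pair each opening (Ps) character with a nearby closing (Pe) character.

    Assumed heuristic for illustration only: the partner is the first Pe
    character found within max_gap code points after the opener.
    """
    pairs = {}
    for cp in range(0x110000):
        if unicodedata.category(chr(cp)) != "Ps":
            continue
        for offset in range(1, max_gap + 1):
            if cp + offset >= 0x110000:
                break
            candidate = chr(cp + offset)
            if unicodedata.category(candidate) == "Pe":
                pairs[chr(cp)] = candidate
                break
    return pairs


# Hypothetical exception list: pairs the category data cannot produce.
# The double quotation marks are Pi/Pf rather than Ps/Pe, so the
# heuristic above never sees them.
HYPOTHETICAL_EXCEPTIONS = {
    "\u201C": "\u201D",
}


def build_open_to_close(exceptions):
    """Overlay the manual exceptions on top of the derived pairs."""
    pairs = derive_open_close()
    pairs.update(exceptions)
    return pairs


open_to_close = build_open_to_close(HYPOTHETICAL_EXCEPTIONS)
```

Keeping the exceptions in one named table, applied after the derived data, makes it clear which pairs come from the Unicode data and which were added by hand.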
With the new |
As long as we have backwards compatibility with the functions in defcon I'd be very happy to ditch the UCD data parsing. |
About the issue of the built-in When unicodedata2 is importable, |
…data2 backport so that defcon.unicodeTools are up-to-date and use Unicode Character Database 11.0
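The optional-backport approach mentioned above is commonly written as an import fallback. The sketch below shows that pattern and a way to report which Unicode Character Database version is actually in use, which also bears on the earlier question of how to know the expected Unicode version is present. The helper `ucd_version` is a hypothetical name, not a defcon function.

```python
try:
    # Optional backport that tracks newer Unicode Character Database releases.
    import unicodedata2 as unicodedata
except ImportError:
    # Fall back to the copy bundled with the Python interpreter.
    import unicodedata


def ucd_version():
    """Return the UCD version string of whichever module was imported."""
    return unicodedata.unidata_version


def ucd_at_least(major):
    """Check that the available data is at least the given major version."""
    return int(ucd_version().split(".")[0]) >= major


category_of_a = unicodedata.category("A")
```

With this pattern, the rest of the module calls `unicodedata` the same way regardless of which implementation was found.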
Based on the changes to where unicodedata is being pulled from, this needs to be looked at again to retain the exceptions, but perhaps remove the /tools completely? I'm not 100% sure what |
What is https://github.com/typesupply/defcon/tree/master/tools ?
It doesn't appear to be used in the build at all.
Is it used to generate https://github.com/typesupply/defcon/blob/master/Lib/defcon/tools/unicodeTools.py ? If so, shouldn't that be part of the build instead of including generated files in source?
I'm asking because I'm working with @medicalwei on packaging this for Debian and the Unicode data is technically under a different license.