Decide on file structure and naming #2
Comments
I think keeping the original files is a good idea (perhaps in the I don't think the intermediate files need to be kept though (the steps from original
I agree we shouldn't keep the new intermediate files. And I agree we should keep the original text file. Maybe we should specify all of the available source file types in metadata.yml?
I was thinking there should be a "gitenberg-status" field in the metadata, but please no list of files.
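To make the "status field, but no list of files" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `metadata` keys other than "gitenberg-status", the status value "imported", and the helper `source_file_types` are hypothetical and not part of any agreed Gitenberg format. The point is only that the available file types can be derived from the directory itself instead of being listed in the metadata.

```python
from pathlib import Path

# Hypothetical metadata: a status field, but deliberately no list of files.
metadata = {
    "title": "Dubliners",
    "gitenberg-status": "imported",  # assumed value, not defined in this thread
}


def source_file_types(repo_dir):
    """Return the sorted file extensions found directly inside repo_dir.

    The result is computed from the directory contents, so the metadata
    never has to carry (and keep in sync) a list of files.
    """
    extensions = {
        path.suffix.lstrip(".")
        for path in Path(repo_dir).iterdir()
        if path.is_file() and path.suffix and not path.name.startswith(".")
    }
    return sorted(extensions)
```

Calling `source_file_types(".")` in a book repository would report whatever formats are present at that moment, which avoids a second source of truth alongside Git.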
So, the overall structure could look something like:
PG handles file modifications by creating copies of files; should we do the same here? I know we don't need to, because Git stores the history, but it might make things more transparent and perhaps make pushing things back upstream easier. I'm not sure.
One modification of how things currently are, An html file generated by us from
I think that 1234.html and 1234.* are a mistake. They should get generic names like content.html. Reasons:
Good point about more generic filenames, but it does raise the question of dependence on PG. My understanding has been that Gitenberg is a sort of subproject of PG, so it's okay to rely on their identifiers. Works that are not yet in PG should be submitted there, be given an identifier, and then imported here. Maybe? I'm not sure! On 5 April 2015 22:12:37 GMT+08:00, eshellman wrote:
Gitenberg is more of an independent research project. Most of the texts in PG come from Distributed Proofreaders, which has elaborate process controls to ensure that PG ebooks can be produced from the HTML files they emit. If successful, Gitenberg has a stronger need to integrate with DP processes than to use PG IDs as filenames, since those IDs are assigned only after the finished files are accepted into PG. Here are the DP initial processes, from http://www.pgdp.net/wiki/Guiguts_PP_Process_Checklist
So the key thing here is that a bookname is selected and used as the directory name. If you wanted to continue the practice of also using [bookname] to name text files, you'd need to come up with a reserved list of all the filenames that bookname couldn't be. It's easier to use generic names, I think.
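The trade-off above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `RESERVED_STEMS` set and the `text_file_name` helper are hypothetical, and "content.txt" simply mirrors the "content.html" style of generic name suggested earlier in the thread.

```python
# Hypothetical set of generic stems a repository might reserve for itself.
RESERVED_STEMS = {"content", "metadata", "readme", "license", "cover"}


def text_file_name(bookname, use_generic=True):
    """Return the name of the main text file inside a book directory.

    With generic naming the bookname is only the directory name, so no
    collision check is needed. With bookname-based naming, every reserved
    stem has to be rejected up front.
    """
    if use_generic:
        return "content.txt"
    if bookname.lower() in RESERVED_STEMS:
        raise ValueError(
            f"bookname {bookname!r} collides with a reserved file name"
        )
    return f"{bookname}.txt"
```

With generic names the reserved list never has to be maintained at all, which is the argument being made; the second branch exists only to show what bookname-based naming would require.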
See @rdhyee's questions on Dubliners PR #1:
GITenberg/Dubliners_2814#1