Broken link to a ProGit book page presented via search #1642
Comments
Thanks for reporting the issue, @sivaraam. Would you be interested in helping to solve it?
Thanks for asking, @pedrorijo91!
I would like to work on this, please, but I need a bit of guidance. I reproduced the issue, but based on the URL there should be a https://git-scm.com/book/en folder in the repo, and I am unable to find it. Can someone please give me a hint where to find the git-scm.com/book/en/ document folder?
Sure, @C-Lion! So, the book routing is done through https://github.com/git/git-scm.com/blob/main/app/controllers/books_controller.rb#L6. If we look at the setup at https://github.com/git/git-scm.com/blob/main/README.md#setup, we'll see how the book content is imported. Now we need to dig into https://github.com/git/git-scm.com/blob/main/lib/tasks/book2.rake#L32 to find where we add the chapter title to the search index, and make sure we URL-encode it.
I suspect the problem is actually in the search code, not the book importer. In the model for a book section, we index the content and provide an id (git-scm.com/app/models/section.rb, lines 80 to 89 in 688a6c9).
And then when we do a search, the section model inherits from Searchable, which formats the results (lines 47 to 62 in 688a6c9).
So one of those probably needs to be URL-encoding things (you can see in the section model that we URL-encode the slug in other contexts when generating links). I'm not sure which place makes more sense. It doesn't look like we use that id for anything else, but maybe there are rules we need to follow for ElasticSearch (otherwise why would it do that weird …).
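To make the encoding step concrete, here is a minimal sketch of what percent-encoding does to the slug from the broken link. The `encode_slug` helper is hypothetical and is not taken from the git-scm.com codebase; it assumes that Ruby's standard `ERB::Util.url_encode` is an acceptable encoder, which the thread does not confirm.

```ruby
require "erb"

# Hypothetical helper for illustration only; not the site's actual code.
def encode_slug(slug)
  # ERB::Util.url_encode percent-encodes every byte except ASCII letters,
  # digits, "-", "_", "." and "~", so "/" becomes %2F and ":" becomes %3A.
  ERB::Util.url_encode(slug)
end

slug = "Appendix-A:-Git-in-Other-Environments-Git-in-IntelliJ-/-PyCharm-/-WebStorm-/-PhpStorm-/-RubyMine"
puts encode_slug(slug)
```

With the slashes encoded, the slug stays a single path segment instead of being split into several.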
I think I understand the issue, but it is not clear from your information which of the above needs to be corrected or what exactly you want changed.
Right. The problem is that the URL is not found in this repository at all. It is in content which is imported from another repository (the book code). Hence the complexity. A cron job pulls updated book content into the SQL database nightly, and then we run a search index on that content, putting the result into the ElasticSearch database. Incoming search requests then query the ElasticSearch database. So if we are going to add a layer of quoting to the URL, it needs to happen either when we index the content and stick it into ElasticSearch, or when we pull it out and return it to the browser.
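The two options can be sketched roughly as follows. Everything here is hypothetical: `SketchIndex` is a plain Hash standing in for ElasticSearch, and the `encode_at` switch exists only to contrast encoding at index time with encoding at query time. It is not how the site is implemented.

```ruby
require "erb"

# Hypothetical stand-in for the search index; a Hash replaces ElasticSearch.
class SketchIndex
  def initialize(encode_at:)
    @encode_at = encode_at # :index or :query
    @slugs = {}
  end

  # Indexing step: optionally encode the slug before storing it.
  def add(title, slug)
    @slugs[title] = @encode_at == :index ? ERB::Util.url_encode(slug) : slug
  end

  # Query step: optionally encode the slug while formatting the result.
  def slug_for(title)
    stored = @slugs.fetch(title)
    @encode_at == :query ? ERB::Util.url_encode(stored) : stored
  end
end

at_index = SketchIndex.new(encode_at: :index)
at_query = SketchIndex.new(encode_at: :query)
[at_index, at_query].each do |idx|
  idx.add("Git in IntelliJ / PyCharm", "Git-in-IntelliJ-/-PyCharm")
end

puts at_index.slug_for("Git in IntelliJ / PyCharm")
puts at_query.slug_for("Git in IntelliJ / PyCharm")
```

Either placement yields the same slug for the browser. The one thing to avoid is doing both, since encoding an already encoded slug turns `%2F` into `%252F`.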
After #1804 was merged, searching for …
Searching for a keyword returned a result that turned out to be a broken link.
URL for broken page
https://git-scm.com/book/en/Appendix-A:-Git-in-Other-Environments-Git-in-IntelliJ-/-PyCharm-/-WebStorm-/-PhpStorm-/-RubyMine
Problem
When I searched for "intelliJ" in the search bar, I got a result for the "Appendix A: Git in Other Environments - Git in IntelliJ / PyCharm / WebStorm / PhpStorm / RubyMine" section of the ProGit book. When I clicked on it, though, it resulted in a 404 for the URL above.
I guess the culprit is that the URL should be encoded but it isn't. When I try to access the section by going to https://git-scm.com/book/en/v2 and clicking on the link to the same section, it works fine.
Operating system and browser
Firefox on Windows 10
Steps to reproduce