Fix issue where newlines are stripped in <pre> blocks #165

lkwdwrd · 2015-12-06T20:16:47Z

Right now the way the parser is importing <pre> blocks, newlines are all stripped. This is perfect when inserting normal paragraph content because it will get run through wp_autop() and stuff will break. However, inside <pre> blocks, the intent is that the whitespace is preserved.

This PR breaks out the processing of newlines into it's own helper function which is sensitive to content inside of <pre> blocks.

atimmer · 2015-12-06T22:00:04Z

This is almost the same as what @coffee2code has been using when parsing for .org. At the contributor day of WordCamp US @lkwdwrd, @coffee2code and me had a discussions about this. We all agree that this solution is kinda hacky and considered if there might be a better alternative.

One alternative solution to this is removing the line break stripping and disabling wpautop in the devhub theme. This would also be better when using the parser output in plain HTML. This has other challenges, but it feels much better.

@DrewAPicture What do you think about the alternative solution?

lkwdwrd · 2015-12-08T22:07:37Z

I've been doing some thinking about a line of conversation we were having involving somehow using the reference to be used to push patches out to fix/update docs. Spit-balling a bit here. As far as data structuring goes, it may make sense then to make use of the post_content_filtered field in the posts table and store the raw docblock content there (pre-markdown processed) and switch to read-only post content.

If we do that, we can display parsed content with autop disabled since Parsedown adds <p> tags anyway. If it does need editing, we can pull the raw text from the post_content_filtered and put it in a normal textarea for the edit and run it through Parsedown to create the new revision. Requires writers to write in markdown, but I'm not sure that's a bad thing. Doing it this way means we can likely create a patch file too that could be applied since we know what file it comes from and what lines the code is on.

thoughts?

atimmer · 2015-12-11T15:52:34Z

This would be my ideal solution if we can make it work on .org.

atimmer · 2015-12-29T21:02:16Z

See also #114 for more history.

lkwdwrd · 2015-12-29T21:22:45Z

I've been exploring this in a branch and it's working pretty well. As I've explored the approach though, more questions keep coming up. Before I start making decisions and just coding them, I figure I'll reach out and ask.

Should we just capture the raw long description text, or the long description text and the raw docblcok?
My current branch stores both. Easier to use the raw description, but if we do want to eventually create patches with this, we'll want the raw docblock too. I have the raw description in post_content_filtered and the raw docblock is in meta.

Should the parser remove wpautop() from the_content, or leave that to the implementation?
For display I removed autop from the_content filter and attached a function that checks for parser post types and runs wpautop() on anything not in the list. This works great - but should it reside in the parser itself? I did this implementation in the devhub theme, it's not currently in my branch.

For .org the approach works well because the content is already hidden with a filter in explanations and content is editable through the metabox provided in the parsed-content feature after attaching a trac ticket. Some small updates to the theme would likely be needed to make the parsed-content feature work well (it's using the WYSIWYG for content editing currently), and expecting wpautop(). But some of the above questions will directly affect the approach. This will set up the feature for raw markdown support via parsedown, as well as the eventual patch-creation feature we discussed at WCUS.

Before I go further down this path can I get some thoughts from @atimmer @coffee2code and @DrewAPicture, or anyone else who wants to weight in?

lkwdwrd · 2016-01-07T05:40:36Z

Some updates here.

I decided it's probably best to strip autop in the parser since out of the box I feel like 'the_content' should just work. It's not really a tool for display so much as compatibility support is how I'm thinking about it. I've updated the branch to match this decision, but haven't updated the tests, hence no PR yet. I'll close this one and open a new one as soon as it's ready, but I'll leave this open for now as a placeholder.

This will also require an update to DevHub since it's dealing with some of this content. It also runs a new filter on the_content since I know the parser isn't always activated there. You can see an example of what that patch might look like at http://d.pr/f/156GM. I'll need to post this on Trac as well when the tests are passing and I'm ready for the PR.

lkwdwrd · 2016-01-08T04:53:55Z

This is now replaced by #173

This is hacky. It predates a similar approach proposed in #165 and is not the more complete solution proposed in #173, but it'll at least sync the code with that used in production. Fixes #114, see #165, see #173.

Fix issue where newlines are stripped in <pre> blocks

a62a73d

atimmer added the bug label Dec 6, 2015

lkwdwrd closed this Jan 8, 2016

lkwdwrd mentioned this pull request Jan 8, 2016

Stop stripping new lines and instead make parsed post types post_content private #173

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix issue where newlines are stripped in <pre> blocks #165

Fix issue where newlines are stripped in <pre> blocks #165

lkwdwrd commented Dec 6, 2015

atimmer commented Dec 6, 2015

lkwdwrd commented Dec 8, 2015

atimmer commented Dec 11, 2015

atimmer commented Dec 29, 2015

lkwdwrd commented Dec 29, 2015

lkwdwrd commented Jan 7, 2016

lkwdwrd commented Jan 8, 2016

Fix issue where newlines are stripped in <pre> blocks #165

Fix issue where newlines are stripped in <pre> blocks #165

Conversation

lkwdwrd commented Dec 6, 2015

atimmer commented Dec 6, 2015

lkwdwrd commented Dec 8, 2015

atimmer commented Dec 11, 2015

atimmer commented Dec 29, 2015

lkwdwrd commented Dec 29, 2015

lkwdwrd commented Jan 7, 2016

lkwdwrd commented Jan 8, 2016