
Don't add styles in the output document when converting with the option --reference-doc #10088

Open
me-kell opened this issue Aug 14, 2024 · 6 comments


me-kell commented Aug 14, 2024

Don't add styles to the output document beyond those already present in the input document when converting with the option --reference-doc (and the extension +styles).

Currently, when converting a DOCX document with the +styles extension and using the document itself as the --reference-doc:

pandoc input.docx -f docx+styles -t docx -o output.docx --reference-doc input.docx

the following styles are added to the output document (they were not in the input document):

AlertTok, AnnotationTok, AttributeTok, BaseNTok, BuiltInTok, CharTok, CommentTok, CommentVarTok, ConstantTok, ControlFlowTok, DataTypeTok, DecValTok, DocumentationTok, ErrorTok, ExtensionTok, FloatTok, FunctionTok, ImportTok, InformationTok, KeywordTok, NormalTok, OperatorTok, OtherTok, PreprocessorTok, RegionMarkerTok, SourceCode, SpecialCharTok, SpecialStringTok, StringTok, VariableTok, VerbatimStringTok, WarningTok

The input.docx is an empty document created with a "clean" Normal.dotm.

Is there a way to disable the creation of those styles when the --reference-doc option is given?
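
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the added styles could be listed. It assumes both files are ordinary DOCX packages with a word/styles.xml part. The function names `style_ids` and `added_styles` are made up for this example and are not part of pandoc.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/styles.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def style_ids(docx_path):
    """Return the set of w:styleId values declared in word/styles.xml."""
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/styles.xml"))
    ids = set()
    for style in root.iter(f"{{{W_NS}}}style"):
        style_id = style.get(f"{{{W_NS}}}styleId")
        if style_id is not None:
            ids.add(style_id)
    return ids


def added_styles(input_path, output_path):
    """Style IDs present in the output document but not in the input."""
    return sorted(style_ids(output_path) - style_ids(input_path))
```

Calling `added_styles("input.docx", "output.docx")` after running the pandoc command above should give a list like the one in this comment.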

Author

me-kell commented Aug 14, 2024

Pandoc is also adding some settings to word/settings.xml that do not exist in the input document:

	<w:displayHorizontalDrawingGridEvery w:val="0"/>
	<w:displayVerticalDrawingGridEvery w:val="0"/>
	<w:doNotTrackMoves/>
	<w:drawingGridHorizontalSpacing w:val="360"/>
	<w:drawingGridVerticalSpacing w:val="360"/>
	<w:embedSystemFonts/>
	<w:footnotePr>
		<w:footnote w:id="0"/>
		<w:footnote w:id="-1"/>
	</w:footnotePr>
	<w:hyphenationZone w:val="425"/>
	<w:listSeparator w:val=";"/>
	<w:proofState w:grammar="clean" w:spelling="clean"/>
	<w:rsids/>
	<w:savePreviewPicture/>
	<w:stylePaneFormatFilter w:val="0004"/>
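
A similar check can be sketched for the settings. The snippet below compares only the names of the top-level elements in word/settings.xml, not their attributes or children, so a setting whose value merely changed will not show up. The helper names `settings_elements` and `added_settings` are made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET


def settings_elements(docx_path):
    """Return the local names of the top-level children of word/settings.xml."""
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/settings.xml"))
    # ElementTree renders tags as "{namespace}local"; keep only the local part.
    return {child.tag.rsplit("}", 1)[-1] for child in root}


def added_settings(input_path, output_path):
    """Setting element names present in the output but not in the input."""
    return sorted(settings_elements(output_path) - settings_elements(input_path))
```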

Author

me-kell commented Aug 14, 2024

AFAICS Pandoc uses the files in data/doc.

When the option --reference-doc my_reference_doc.docx is passed to pandoc, why not use the files in my_reference_doc.docx instead of those in data/doc?

Owner

jgm commented Aug 15, 2024

We do carry over some things from the reference.docx. But if we just used everything, we'd get corrupt files (tried that; see e.g. #9522). So we use a conservative approach to guarantee that the docx we produce is not corrupt. It may be that we can be less conservative about some things. See also #7240.

Owner

jgm commented Aug 15, 2024

Here is the code relevant to generating settings.xml:

https://github.com/jgm/pandoc/blob/main/src/Text/Pandoc/Writers/Docx.hs#L474-L577

Owner

jgm commented Aug 15, 2024

The styles AlertTok, AnnotationTok, AttributeTok, BaseNTok, BuiltInTok, CharTok, CommentTok, CommentVarTok, ConstantTok, ControlFlowTok, DataTypeTok, DecValTok, DocumentationTok, ErrorTok, ExtensionTok, FloatTok, FunctionTok, ImportTok, InformationTok, KeywordTok, NormalTok, OperatorTok, OtherTok, PreprocessorTok, RegionMarkerTok, SourceCode, SpecialCharTok, SpecialStringTok, StringTok, VariableTok, VerbatimStringTok, WarningTok are for syntax highlighting. They are generated and depend on the highlighting style you specify. If you specify --no-highlight, they should not appear.
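
To check whether any highlighting styles remain after converting with --no-highlight, a list of style IDs could be split as in the sketch below. The naming rule (IDs ending in "Tok", plus "SourceCode") is an assumption inferred from the list in this thread, not something pandoc documents, and the function names are made up for this example.

```python
def is_highlighting_style(style_id):
    # Assumption: token styles end in "Tok" and "SourceCode" is the
    # code block style, as in the list quoted in this thread.
    return style_id.endswith("Tok") or style_id == "SourceCode"


def split_styles(style_ids):
    """Split style IDs into (highlighting, other), each sorted."""
    highlighting = sorted(s for s in style_ids if is_highlighting_style(s))
    other = sorted(s for s in style_ids if not is_highlighting_style(s))
    return highlighting, other
```

If the first list is empty for the styles found in output.docx, no highlighting styles matching this rule were added.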

Owner

jgm commented Aug 15, 2024

PS. Also please state your pandoc version.
