-
Notifications
You must be signed in to change notification settings - Fork 607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handling HTML Entities #450
Comments
Dug around some more and realized I could fix this if able to pass var xml2js = require('xml2js');
var builder = new xml2js.Builder({
headless: true,
stringify: {
elEscape: function (str) {
var ampregex;
ampregex = this.noDoubleEncoding ? /(?!&\S+;)&/g : /&/g;
return str
.replace(ampregex, '&')
.replace(/</g, '<')
.replace(/>/g, '>')
.replace(/"/g, '"')
}
}
});
var xml = "<root>Hello "xml2js"!\r\n</root>";
xml2js.parseString(xml, function (err, result) {
console.log(builder.buildObject(result));
});
// logs: "<root>Hello "xml2js"!\r\n</root>" |
I don't think it makes sense for xml2js to handle these differences, because in a way, there are no differences. Any reasonable XML parser will treat the following inputs exactly the same: <root>"xml2js"!</root>
<root>"xml2js"!</root>
<root>"xml2js"!</root>
<root>"xml2js"!</root>
<root>"xml2js"!</root> Of course, these look like very different strings, but interpreted according to the XML specification, they are equivalent - they simply use different encodings for certain characters. It's the same for whitespace: CRLF can be represented as If you run into problems because of such syntactic differences, you should change your approach to handling XML. Saying that {foo: "bar"}
{foo: 'bar'} |
On the other hand, if xmlbuilder supports it and it's simple to pass through the configuration to xmlbuilder, why not? |
@jcsahnwaldt While technically these are equivalent but for encodings, the fact remains that other systems may treat them differently. The main use case for XML is data exchange, and unfortunately (despite your assessment of "any reasonable XML parser...") -- we don't always have control over how other systems generate and consume their XML. I expect that's why xmlbuilder gives options on how to encode and serialize. Since it's easy to pass through those options, I'd recommend making it easy to do. |
Basically the same thing: #165 |
@jcsahnwaldt Since I've prepped PR #451 to address this, would you be open to reviewing that PR? Sounds like it would close this and the other issue I duplicated... |
I agree with @jcsahnwaldt here. The fact which dependencies xml2js uses should not be what defines xml2js's public interface so passing in options to xmlbuilder-js breaks the layering (in a Law of Demeter kind of way). Especially since as it was said, the XML generated is valid and in correct parsers equivalent. |
I'm using xml2js to transform an XML file that I don't have much control over. I don't want to make any changes to the XML besides the specific changes I'm trying to make.
I'm running into an issue with HTML entities where
build(parse(xml))
is not a reversible process. Easiest to show with an example:As you can see, there are two issues here:
"
in the input is transformed to\"
\r
in the input is transformed to#xD;
I've been looking through the api options and can't figure out a way to hook in and change this behavior. Any ideas, or is this a bug?
The text was updated successfully, but these errors were encountered: