-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added BOM sniffing check to file load #565
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,11 @@ | ||
use std::env; | ||
use std::error::Error; | ||
use std::fs; | ||
use std::io; | ||
use std::io::Read; | ||
use std::path::PathBuf; | ||
|
||
use encoding_rs_io::DecodeReaderBytesBuilder; | ||
|
||
use crate::file::{ | ||
format::ALL_EXTENSIONS, source::FileSourceResult, FileSource, FileStoredFormat, Format, | ||
}; | ||
|
@@ -114,8 +116,15 @@ where | |
.and_then(|base| pathdiff::diff_paths(&filename, base)) | ||
.unwrap_or_else(|| filename.clone()); | ||
|
||
// Read contents from file | ||
let text = fs::read_to_string(filename)?; | ||
let mut text = String::new(); | ||
|
||
DecodeReaderBytesBuilder::new() | ||
// On Windows, we need to check for added Byte Order Mark (BOM). To | ||
// do this, we don't specify an encoding, and enable BOM sniffing. | ||
.encoding(None) | ||
.bom_sniffing(true) | ||
.build(std::fs::File::open(filename)?) | ||
.read_to_string(&mut text)?; | ||
Comment on lines
+121
to
+127
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if I'm reading the docs correctly, this does more than BOM sniffing / stripping but also handles other encodings. That should be a separate question, discussed in an issue first. This also then feels like a heavy weight dependency just for handling BOM. |
||
|
||
Ok(FileSourceResult { | ||
uri: Some(uri.to_string_lossy().into_owned()), | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI this should have tests. We ask that the PR be split into two commits
In this case, I would want to have the first commit have a test that is parsing content with a BOM and to show the failure that happens. The follow up commit should then strip the BOM and update the test to not error anymore.