Background:
Currently, when svlint encounters a file that is not UTF-8 encoded, it reports an error and exits, even when it is working through a file list. I would like svlint to detect the encoding of such files and read them automatically. Even when a file's encoding is problematic, the main program should not terminate.
Proposed Solutions:
Replace the read_to_string call in src/main.rs with the following code:
~/code/svlint$ git diff
diff --git a/Cargo.toml b/Cargo.toml
index 2097796..08efc02 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -41,6 +41,8 @@ sv-parser = "0.13.3"
 term = "0.7"
 toml = "0.8"
 sv-filelist-parser = "0.1.3"
+chardetng = "0.1.17"
+encoding_rs = "0.8.34"

 [build-dependencies]
 regex = "1"
diff --git a/src/main.rs b/src/main.rs
index 70bda82..a13200a 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -2,8 +2,9 @@ use anyhow::{Context, Error};
 use clap::{Parser, CommandFactory};
 use clap_complete;
 use enquote;
+use chardetng::EncodingDetector;
 use std::collections::HashMap;
-use std::fs::{read_to_string, File, OpenOptions};
+use std::fs::{File, OpenOptions};
 use std::io::{Read, Write};
 use std::path::{Path, PathBuf};
 use std::{env, process};
@@ -275,7 +276,16 @@ pub fn run_opt_config(printer: &mut Printer, opt: &Opt, config: Config) -> Resul
     // by textrules to reset their internal state.
     let _ = linter.textrules_check(TextRuleEvent::StartOfFile, &path, &0);
-    let text: String = read_to_string(&path)?;
+    let mut file = File::open(&path)?;
+    let mut buffer = Vec::new();
+
+    file.read_to_end(&mut buffer)?;
+    let mut detector = EncodingDetector::new();
+    detector.feed(&buffer, true);
+    let encoding = detector.guess(None, true).decode(&buffer).0;
+
+    let text = encoding.into_owned();
     let mut beg: usize = 0;
     // Iterate over lines in the file, applying each textrule to each
This change might slow file reading slightly, but the cost is acceptable.
Provide a runtime option, such as --guess-encoding. When the option is given, use the code above; otherwise, continue using read_to_string.
If read_to_string fails because a file is not valid UTF-8, do not abort with an error. Instead, treat the failure as a rule violation (source files must be saved in UTF-8 encoding) and report the file path, since the offending file is hard to locate in a complex nested file list without it.
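To make the last option more concrete, here is a minimal sketch that uses only the Rust standard library. It reads each file as raw bytes and validates UTF-8 explicitly, so an encoding problem becomes a reported violation with a file path rather than an error that ends the run. The names Loaded, load_source, and check_files are hypothetical and not part of svlint, and the message format is only an assumption about how such a violation might be printed.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Outcome of loading one source file (hypothetical type, not part of svlint).
#[derive(Debug, PartialEq)]
enum Loaded {
    /// The file was valid UTF-8.
    Text(String),
    /// The file was readable but not valid UTF-8.
    /// `valid_up_to` is the byte offset of the first invalid byte.
    NotUtf8 { path: PathBuf, valid_up_to: usize },
}

/// Read `path` as bytes and validate UTF-8 explicitly, so an encoding
/// problem is returned as a value instead of an `Err` propagated by `?`.
fn load_source(path: &Path) -> std::io::Result<Loaded> {
    let bytes = fs::read(path)?; // genuine I/O errors still propagate
    match String::from_utf8(bytes) {
        Ok(text) => Ok(Loaded::Text(text)),
        Err(e) => Ok(Loaded::NotUtf8 {
            path: path.to_path_buf(),
            valid_up_to: e.utf8_error().valid_up_to(),
        }),
    }
}

/// Walk a list of files, counting each non-UTF-8 file as one violation
/// and continuing with the remaining files. Returns the violation count.
fn check_files(paths: &[PathBuf]) -> std::io::Result<usize> {
    let mut violations = 0;
    for path in paths {
        match load_source(path)? {
            Loaded::Text(_text) => {
                // The existing text and syntax rules would run on `_text` here.
            }
            Loaded::NotUtf8 { path, valid_up_to } => {
                // Assumed message format; always includes the file path.
                eprintln!(
                    "Fail: {}: file is not valid UTF-8 (first invalid byte at offset {})",
                    path.display(),
                    valid_up_to
                );
                violations += 1;
            }
        }
    }
    Ok(violations)
}
```

With this shape, a caller could map a non-zero violation count to a failing exit status after every file in the list has been checked, which keeps one badly encoded file from hiding results for the rest.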