Commit d7e4839

Merge pull request #157 from hathitrust/issue-156-checksum-bom
open checksum file in utf8 mode and strip BOM
2 parents 4e19de2 + 333c9e8 commit d7e4839

1 file changed, +5 -1 lines changed

lib/HTFeed/Volume.pm

Lines changed: 5 additions & 1 deletion
@@ -286,9 +286,13 @@ sub get_checksum_md5 {
         $self->set_error("MissingFile", file => $checksum_file);
     }
 
-    open(FILE, $checksum_path) or die("Can't open $checksum_path: $!");
+    # we don't expect a BOM in checksum files, but get it sometimes anyways
+    # so it's best to open files as if they were UTF8 and strip bom if seen
+    use open ':std', ':encoding(UTF-8)';
+    open(FILE, $checksum_path) or die("Can't open $checksum_path: $!");
     foreach my $line (<FILE>) {
         $line =~ s/\r\n$/\n/;
+        $line =~ s/^\N{BOM}//;
         chomp($line);
         # ignore malformed lines
         next unless $line =~ /^([a-fA-F0-9]{32})(\s+\*?)(\S.*)/;
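
For readers who want to try the BOM handling outside HTFeed, here is a minimal standalone sketch of the same idea. It is not the committed code: the helper name read_checksums, the temporary file, and the sample checksum line are assumptions made for illustration. It also puts the encoding layer on the single filehandle rather than using the "open" pragma, because the pragma is lexically scoped and its ':std' argument additionally changes the layers on STDIN, STDOUT and STDERR.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of HTFeed): parse an md5sum-style checksum
# file into a hash of filename => checksum, tolerating a leading UTF-8 BOM
# and CRLF line endings.
sub read_checksums {
    my ($path) = @_;
    my %checksums;

    # Apply the UTF-8 layer to this handle only, instead of changing the
    # default layers for every open() in scope.
    open(my $fh, '<:encoding(UTF-8)', $path)
        or die("Can't open $path: $!");

    while (my $line = <$fh>) {
        $line =~ s/^\x{FEFF}// if $. == 1;   # a BOM can only start the file
        $line =~ s/\r?\n\z//;                # handle LF and CRLF endings
        # ignore malformed lines (same pattern as in the diff)
        next unless $line =~ /^([a-fA-F0-9]{32})(\s+\*?)(\S.*)/;
        $checksums{$3} = $1;
    }
    close($fh);
    return \%checksums;
}

# Usage: write a small sample file that begins with a BOM, then parse it.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "\xEF\xBB\xBF" . "d41d8cd98f00b204e9800998ecf8427e  00000001.txt\r\n";
close($tmp_fh);

my $checksums = read_checksums($tmp_path);
print "$_ => $checksums->{$_}\n" for sort keys %$checksums;
```

Without the BOM substitution, the first line of the sample file would start with U+FEFF instead of a hex digit, fail the /^([a-fA-F0-9]{32})/ match, and be skipped as malformed.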
