Given a list of identities with the following properties:
byr (Birth Year)
iyr (Issue Year)
eyr (Expiration Year)
hgt (Height)
hcl (Hair Color)
ecl (Eye Color)
pid (Passport ID)
cid (Country ID)
Validate them, where cid is optional.
Example:
ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929

hcl:#ae17e1 iyr:2013
eyr:2024
ecl:brn pid:760753108 byr:1931
hgt:179cm

hcl:#cfa07d eyr:2025 pid:166559648
iyr:2011 ecl:brn hgt:59in
This at first sight seems like a perfect opportunity to get another parser out! What's interesting to note is that the values come in any order. I see two approaches:
- Take lines, then words, sort them, then take all the values and validate them individually for presence only (i.e. remove cid if there is one, then check the length); see the sketch just below this list. However, there could be a spanner in the works when they start including fields other than the 8 declared.
- Write parsers, then combine them into a single one that populates a data field. If we do that over the list, we should get a list of valid passports.
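A minimal sketch of the first approach, assuming each passport is already one string (so this is an illustration of the idea, not actual solution code):

```haskell
-- Presence-only check: collect the keys of one passport block, drop cid, and
-- see whether exactly the seven required fields remain. This breaks as soon
-- as unknown keys show up, which is the spanner in the works mentioned above.
keysOf :: String -> [String]
keysOf = map (takeWhile (/= ':')) . words

hasAllRequired :: String -> Bool
hasAllRequired block = length (filter (/= "cid") (keysOf block)) == 7
```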
The input is quite malformed, so simply using lines won't work. I will have to break up the string at the empty lines and remove the eol characters myself...
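Something like this, as a small sketch (assuming splitOn from the split package, which is not necessarily what I used):

```haskell
import Data.List.Split (splitOn)  -- assumption: the `split` package

-- Break the raw input into passport blocks on empty lines, then collapse the
-- remaining end-of-line characters inside each block into single spaces.
passportBlocks :: String -> [String]
passportBlocks = map (unwords . words) . splitOn "\n\n"
```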
So the first part is finished. Instead of breaking the input up at the empty lines, I decided to parse them. I very quickly got to the point where I had 1875 'correct' properties (i.e. hcl, iyr, ...), but how they were distributed amongst passports was a mystery. It took me way too long to figure out that my individual parsers were consuming spaces at the end, and that this made my other parser (made to parse a single passport) too eager, basically parsing all the passports at once...
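To make the pitfall concrete, here is a sketch assuming Parsec over String (illustrative names, not my actual code): the trick is that a field may only consume trailing spaces, never newlines, so the blank line between passports stays visible to the passport-level parser.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One "key:value" field, e.g. "byr:1937", followed by *spaces* only.
-- Using `spaces` here instead would also swallow newlines, which is exactly
-- the "too eager" behaviour described above.
field :: Parser (String, String)
field = do
  key <- count 3 letter
  _   <- char ':'
  val <- many1 (noneOf " \n")
  _   <- many (char ' ')
  return (key, val)

-- Inside a passport, fields may also be separated by a *single* newline;
-- a blank line (two newlines in a row) ends the passport.
fieldSep :: Parser ()
fieldSep = optional (try (newline <* notFollowedBy newline))

passport :: Parser [(String, String)]
passport = many1 (field <* fieldSep)

-- Passports themselves are separated by blank lines.
passports :: Parser [[(String, String)]]
passports = passport `sepEndBy1` many1 newline
```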
It also took me quite a while to realise that manyTill actually doesn't return what the many parsers return, but what the till parser returns. So I was getting a bunch of 0's in my output. Once I figured that out and used the parser state to keep track of things instead of the parser return values, I got it to work.
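The state trick looks roughly like this, sketched with Parsec's user state (your parser library may spell it differently, and the real thing would push whole passports rather than single fields):

```haskell
import Text.Parsec

type Field = (String, String)

-- A field parser, kept minimal for the illustration.
field :: Parsec String [Field] Field
field = do
  key <- count 3 letter
  _   <- char ':'
  val <- many1 (noneOf " \n")
  return (key, val)

-- Parse one field and push it onto the user state.
collect :: Parsec String [Field] ()
collect = field >>= \f -> modifyState (f :)

-- At the end, read the accumulated results back out of the state instead of
-- relying on what the surrounding combinator returns.
allFields :: Parsec String [Field] [Field]
allFields = do
  _ <- collect `sepEndBy` many1 space
  eof
  reverse <$> getState

parseAll :: String -> Either ParseError [Field]
parseAll = runParser allFields [] "input"
```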
I guess I was lucky going with the parsers in the first place: it made it relatively easy to add the validation for each item. I'm pretty sure there are better ways to do this validation. Currently I've built a higher-order function for the more generic variants; the ones that need to parse a bit more specifically, I swapped out.
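Sketched, the higher-order idea looks something like this; the year bounds follow the puzzle's rules for those fields, but the code itself is an illustration rather than my actual validators:

```haskell
import Text.Read (readMaybe)

-- Build a validator from a lower and an upper bound; the same higher-order
-- function is reused for byr, iyr and eyr.
yearBetween :: Int -> Int -> String -> Bool
yearBetween lo hi s = length s == 4 && maybe False inRange (readMaybe s)
  where inRange y = y >= lo && y <= hi

validByr, validIyr, validEyr :: String -> Bool
validByr = yearBetween 1920 2002
validIyr = yearBetween 2010 2020
validEyr = yearBetween 2020 2030

-- Fields like hgt or hcl need a bit more specific parsing, so they get their
-- own validators instead of this generic one.
```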
I'm getting the hang of parsers! That's super nice. There were some lessons learned. Simple things, relatively speaking, but things that are quite tricky if you don't know them. Like: parsers need to keep peeling away data, and if one of the parsers fails, it fails the whole stream. Right now I adopted a sort of Parse / Validate approach for the full file. In a more real-world example you would probably not get this malformed POS data file, but get it at least in a form where one line resembles one passport. Then you would be able to go through the list and parse the passports one by one, instead of the way I'm doing it now, where I'm trying to parse the whole file of passports at once.
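For example, splitting first and then running the passport parser per block would look roughly like this (a sketch reusing the field idea from earlier, not how my current solution works):

```haskell
import Data.List.Split (splitOn)   -- assumption: the `split` package
import Text.Parsec
import Text.Parsec.String (Parser)

-- A tiny field parser, just enough for the illustration.
field :: Parser (String, String)
field = (,) <$> count 3 letter <* char ':' <*> many1 (noneOf " \n")

passport :: Parser [(String, String)]
passport = field `sepEndBy1` many1 space

-- Split first, then parse each block on its own, so one bad passport only
-- fails its own block instead of the whole file.
parseOneByOne :: String -> [Either ParseError [(String, String)]]
parseOneByOne = map (parse passport "block") . splitOn "\n\n"
```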