codespell

2026-01-23 02:14:13 +00:00 · 2022-08-20 09:29:44 -04:00
commit d8be06b6bb
parent 7c9d0e291d
4 changed files with 26 additions and 33 deletions
--- a/.github/workflows/codespell.yml
+++ b/.github/workflows/codespell.yml
@@ -33,7 +33,4 @@ jobs:
        with:
          check_filenames: true
          ignore_words_file: .codespellignore
-          # ignore_words_list: denom,inout,iput,nd,nin,numer,te,wee
-          # There is a word "RO" in docs/src/shapes-of-data.md.in and docs/src/shapes-of-data.md
-          # which is listed in .codespellignore but which codespell refuses to ignore. Not sure why.
          skip: "*.csv,*.dkvp,*.txt,*.js,*.html,*.map,./tags,./test/cases,./docs/src/shapes-of-data.md.in,./docs/src/shapes-of-data.md"
--- a/docs/src/data/colours.csv
+++ b/docs/src/data/colours.csv
@@ -1,3 +1,3 @@
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR
 masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz
 masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
--- a/docs/src/shapes-of-data.md
+++ b/docs/src/shapes-of-data.md
@@ -36,7 +36,7 @@ Use the `file` command to see if there are CR/LF terminators (in this case, ther
 <b>file data/colours.csv </b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-data/colours.csv: UTF-8 Unicode text
+data/colours.csv: Unicode text, UTF-8 text
 </pre>

 Look at the file to find names of fields:
@@ -45,18 +45,15 @@ Look at the file to find names of fields:
 <b>cat data/colours.csv </b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR
-masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Witter;Biały;Alb;Beyaz
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR
+masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz
 masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
 </pre>

 Extract a few fields:

-<pre class="pre-highlight-in-pair">
-<b>mlr --csv cut -f KEY,PL,RO data/colours.csv </b>
-</pre>
-<pre class="pre-non-highlight-in-pair">
-(only blank lines appear)
+<pre class="pre-highlight-non-pair">
+<b>mlr --csv cut -f KEY,PL,TO data/colours.csv </b>
 </pre>

 Use XTAB output format to get a sharper picture of where records/fields are being split:
@@ -65,12 +62,12 @@ Use XTAB output format to get a sharper picture of where records/fields are bein
 <b>mlr --icsv --oxtab cat data/colours.csv </b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Witter;Biały;Alb;Beyaz
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz

-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
 </pre>

-Using XTAB output format makes it clearer that `KEY;DE;...;RO;TR` is being treated as a single field name in the CSV header, and likewise each subsequent line is being treated as a single field value. This is because the default field separator is a comma but we have semicolons here.  Use XTAB again with different field separator (`--fs semicolon`):
+Using XTAB output format makes it clearer that `KEY;DE;...;TR` is being treated as a single field name in the CSV header, and likewise each subsequent line is being treated as a single field value. This is because the default field separator is a comma but we have semicolons here.  Use XTAB again with different field separator (`--fs semicolon`):

 <pre class="pre-highlight-in-pair">
 <b>mlr --icsv --ifs semicolon --oxtab cat data/colours.csv </b>
@@ -83,9 +80,9 @@ ES  Blanco
 FI  Valkoinen
 FR  Blanc
 IT  Bianco
-NL  Witter
+NL  Wit
 PL  Biały
-RO  Alb
+TO  Alb
 TR  Beyaz

 KEY masterdata_colourcode_2
@@ -97,17 +94,17 @@ FR  Noir
 IT  Nero
 NL  Zwart
 PL  Czarny
-RO  Negru
+TO  Negru
 TR  Siyah
 </pre>

 Using the new field-separator, retry the cut:

 <pre class="pre-highlight-in-pair">
-<b>mlr --csv --fs semicolon cut -f KEY,PL,RO data/colours.csv </b>
+<b>mlr --csv --fs semicolon cut -f KEY,PL,TO data/colours.csv </b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-KEY;PL;RO
+KEY;PL;TO
 masterdata_colourcode_1;Biały;Alb
 masterdata_colourcode_2;Czarny;Negru
 </pre>
--- a/docs/src/shapes-of-data.md.in
+++ b/docs/src/shapes-of-data.md.in
@@ -18,35 +18,34 @@ Use the `file` command to see if there are CR/LF terminators (in this case, ther

 GENMD-CARDIFY-HIGHLIGHT-ONE
 file data/colours.csv 
-data/colours.csv: UTF-8 Unicode text
+data/colours.csv: Unicode text, UTF-8 text
 GENMD-EOF

 Look at the file to find names of fields:

 GENMD-CARDIFY-HIGHLIGHT-ONE
 cat data/colours.csv 
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR
-masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Witter;Biały;Alb;Beyaz
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR
+masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz
 masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
 GENMD-EOF

 Extract a few fields:

 GENMD-CARDIFY-HIGHLIGHT-ONE
-mlr --csv cut -f KEY,PL,RO data/colours.csv 
-(only blank lines appear)
+mlr --csv cut -f KEY,PL,TO data/colours.csv 
 GENMD-EOF

 Use XTAB output format to get a sharper picture of where records/fields are being split:

 GENMD-CARDIFY-HIGHLIGHT-ONE
 mlr --icsv --oxtab cat data/colours.csv 
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Witter;Biały;Alb;Beyaz
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz

-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
 GENMD-EOF

-Using XTAB output format makes it clearer that `KEY;DE;...;RO;TR` is being treated as a single field name in the CSV header, and likewise each subsequent line is being treated as a single field value. This is because the default field separator is a comma but we have semicolons here.  Use XTAB again with different field separator (`--fs semicolon`):
+Using XTAB output format makes it clearer that `KEY;DE;...;TR` is being treated as a single field name in the CSV header, and likewise each subsequent line is being treated as a single field value. This is because the default field separator is a comma but we have semicolons here.  Use XTAB again with different field separator (`--fs semicolon`):

 GENMD-CARDIFY-HIGHLIGHT-ONE
 mlr --icsv --ifs semicolon --oxtab cat data/colours.csv 
@@ -57,9 +56,9 @@ ES  Blanco
 FI  Valkoinen
 FR  Blanc
 IT  Bianco
-NL  Witter
+NL  Wit
 PL  Biały
-RO  Alb
+TO  Alb
 TR  Beyaz

 KEY masterdata_colourcode_2
@@ -71,15 +70,15 @@ FR  Noir
 IT  Nero
 NL  Zwart
 PL  Czarny
-RO  Negru
+TO  Negru
 TR  Siyah
 GENMD-EOF

 Using the new field-separator, retry the cut:

 GENMD-CARDIFY-HIGHLIGHT-ONE
-mlr --csv --fs semicolon cut -f KEY,PL,RO data/colours.csv 
-KEY;PL;RO
+mlr --csv --fs semicolon cut -f KEY,PL,TO data/colours.csv 
+KEY;PL;TO
 masterdata_colourcode_1;Biały;Alb
 masterdata_colourcode_2;Czarny;Negru
 GENMD-EOF
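
The separator behaviour that the changed docs walk through can be reproduced outside Miller. The following is a minimal sketch using Python's standard `csv` module, not `mlr` itself; the sample text is copied from `data/colours.csv` as it reads after this commit, and `read_records` is a hypothetical helper name introduced only for illustration. It shows that with a comma delimiter the whole header is read as one field name, while a semicolon delimiter yields the separate `KEY`, `PL`, and `TO` fields.

```python
import csv
import io

# Sample copied from data/colours.csv as shown in the diff above.
SAMPLE = (
    "KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR\n"
    "masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz\n"
    "masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah\n"
)


def read_records(text, delimiter):
    """Parse CSV text into a list of dicts keyed by the header row.

    Hypothetical helper for this sketch; not part of Miller.
    """
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Comma delimiter: each line contains no commas, so it is one single field.
comma_records = read_records(SAMPLE, ",")
print(len(comma_records[0]))  # 1

# Semicolon delimiter: the header splits into eleven field names.
semicolon_records = read_records(SAMPLE, ";")
print(len(semicolon_records[0]))  # 11

# Rough analogue of `cut -f KEY,PL,TO` on the correctly split records.
selected = [(r["KEY"], r["PL"], r["TO"]) for r in semicolon_records]
print(selected)
```

With the comma delimiter there is no field named `KEY`, `PL`, or `TO` to select, which matches the docs' point that the cut only works once the semicolon separator is specified.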