<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Operating on all fields &#8212; Miller 6.0.0-alpha documentation</title>
<link rel="stylesheet" href="_static/scrolls.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="_static/print.css" type="text/css" />
<script id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>
<script src="_static/jquery.js"></script>
<script src="_static/underscore.js"></script>
<script src="_static/doctools.js"></script>
<script src="_static/language_data.js"></script>
<script src="_static/theme_extras.js"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Special symbols and formatting" href="special-symbols-and-formatting.html" />
<link rel="prev" title="Shapes of data" href="shapes-of-data.html" />
</head><body>
<div id="content">
<div class="header">
<h1 class="heading"><a href="index.html"
title="back to the documentation overview"><span>Operating on all fields</span></a></h1>
</div>
<div class="relnav" role="navigation" aria-label="related navigation">
<a href="shapes-of-data.html">&laquo; Shapes of data</a> |
<a href="#">Operating on all fields</a>
| <a href="special-symbols-and-formatting.html">Special symbols and formatting &raquo;</a>
</div>
<div id="contentwrapper">
<div id="toc" role="navigation" aria-label="table of contents navigation">
<h3>Table of Contents</h3>
<ul>
<li><a class="reference internal" href="#">Operating on all fields</a><ul>
<li><a class="reference internal" href="#bulk-rename-of-fields">Bulk rename of fields</a></li>
<li><a class="reference internal" href="#search-and-replace-over-all-fields">Search-and-replace over all fields</a></li>
<li><a class="reference internal" href="#full-field-renames-and-reassigns">Full field renames and reassigns</a></li>
</ul>
</li>
</ul>
</div>
<div role="main">
<div class="section" id="operating-on-all-fields">
<h1>Operating on all fields<a class="headerlink" href="#operating-on-all-fields" title="Permalink to this headline"></a></h1>
<div class="section" id="bulk-rename-of-fields">
<h2>Bulk rename of fields<a class="headerlink" href="#bulk-rename-of-fields" title="Permalink to this headline"></a></h2>
<p>Suppose you want to replace spaces with underscores in your column names:</p>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat data/spaces.csv
</span> a b c,def,g h i
123,4567,890
2468,1357,3579
9987,3312,4543
</pre></div>
</div>
<p>The simplest way is to use <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">rename</span></code> with <code class="docutils literal notranslate"><span class="pre">-g</span></code> (global replace: every space within each field name, not just the first) and <code class="docutils literal notranslate"><span class="pre">-r</span></code> (match field names by regular expression rather than by explicit single-column names):</p>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv rename -g -r &#39; ,_&#39; data/spaces.csv
</span> a_b_c,def,g_h_i
123,4567,890
2468,1357,3579
9987,3312,4543
</pre></div>
</div>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv --opprint rename -g -r &#39; ,_&#39; data/spaces.csv
</span> a_b_c def  g_h_i
 123   4567 890
 2468  1357 3579
 9987  3312 4543
</pre></div>
</div>
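<p>To see what <code class="docutils literal notranslate"><span class="pre">-g</span></code> contributes, here is a minimal sketch that runs the same rename without it. Going by the description of the flag above, only the first space in each field name should be replaced, so the header would be expected to read <code class="docutils literal notranslate"><span class="pre">a_b c,def,g_h i</span></code>. That result is inferred from the flag's description, not captured from a run.</p>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv rename -r &#39; ,_&#39; data/spaces.csv
</span></pre></div>
</div>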
<p>You can also do this with a for-loop:</p>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat data/bulk-rename-for-loop.mlr
</span> map newrec = {};
 for (oldk, v in $*) {
     newrec[gsub(oldk, &quot; &quot;, &quot;_&quot;)] = v;
 }
 $* = newrec
</pre></div>
</div>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --icsv --opprint put -f data/bulk-rename-for-loop.mlr data/spaces.csv
</span> a_b_c def  g_h_i
 123   4567 890
 2468  1357 3579
 9987  3312 4543
</pre></div>
</div>
</div>
<div class="section" id="search-and-replace-over-all-fields">
<h2>Search-and-replace over all fields<a class="headerlink" href="#search-and-replace-over-all-fields" title="Permalink to this headline"></a></h2>
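<p>How do you apply <code class="docutils literal notranslate"><span class="pre">$name</span> <span class="pre">=</span> <span class="pre">gsub($name,</span> <span class="pre">&quot;old&quot;,</span> <span class="pre">&quot;new&quot;)</span></code> to every field rather than to one named field? Given this input, a for-loop over the keys of <code class="docutils literal notranslate"><span class="pre">$*</span></code> does it:</p>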
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat data/sar.csv
</span> a,b,c
the quick,brown fox,jumped
over,the,lazy dogs
</pre></div>
</div>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat data/sar.mlr
</span> for (k in $*) {
   $[k] = gsub($[k], &quot;e&quot;, &quot;X&quot;);
 }
</pre></div>
</div>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv put -f data/sar.mlr data/sar.csv
</span> a,b,c
thX quick,brown fox,jumpXd
ovXr,thX,lazy dogs
</pre></div>
</div>
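<p>The loop above can also be written inline. This is a minimal sketch, not the only way to do it: it uses the two-variable for-loop, which binds each key and value together, so no separate <code class="docutils literal notranslate"><span class="pre">.mlr</span></code> file is needed. It should produce the same output as the <code class="docutils literal notranslate"><span class="pre">put</span> <span class="pre">-f</span></code> version. Keep in mind that the second argument to <code class="docutils literal notranslate"><span class="pre">gsub</span></code> is a regular expression, so characters such as <code class="docutils literal notranslate"><span class="pre">.</span></code> need escaping if you mean them literally.</p>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr --csv put &#39;for (k, v in $*) { $[k] = gsub(v, &quot;e&quot;, &quot;X&quot;); }&#39; data/sar.csv
</span></pre></div>
</div>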
</div>
<div class="section" id="full-field-renames-and-reassigns">
<h2>Full field renames and reassigns<a class="headerlink" href="#full-field-renames-and-reassigns" title="Permalink to this headline"></a></h2>
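<p>Using map literals (introduced in Miller 5.0.0) and assigning to <code class="docutils literal notranslate"><span class="pre">$*</span></code>, you can fully generalize <a class="reference internal" href="reference-verbs.html#reference-verbs-rename"><span class="std std-ref">mlr rename</span></a>, <a class="reference internal" href="reference-verbs.html#reference-verbs-reorder"><span class="std std-ref">mlr reorder</span></a>, etc. In the example below, the <code class="docutils literal notranslate"><span class="pre">y</span></code> in <code class="docutils literal notranslate"><span class="pre">$x</span> <span class="pre">+</span> <span class="pre">y</span></code> is an unset local variable, not the field <code class="docutils literal notranslate"><span class="pre">$y</span></code>, which is why <code class="docutils literal notranslate"><span class="pre">z</span></code> comes out equal to <code class="docutils literal notranslate"><span class="pre">$x</span></code> in the output. Write <code class="docutils literal notranslate"><span class="pre">$x</span> <span class="pre">+</span> <span class="pre">$y</span></code> if you want the sum of the two fields.</p>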
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> cat data/small
</span> a=pan,b=pan,i=1,x=0.3467901443380824,y=0.7268028627434533
a=eks,b=pan,i=2,x=0.7586799647899636,y=0.5221511083334797
a=wye,b=wye,i=3,x=0.20460330576630303,y=0.33831852551664776
a=eks,b=wye,i=4,x=0.38139939387114097,y=0.13418874328430463
a=wye,b=pan,i=5,x=0.5732889198020006,y=0.8636244699032729
</pre></div>
</div>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr put &#39;
</span><span class="hll">   begin {
</span><span class="hll">     @i_cumu = 0;
</span><span class="hll">   }
</span><span class="hll">
</span><span class="hll">   @i_cumu += $i;
</span><span class="hll">   $* = {
</span><span class="hll">     &quot;z&quot;: $x + y,
</span><span class="hll">     &quot;KEYFIELD&quot;: $a,
</span><span class="hll">     &quot;i&quot;: @i_cumu,
</span><span class="hll">     &quot;b&quot;: $b,
</span><span class="hll">     &quot;y&quot;: $x,
</span><span class="hll">     &quot;x&quot;: $y,
</span><span class="hll">   };
</span><span class="hll"> &#39; data/small
</span> z=0.3467901443380824,KEYFIELD=pan,i=1,b=pan,y=0.3467901443380824,x=0.7268028627434533
z=0.7586799647899636,KEYFIELD=eks,i=3,b=pan,y=0.7586799647899636,x=0.5221511083334797
z=0.20460330576630303,KEYFIELD=wye,i=6,b=wye,y=0.20460330576630303,x=0.33831852551664776
z=0.38139939387114097,KEYFIELD=eks,i=10,b=wye,y=0.38139939387114097,x=0.13418874328430463
z=0.5732889198020006,KEYFIELD=wye,i=15,b=pan,y=0.5732889198020006,x=0.8636244699032729
</pre></div>
</div>
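<p>When only a few fields need to change, listing every field in the map literal gets tedious. The following is a minimal sketch, assuming the map functions <code class="docutils literal notranslate"><span class="pre">mapsum</span></code> and <code class="docutils literal notranslate"><span class="pre">mapexcept</span></code> are available in your Miller build: it renames <code class="docutils literal notranslate"><span class="pre">a</span></code> to <code class="docutils literal notranslate"><span class="pre">KEYFIELD</span></code>, moves it to the front, and carries the remaining fields through unchanged. Each output record should start with <code class="docutils literal notranslate"><span class="pre">KEYFIELD</span></code>, followed by <code class="docutils literal notranslate"><span class="pre">b</span></code>, <code class="docutils literal notranslate"><span class="pre">i</span></code>, <code class="docutils literal notranslate"><span class="pre">x</span></code>, and <code class="docutils literal notranslate"><span class="pre">y</span></code>.</p>
<div class="highlight-none notranslate"><div class="highlight"><pre><span></span><span class="hll"> mlr put &#39;$* = mapsum({&quot;KEYFIELD&quot;: $a}, mapexcept($*, &quot;a&quot;))&#39; data/small
</span></pre></div>
</div>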
</div>
</div>
</div>
</div>
</div>
<div class="footer" role="contentinfo">
&#169; Copyright 2021, John Kerl.
Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 3.2.1.
</div>
</body>
</html>