mirror of
https://github.com/johnkerl/miller.git
synced 2026-01-23 02:14:13 +00:00
162 lines
No EOL
14 KiB
HTML
162 lines
No EOL
14 KiB
HTML
|
||
<!DOCTYPE html>
|
||
|
||
<html>
|
||
<head>
|
||
<meta charset="utf-8" />
|
||
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||
<title>Unix-toolkit context — Miller 5.10.2 documentation</title>
|
||
<link rel="stylesheet" href="_static/classic.css" type="text/css" />
|
||
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
|
||
|
||
<script id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>
|
||
<script src="_static/jquery.js"></script>
|
||
<script src="_static/underscore.js"></script>
|
||
<script src="_static/doctools.js"></script>
|
||
<script src="_static/language_data.js"></script>
|
||
|
||
<link rel="index" title="Index" href="genindex.html" />
|
||
<link rel="search" title="Search" href="search.html" />
|
||
<link rel="next" title="File formats" href="file-formats.html" />
|
||
<link rel="prev" title="Miller in 10 minutes" href="10min.html" />
|
||
</head><body>
|
||
<div class="related" role="navigation" aria-label="related navigation">
|
||
<h3>Navigation</h3>
|
||
<ul>
|
||
<li class="right" style="margin-right: 10px">
|
||
<a href="genindex.html" title="General Index"
|
||
accesskey="I">index</a></li>
|
||
<li class="right" >
|
||
<a href="file-formats.html" title="File formats"
|
||
accesskey="N">next</a> |</li>
|
||
<li class="right" >
|
||
<a href="10min.html" title="Miller in 10 minutes"
|
||
accesskey="P">previous</a> |</li>
|
||
<li class="nav-item nav-item-0"><a href="index.html">Miller 5.10.2 documentation</a> »</li>
|
||
<li class="nav-item nav-item-this"><a href="">Unix-toolkit context</a></li>
|
||
</ul>
|
||
</div>
|
||
|
||
<div class="document">
|
||
<div class="documentwrapper">
|
||
<div class="bodywrapper">
|
||
<div class="body" role="main">
|
||
|
||
<div class="section" id="unix-toolkit-context">
|
||
<h1>Unix-toolkit context<a class="headerlink" href="#unix-toolkit-context" title="Permalink to this headline">¶</a></h1>
|
||
<p>How does Miller fit within the Unix toolkit (<cite>grep</cite>, <cite>sed</cite>, <cite>awk</cite>, etc.)?</p>
|
||
<div class="section" id="file-format-awareness">
|
||
<h2>File-format awareness<a class="headerlink" href="#file-format-awareness" title="Permalink to this headline">¶</a></h2>
|
||
<p>Miller respects CSV headers. If you do <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">--csv</span> <span class="pre">cat</span> <span class="pre">*.csv</span></code> then the header line is written once:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>$ cat data/a.csv
|
||
a,b,c
|
||
1,2,3
|
||
4,5,6
|
||
</pre></div>
|
||
</div>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>$ cat data/b.csv
|
||
a,b,c
|
||
7,8,9
|
||
</pre></div>
|
||
</div>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>$ mlr --csv cat data/a.csv data/b.csv
|
||
a,b,c
|
||
1,2,3
|
||
4,5,6
|
||
7,8,9
|
||
</pre></div>
|
||
</div>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>$ mlr --csv sort -nr b data/a.csv data/b.csv
|
||
a,b,c
|
||
7,8,9
|
||
4,5,6
|
||
1,2,3
|
||
</pre></div>
|
||
</div>
|
||
<p>Likewise with <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">sort</span></code>, <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">tac</span></code>, and so on.</p>
|
||
</div>
|
||
<div class="section" id="awk-like-features-mlr-filter-and-mlr-put">
|
||
<h2>awk-like features: mlr filter and mlr put<a class="headerlink" href="#awk-like-features-mlr-filter-and-mlr-put" title="Permalink to this headline">¶</a></h2>
|
||
<ul class="simple">
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">filter</span></code> includes/excludes records based on a filter expression, e.g. <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">filter</span> <span class="pre">'$count</span> <span class="pre">></span> <span class="pre">10'</span></code>.</p></li>
|
||
<li><p><code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">put</span></code> adds a new field as a function of others, e.g. <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">put</span> <span class="pre">'$xy</span> <span class="pre">=</span> <span class="pre">$x</span> <span class="pre">*</span> <span class="pre">$y'</span></code> or <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">put</span> <span class="pre">'$counter</span> <span class="pre">=</span> <span class="pre">NR'</span></code>.</p></li>
|
||
<li><p>The <code class="docutils literal notranslate"><span class="pre">$name</span></code> syntax is straight from <code class="docutils literal notranslate"><span class="pre">awk</span></code>’s <code class="docutils literal notranslate"><span class="pre">$1</span> <span class="pre">$2</span> <span class="pre">$3</span></code> (adapted to name-based indexing), as are the variables <code class="docutils literal notranslate"><span class="pre">FS</span></code>, <code class="docutils literal notranslate"><span class="pre">OFS</span></code>, <code class="docutils literal notranslate"><span class="pre">RS</span></code>, <code class="docutils literal notranslate"><span class="pre">ORS</span></code>, <code class="docutils literal notranslate"><span class="pre">NF</span></code>, <code class="docutils literal notranslate"><span class="pre">NR</span></code>, and <code class="docutils literal notranslate"><span class="pre">FILENAME</span></code>. The <code class="docutils literal notranslate"><span class="pre">ENV[...]</span></code> syntax is from Ruby.</p></li>
|
||
<li><p>While <code class="docutils literal notranslate"><span class="pre">awk</span></code> functions are record-based, Miller subcommands (or <em>verbs</em>) are stream-based: each of them maps a stream of records into another stream of records.</p></li>
|
||
<li><p>Like <code class="docutils literal notranslate"><span class="pre">awk</span></code>, Miller (as of v5.0.0) allows you to define new functions within its <code class="docutils literal notranslate"><span class="pre">put</span></code> and <code class="docutils literal notranslate"><span class="pre">filter</span></code> expression language. Further programmability comes from chaining with <code class="docutils literal notranslate"><span class="pre">then</span></code>.</p></li>
|
||
<li><p>As with <code class="docutils literal notranslate"><span class="pre">awk</span></code>, <code class="docutils literal notranslate"><span class="pre">$</span></code>-variables are stream variables and all verbs (such as <code class="docutils literal notranslate"><span class="pre">cut</span></code>, <code class="docutils literal notranslate"><span class="pre">stats1</span></code>, <code class="docutils literal notranslate"><span class="pre">put</span></code>, etc.) as well as <code class="docutils literal notranslate"><span class="pre">put</span></code>/<code class="docutils literal notranslate"><span class="pre">filter</span></code> statements operate on streams. This means that you define actions to be done on each record and then stream your data through those actions. The built-in variables <code class="docutils literal notranslate"><span class="pre">NF</span></code>, <code class="docutils literal notranslate"><span class="pre">NR</span></code>, etc. change from one line to another, <code class="docutils literal notranslate"><span class="pre">$x</span></code> is a label for field <code class="docutils literal notranslate"><span class="pre">x</span></code> in the current record, and the input to <code class="docutils literal notranslate"><span class="pre">sqrt($x)</span></code> changes from one record to the next. The expression language for the <code class="docutils literal notranslate"><span class="pre">put</span></code> and <code class="docutils literal notranslate"><span class="pre">filter</span></code> verbs additionally allows you to define <code class="docutils literal notranslate"><span class="pre">begin</span> <span class="pre">{...}</span></code> and <code class="docutils literal notranslate"><span class="pre">end</span> <span class="pre">{...}</span></code> blocks for actions to be taken before and after records are processed, respectively.</p></li>
|
||
<li><p>As with <code class="docutils literal notranslate"><span class="pre">awk</span></code>, Miller’s <code class="docutils literal notranslate"><span class="pre">put</span></code>/<code class="docutils literal notranslate"><span class="pre">filter</span></code> language lets you set <code class="docutils literal notranslate"><span class="pre">@sum=0</span></code> before records are read, then update that sum on each record, then print its value at the end. Unlike <code class="docutils literal notranslate"><span class="pre">awk</span></code>, Miller makes syntactically explicit the difference between variables with extent across all records (names starting with <code class="docutils literal notranslate"><span class="pre">@</span></code>, such as <code class="docutils literal notranslate"><span class="pre">@sum</span></code>) and variables which are local to the current expression (names starting without <code class="docutils literal notranslate"><span class="pre">@</span></code>, such as <code class="docutils literal notranslate"><span class="pre">sum</span></code>).</p></li>
|
||
<li><p>Miller can be faster than <code class="docutils literal notranslate"><span class="pre">awk</span></code>, <code class="docutils literal notranslate"><span class="pre">cut</span></code>, and so on, depending on platform; see also <a class="reference internal" href="performance.html"><span class="doc">Performance</span></a>. In particular, Miller’s DSL syntax is parsed into C control structures at startup time, with the bulk data-stream processing all done in C.</p></li>
|
||
</ul>
|
||
</div>
|
||
<div class="section" id="see-also">
|
||
<h2>See also<a class="headerlink" href="#see-also" title="Permalink to this headline">¶</a></h2>
|
||
<p>See <a class="reference internal" href="reference-verbs.html"><span class="doc">Verbs reference</span></a> for more on Miller’s subcommands <code class="docutils literal notranslate"><span class="pre">cat</span></code>, <code class="docutils literal notranslate"><span class="pre">cut</span></code>, <code class="docutils literal notranslate"><span class="pre">head</span></code>, <code class="docutils literal notranslate"><span class="pre">sort</span></code>, <code class="docutils literal notranslate"><span class="pre">tac</span></code>, <code class="docutils literal notranslate"><span class="pre">tail</span></code>, <code class="docutils literal notranslate"><span class="pre">top</span></code>, and <code class="docutils literal notranslate"><span class="pre">uniq</span></code>, as well as <a class="reference internal" href="reference-dsl.html"><span class="doc">DSL reference</span></a> for more on the awk-like <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">filter</span></code> and <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">put</span></code>.</p>
|
||
</div>
|
||
</div>
|
||
|
||
|
||
<div class="clearer"></div>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<div class="sphinxsidebar" role="navigation" aria-label="main navigation">
|
||
<div class="sphinxsidebarwrapper">
|
||
<h3><a href="index.html">Table of Contents</a></h3>
|
||
<ul>
|
||
<li><a class="reference internal" href="#">Unix-toolkit context</a><ul>
|
||
<li><a class="reference internal" href="#file-format-awareness">File-format awareness</a></li>
|
||
<li><a class="reference internal" href="#awk-like-features-mlr-filter-and-mlr-put">awk-like features: mlr filter and mlr put</a></li>
|
||
<li><a class="reference internal" href="#see-also">See also</a></li>
|
||
</ul>
|
||
</li>
|
||
</ul>
|
||
|
||
<h4>Previous topic</h4>
|
||
<p class="topless"><a href="10min.html"
|
||
title="previous chapter">Miller in 10 minutes</a></p>
|
||
<h4>Next topic</h4>
|
||
<p class="topless"><a href="file-formats.html"
|
||
title="next chapter">File formats</a></p>
|
||
<div role="note" aria-label="source link">
|
||
<h3>This Page</h3>
|
||
<ul class="this-page-menu">
|
||
<li><a href="_sources/feature-comparison.rst.txt"
|
||
rel="nofollow">Show Source</a></li>
|
||
</ul>
|
||
</div>
|
||
<div id="searchbox" style="display: none" role="search">
|
||
<h3 id="searchlabel">Quick search</h3>
|
||
<div class="searchformwrapper">
|
||
<form class="search" action="search.html" method="get">
|
||
<input type="text" name="q" aria-labelledby="searchlabel" />
|
||
<input type="submit" value="Go" />
|
||
</form>
|
||
</div>
|
||
</div>
|
||
<script>$('#searchbox').show(0);</script>
|
||
</div>
|
||
</div>
|
||
<div class="clearer"></div>
|
||
</div>
|
||
<div class="related" role="navigation" aria-label="related navigation">
|
||
<h3>Navigation</h3>
|
||
<ul>
|
||
<li class="right" style="margin-right: 10px">
|
||
<a href="genindex.html" title="General Index"
|
||
>index</a></li>
|
||
<li class="right" >
|
||
<a href="file-formats.html" title="File formats"
|
||
>next</a> |</li>
|
||
<li class="right" >
|
||
<a href="10min.html" title="Miller in 10 minutes"
|
||
>previous</a> |</li>
|
||
<li class="nav-item nav-item-0"><a href="index.html">Miller 5.10.2 documentation</a> »</li>
|
||
<li class="nav-item nav-item-this"><a href="">Unix-toolkit context</a></li>
|
||
</ul>
|
||
</div>
|
||
<div class="footer" role="contentinfo">
|
||
© Copyright 2020, John Kerl.
|
||
Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 3.2.1.
|
||
</div>
|
||
</body>
|
||
</html> |