mirror of
https://github.com/johnkerl/miller.git
synced 2026-01-24 02:36:15 +00:00
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Introduction — Miller 6.0.0-alpha documentation</title>
<link rel="stylesheet" href="_static/scrolls.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="_static/print.css" type="text/css" />
<script id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>
<script src="_static/jquery.js"></script>
<script src="_static/underscore.js"></script>
<script src="_static/doctools.js"></script>
<script src="_static/language_data.js"></script>
<script src="_static/theme_extras.js"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Miller in 10 minutes" href="10min.html" />
<link rel="prev" title="Miller Documentation" href="index.html" />
</head><body>
<div id="content">
<div class="header">
<h1 class="heading"><a href="index.html"
title="back to the documentation overview"><span>Introduction</span></a></h1>
</div>
<div class="relnav" role="navigation" aria-label="related navigation">
<a href="index.html">« Miller Documentation</a> |
<a href="#">Introduction</a>
| <a href="10min.html">Miller in 10 minutes »</a>
</div>
<div id="contentwrapper">
<div role="main">
<div class="section" id="introduction">
<h1>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline">¶</a></h1>
<p><strong>Miller is a command-line tool for querying, shaping, and reformatting data files in various formats including CSV and JSON.</strong></p>
<p>In several senses, Miller is more than one tool:</p>
<p><strong>Format conversion:</strong> You can convert CSV files to JSON, or vice versa, or pretty-print your data horizontally or vertically to make it easier to read.</p>
<p><strong>Data manipulation:</strong> With a few keystrokes you can remove columns you don’t care about, or make new ones using expressions like <code class="docutils literal notranslate"><span class="pre">$rate</span> <span class="pre">=</span> <span class="pre">$units</span> <span class="pre">/</span> <span class="pre">$seconds</span></code>.</p>
<p><strong>Pre-processing/post-processing vs standalone use:</strong> You can use Miller to clean data files and put them into standard formats, perhaps in preparation for loading them into a database or a hands-off data-processing pipeline. Or you can use it to post-process and summarize database-query output. You can also use Miller to explore and analyze your data interactively.</p>
<p><strong>Compact verbs vs programming language:</strong> With minimal keystroking you can do things like <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">--csv</span> <span class="pre">sort</span> <span class="pre">-f</span> <span class="pre">name</span> <span class="pre">input.csv</span></code> or <code class="docutils literal notranslate"><span class="pre">mlr</span> <span class="pre">--json</span> <span class="pre">head</span> <span class="pre">-n</span> <span class="pre">1</span> <span class="pre">myfile.json</span></code>. The <code class="docutils literal notranslate"><span class="pre">sort</span></code>, <code class="docutils literal notranslate"><span class="pre">head</span></code>, etc. are called <em>verbs</em>. They’re analogs of familiar command-line tools like <code class="docutils literal notranslate"><span class="pre">sort</span></code>, <code class="docutils literal notranslate"><span class="pre">head</span></code>, and so on, but they’re aware of name-indexed, multi-line file formats like CSV and JSON. In addition, using Miller’s <code class="docutils literal notranslate"><span class="pre">put</span></code> verb you can write programming-language statements such as <code class="docutils literal notranslate"><span class="pre">$rate</span> <span class="pre">=</span> <span class="pre">$units</span> <span class="pre">/</span> <span class="pre">$seconds</span></code>, which allow you to succinctly express your own logic.</p>
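<p>As a rough illustration of what such verbs do to a stream of records, the following Python sketch mimics <code class="docutils literal notranslate"><span class="pre">sort</span> <span class="pre">-f</span> <span class="pre">name</span></code>, <code class="docutils literal notranslate"><span class="pre">head</span> <span class="pre">-n</span> <span class="pre">1</span></code>, and the <code class="docutils literal notranslate"><span class="pre">put</span></code> expression on a small list of records. This is not how Miller is implemented; the sample records and the helper names <code class="docutils literal notranslate"><span class="pre">sort_f</span></code>, <code class="docutils literal notranslate"><span class="pre">head_n</span></code>, and <code class="docutils literal notranslate"><span class="pre">put_rate</span></code> are hypothetical and exist only for this example.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre>
```python
# Hypothetical sample records: each record maps field names to values.
records = [
    {"name": "carol", "units": 30, "seconds": 6},
    {"name": "alice", "units": 10, "seconds": 4},
    {"name": "bob", "units": 12, "seconds": 3},
]


def sort_f(recs, field):
    """Sort records in ascending lexical order of one field (loosely: sort -f)."""
    return sorted(recs, key=lambda rec: str(rec[field]))


def head_n(recs, n):
    """Keep only the first n records (loosely: head -n)."""
    return recs[:n]


def put_rate(recs):
    """Return copies of the records with a computed rate field appended
    (loosely: put with $rate = $units / $seconds)."""
    return [dict(rec, rate=rec["units"] / rec["seconds"]) for rec in recs]


by_name = sort_f(records, "name")
first = head_n(by_name, 1)
with_rate = put_rate(records)

print([rec["name"] for rec in by_name])
print(first)
print(with_rate[0])
```
</pre></div></div>
<p>Each helper takes a list of records and returns a new list, so the steps can be chained, in the same spirit as combining verbs on the command line.</p>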
<p><strong>Multiple domains:</strong> People use Miller for data analysis, data science, software engineering, devops/system-administration, journalism, scientific research, and more.</p>
<p>In the following (color added for the illustration) you can see how CSV, tabular, JSON, and other <strong>file formats</strong> share a common theme: <strong>lists of key-value pairs</strong>. Miller embraces this common theme.</p>
<img alt="_images/cover-combined.png" src="_images/cover-combined.png" />
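<p>The same idea can be sketched in a few lines of Python, assuming a small made-up CSV input: every data line becomes one record of key-value pairs, and that list of records can then be written out as JSON or in a vertical layout. The sample data and variable names are hypothetical, and this sketch only illustrates the shared record model, not Miller's own code.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre>
```python
import csv
import io
import json

# Hypothetical sample data, embedded as a string so the sketch is self-contained.
csv_text = "name,units,seconds\nalice,10,4\nbob,12,3\n"

# Each CSV data line becomes one record: header names are keys, cells are values.
records = list(csv.DictReader(io.StringIO(csv_text)))

# The same records written as JSON: a list of objects with the same keys.
json_text = json.dumps(records, indent=2)

# The same records written vertically: one key-value pair per line,
# with a blank line between records.
vertical_text = "\n\n".join(
    "\n".join(f"{key} {value}" for key, value in rec.items()) for rec in records
)

print(json_text)
print(vertical_text)
```
</pre></div></div>
<p>In this sketch all values stay strings, because <code class="docutils literal notranslate"><span class="pre">csv.DictReader</span></code> does no type conversion; only the layout changes between formats, while the list of key-value pairs stays the same.</p>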
</div>
</div>
</div>
</div>
<div class="footer" role="contentinfo">
© Copyright 2021, John Kerl.
Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 3.2.1.
</div>
</body>
</html>