<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Pipes: using unix pipelines for beautiful answers to quick and dirty questions</title>
	<atom:link href="http://blog.palantirtech.com/2007/02/07/unix-pipelines-are-your-friend/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.palantirtech.com/2007/02/07/unix-pipelines-are-your-friend/</link>
	<description>Articles from the Engineering Group at Palantir Technologies</description>
	<lastBuildDate>Mon, 02 Aug 2010 02:02:26 -0700</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Ari</title>
		<link>http://blog.palantirtech.com/2007/02/07/unix-pipelines-are-your-friend/comment-page-1/#comment-8</link>
		<dc:creator>Ari</dc:creator>
		<pubDate>Tue, 13 Mar 2007 18:34:17 +0000</pubDate>
		<guid isPermaLink="false">http://blog.palantirtech.com/2007/02/07/unix-pipelines-are-your-friend-do-not-be-afraid-little-one/#comment-8</guid>
		<description>Absolutely.  But as with all things, &lt;em&gt;the right tool for the right job&lt;/em&gt; has a lot to do with the job.  Performance considerations will pull people back towards Java or C/C++.  Interfacing directly with hardware will call out a need for C/C++, etc.  In this case, shell was the right choice because it was fast, short, and I knew exactly how to accomplish it in that environment.

For someone not as steeped in shell programming as I am, it might have been easier to accomplish in Perl, Python, or even C or Java.  However, when evaluating what is the right tool for the job, we have to assume some high level of knowledge of the available tools. I happen to know all of the above tools and I&#039;m pretty certain that shell was the right way to go here.  (I&#039;d love to see a different approach that proved me wrong, however!)</description>
		<content:encoded><![CDATA[<p>Absolutely.  But as with all things, <em>the right tool for the right job</em> has a lot to do with the job.  Performance considerations will pull people back towards Java or C/C++.  Interfacing directly with hardware will call out a need for C/C++, etc.  In this case, shell was the right choice because it was fast, short, and I knew exactly how to accomplish it in that environment.</p>
<p>For someone not as steeped in shell programming as I am, it might have been easier to accomplish in Perl, Python, or even C or Java.  However, when evaluating what is the right tool for the job, we have to assume some high level of knowledge of the available tools. I happen to know all of the above tools and I&#8217;m pretty certain that shell was the right way to go here.  (I&#8217;d love to see a different approach that proved me wrong, however!)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: quikchange</title>
		<link>http://blog.palantirtech.com/2007/02/07/unix-pipelines-are-your-friend/comment-page-1/#comment-7</link>
		<dc:creator>quikchange</dc:creator>
		<pubDate>Tue, 13 Mar 2007 15:09:51 +0000</pubDate>
		<guid isPermaLink="false">http://blog.palantirtech.com/2007/02/07/unix-pipelines-are-your-friend-do-not-be-afraid-little-one/#comment-7</guid>
		<description>Fair enough. The fact that it takes only 3 seconds is key though. If it took 2 minutes because there was a much larger number of items to loop over then the trade-off may not have been as convenient. 

Interestingly, your argument for using shell instead of C is basically the same one for using Python or Ruby instead of Java, C#, etc.</description>
		<content:encoded><![CDATA[<p>Fair enough. The fact that it takes only 3 seconds is key though. If it took 2 minutes because there was a much larger number of items to loop over then the trade-off may not have been as convenient. </p>
<p>Interestingly, your argument for using shell instead of C is basically the same one for using Python or Ruby instead of Java, C#, etc.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ari</title>
		<link>http://blog.palantirtech.com/2007/02/07/unix-pipelines-are-your-friend/comment-page-1/#comment-6</link>
		<dc:creator>Ari</dc:creator>
		<pubDate>Mon, 12 Mar 2007 19:01:04 +0000</pubDate>
		<guid isPermaLink="false">http://blog.palantirtech.com/2007/02/07/unix-pipelines-are-your-friend-do-not-be-afraid-little-one/#comment-6</guid>
		<description>It&#039;s true that shell scripts are slow compared to any given compiled language.  However, their interactivity, compactness, power, and speed of iteration are hard to beat.  This is especially true when integrating data from multiple sources (like the output of &lt;code&gt;diff&lt;/code&gt; and &lt;code&gt;svn&lt;/code&gt;) into a single compact report. 

I prototyped the above command, literally, in ten minutes start-to-finish and then emailed out the results to my team.  Rather than having to fire up an editor and deal with a compile/run/edit loop I was editing and running a command line: &lt;i&gt;hit up arrow, edit line, hit enter&lt;/i&gt;.   Lather, rinse, repeat until you see what you want.

And there&#039;s another point to remember: I don&#039;t even know (offhand) how to do this in a compiled language.  Writing that utility would require me to understand the &lt;a href=&quot;http://en.wikipedia.org/wiki/API&quot; rel=&quot;nofollow&quot;&gt;API&lt;/a&gt; that&#039;s available for each of those tools.  Subversion has a public API, and there are countless regex libraries for every language, but I&#039;m not sure how one would do the &lt;code&gt;diff&lt;/code&gt; using library code (a cursory Google search didn&#039;t show anything too promising). All of which is not to say that I have a fear of learning a new API, but I already know how to use these tools, as using them is a core part of what I do.  Learning the Subversion API and reimplementing &lt;code&gt;diff&lt;/code&gt; is not.

And finally: writing a compiled solution would require a lot more code (even assuming that you wouldn&#039;t have to re-implement &lt;code&gt;diff&lt;/code&gt;).  Note that the pipeline is only 42 words and 269 characters long. Even if you were to implement the pipeline as nothing more than &lt;code&gt;system()&lt;/code&gt; calls so you could leverage the power of &lt;code&gt;grep&lt;/code&gt; and &lt;code&gt;diff&lt;/code&gt;, you&#039;d still end up with a program at least three times the size.

So it ends up being a net win in efficiency; the iteration speed and compactness are pretty key.  The time for it to run is about 3 seconds.  The time to develop a minimal compiled solution is about 30 minutes, roughly 20 minutes more than the ten I spent on the pipeline.  20 * 60 / 3 = 400.  So I get to run it about 400 times before this solution costs me anything over a compiled solution.  And I get to keep the flexibility of editing the command line to tweak what I want to see in the report.  (Exercise for the reader: add timestamps to the report lines that indicate the last modified time).

So while I agree that the shell is slow at times, it&#039;s often the fastest route to getting the information you seek.</description>
		<content:encoded><![CDATA[<p>It&#8217;s true that shell scripts are slow compared to any given compiled language.  However, their interactivity, compactness, power, and speed of iteration are hard to beat.  This is especially true when integrating data from multiple sources (like the output of <code>diff</code> and <code>svn</code>) into a single compact report. </p>
<p>I prototyped the above command, literally, in ten minutes start-to-finish and then emailed out the results to my team.  Rather than having to fire up an editor and deal with a compile/run/edit loop I was editing and running a command line: <i>hit up arrow, edit line, hit enter</i>.   Lather, rinse, repeat until you see what you want.</p>
<p>And there&#8217;s another point to remember: I don&#8217;t even know (offhand) how to do this in a compiled language.  Writing that utility would require me to understand the <a href="http://en.wikipedia.org/wiki/API" rel="nofollow">API</a> that&#8217;s available for each of those tools.  Subversion has a public API, and there are countless regex libraries for every language, but I&#8217;m not sure how one would do the <code>diff</code> using library code (a cursory Google search didn&#8217;t show anything too promising). All of which is not to say that I have a fear of learning a new API, but I already know how to use these tools, as using them is a core part of what I do.  Learning the Subversion API and reimplementing <code>diff</code> is not.</p>
<p>And finally: writing a compiled solution would require a lot more code (even assuming that you wouldn&#8217;t have to re-implement <code>diff</code>).  Note that the pipeline is only 42 words and 269 characters long. Even if you were to implement the pipeline as nothing more than <code>system()</code> calls so you could leverage the power of <code>grep</code> and <code>diff</code>, you&#8217;d still end up with a program at least three times the size.</p>
<p>So it ends up being a net win in efficiency; the iteration speed and compactness are pretty key.  The time for it to run is about 3 seconds.  The time to develop a minimal compiled solution is about 30 minutes, roughly 20 minutes more than the ten I spent on the pipeline.  20 * 60 / 3 = 400.  So I get to run it about 400 times before this solution costs me anything over a compiled solution.  And I get to keep the flexibility of editing the command line to tweak what I want to see in the report.  (Exercise for the reader: add timestamps to the report lines that indicate the last modified time).</p>
<p>So while I agree that the shell is slow at times, it&#8217;s often the fastest route to getting the information you seek.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: quikchange</title>
		<link>http://blog.palantirtech.com/2007/02/07/unix-pipelines-are-your-friend/comment-page-1/#comment-4</link>
		<dc:creator>quikchange</dc:creator>
		<pubDate>Mon, 12 Mar 2007 04:16:36 +0000</pubDate>
		<guid isPermaLink="false">http://blog.palantirtech.com/2007/02/07/unix-pipelines-are-your-friend-do-not-be-afraid-little-one/#comment-4</guid>
		<description>Looping in the shell tends to be at least an order of magnitude slower than it would be in a compiled language, which I find too frustrating for frequent interactive use.</description>
		<content:encoded><![CDATA[<p>Looping in the shell tends to be at least an order of magnitude slower than it would be in a compiled language, which I find too frustrating for frequent interactive use.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
