<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Palantir Technologies &#187; coding</title>
	<atom:link href="http:///category/coding/feed/" rel="self" type="application/rss+xml" />
	<link></link>
	<description>Articles from the Engineering Group at Palantir Technologies</description>
	<lastBuildDate>Wed, 14 Dec 2011 17:48:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Introducing Palantir&#8217;s first open source releases</title>
		<link>http://blog.palantirtech.com/2011/12/14/introducing-palantirs-first-open-source-releases/</link>
		<comments>http://blog.palantirtech.com/2011/12/14/introducing-palantirs-first-open-source-releases/#comments</comments>
		<pubDate>Wed, 14 Dec 2011 17:28:31 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[Java Links]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[swing]]></category>

		<guid isPermaLink="false">https://wp-admin-techblog.yojoe.local/?p=1956</guid>
		<description><![CDATA[We&#8217;re big fans of open source. Libraries from Apache, Google, and various projects hosted on SourceForge.net make up a significant fraction of the third-party code we use to build our products. We&#8217;re proud to be making our first set of open source releases with these two projects: Cinch and Sysmon. We think it&#8217;s the right [...]]]></description>
			<content:encoded><![CDATA[<div style='float: left; text-align:right; margin-left:15px; margin-right: 20px; margin-bottom: 10px; margin-top: 10px;'><img src="/wp-content/uploads/2011/12/palantir-ptoss.png" alt="Palantir Technologies Open Source" title="Palantir Technologies Open Source" width='85px'/></div>
<p>We&#8217;re big fans of <a href="http://www.opensource.org/">open source</a>. Libraries from <a href="http://apache.org/">Apache</a>, <a href="https://code.google.com/p/guava-libraries/">Google</a>, and various projects hosted on <a href="http://sourceforge.net/">SourceForge.net</a> make up a significant fraction of the third-party code we use to build our products.</p>
<p>We&#8217;re proud to be making our first set of open source releases with these two projects: <a href="http://github.com/palantir/Cinch">Cinch</a> and <a href="http://github.com/palantir/Sysmon">Sysmon</a>.</p>
<p>We think it&#8217;s the right thing to do, to add our voice to the chorus of developers making software available to freely use, modify, and distribute. These two projects represent our first dip into the open source water &#8211; we&#8217;re just getting started.  As time and other interests allow, we&#8217;ll be making other projects available to the dev community.</p>
<p>We&#8217;ve chosen the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a> to make our contributions as free from encumbrance as possible &#8211; our hope is that many people will find them useful and build on top of them just as we have with our own software.</p>
<h2>The Projects</h2>
<div style='float: right; text-align:right; margin-left:15px; width: 253px;margin-bottom: 10px; margin-top: 10px'><img src="/wp-content/uploads/2011/12/cinch-screenshot.png" alt="code editor showing Cinch annotations" title="code editor showing Cinch annotations" width='233'/></div>
<h3><a href="http://github.com/palantir/Cinch">Cinch</a> &#8211; Cinch makes <a href="https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller">MVC</a> in Swing easy</h3>
<p>Cinch is a Java library for simplifying certain types of GUI code. When developing Swing applications it&#8217;s easy to fall into the trap of not separating out <a href="https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller">Models and Controllers</a>. It&#8217;s all too easy to just store the state of that boolean in the checkbox itself, or that String in the JTextField. The design goal behind Cinch was to make it easier to apply MVC than to not by reducing much of the typical Swing friction and boilerplate. Cinch uses Java annotations to reflectively wire up Models, Views, and Controllers.</p>
<p>Already in heavy use inside the Palantir Government product, Cinch changes GUI development in Java to be similar to iOS and OS X&#8217;s <a href="https://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/CocoaBindings/Concepts/WhatAreBindings.html#//apple_ref/doc/uid/20002372-CJBEJBHH">Cocoa, where annotations are used to bind controls to fields</a>.</p>
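<p>To give a feel for the idea of annotation-driven wiring, here is a minimal sketch. It is not Cinch&#8217;s actual API: the <code>@BoundTo</code> annotation, the <code>Binder</code> class, and the model and view classes are hypothetical names invented for illustration, and the sketch handles only checkboxes.</p>

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import javax.swing.JCheckBox;

// Hypothetical annotation (not Cinch's API): names the model setter a control drives.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface BoundTo {
    String setter();
}

// The model owns the state; the checkbox is only a view of it.
class SettingsModel {
    private boolean loggingEnabled;

    public void setLoggingEnabled(boolean enabled) { this.loggingEnabled = enabled; }
    public boolean isLoggingEnabled() { return loggingEnabled; }
}

// The view declares its controls and which model property each one feeds.
class SettingsView {
    @BoundTo(setter = "setLoggingEnabled")
    final JCheckBox loggingBox = new JCheckBox("Enable logging");
}

// Hypothetical binder: scans the view reflectively and installs the listeners
// that would otherwise be written by hand for every control.
final class Binder {
    private Binder() {}

    static void bind(Object view, Object model) {
        for (Field field : view.getClass().getDeclaredFields()) {
            BoundTo binding = field.getAnnotation(BoundTo.class);
            if (binding == null || !JCheckBox.class.isAssignableFrom(field.getType())) {
                continue; // this sketch only handles annotated checkboxes
            }
            try {
                field.setAccessible(true);
                JCheckBox box = (JCheckBox) field.get(view);
                Method setter = model.getClass().getMethod(binding.setter(), boolean.class);
                box.addItemListener(event -> invoke(setter, model, box.isSelected()));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot bind " + field.getName(), e);
            }
        }
    }

    private static void invoke(Method setter, Object model, boolean value) {
        try {
            setter.invoke(model, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot update model", e);
        }
    }
}
```

<p>After <code>Binder.bind(view, model)</code>, toggling the checkbox updates the model, so the boolean lives in <code>SettingsModel</code> rather than in the control itself.</p>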
<div style='float: right; text-align:right; margin-left:15px; width: 253px'><img src="http://blog.palantir.com/wp-content/uploads/2009/02/monitoringserverscreenshot-badge.png" alt="Graph of CPU usage over time" title="Graph of CPU usage over time" width="233" height="188"/></div>
<h3><a href="http://github.com/palantir/Sysmon">Sysmon</a> &#8211; A lightweight platform monitoring tool for Java VMs</h3>
<p>Sysmon is a lightweight platform monitoring tool. It was designed to gather performance data (CPU, disks, network, etc.) from the host running the Java VM. This data is gathered, packaged, and published via Java Management Extensions (<a href="http://www.oracle.com/technetwork/java/javase/tech/javamanagement-140525.html">JMX</a>) for access using the JMX APIs and standard tools (such as <a href="http://download.oracle.com/javase/6/docs/technotes/guides/management/jconsole.html">jconsole</a>). Sysmon can be run as a standalone daemon or as a library to add platform monitoring to any application.   </p>
<p>Originally built as a component in our <a href="http://blog.palantir.com/2009/02/23/palantir-monitoring-server-where-build-beats-buy/">Palantir cluster monitoring server</a>, this project should be helpful in scenarios where you need to get data off a host platform and into a VM.</p>
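<p>For readers who have not used the JMX APIs, the sketch below shows what reading a published value looks like from code. Sysmon&#8217;s own bean names are not listed in this post, so the example reads the JVM&#8217;s built-in operating system bean instead; the class name <code>JmxReadSketch</code> is made up for illustration.</p>

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class JmxReadSketch {
    private JmxReadSketch() {}

    // Reads one attribute of the JVM's built-in operating system bean by name,
    // the same way a JMX client would read any other published bean.
    static Object readOsAttribute(String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName osBean = new ObjectName(ManagementFactory.OPERATING_SYSTEM_MXBEAN_NAME);
            return server.getAttribute(osBean, attribute);
        } catch (JMException e) {
            throw new IllegalStateException("Could not read " + attribute, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("processors:   " + readOsAttribute("AvailableProcessors"));
        System.out.println("load average: " + readOsAttribute("SystemLoadAverage"));
    }
}
```

<p>Any bean registered with the platform MBean server can be read the same way by swapping in its object name and attribute.</p>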
<h2>Let us know how we&#8217;re doing</h2>
<p>We&#8217;d love to hear from you on how we&#8217;re doing.  Aside from the normal outlets to communicate about the projects themselves (see the mailing lists and issue trackers for each project), please feel free to email me directly, <a href='mailto:agesher@palantir.com'>Ari Gesher</a>, as the curator of these projects.  </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2011/12/14/introducing-palantirs-first-open-source-releases/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>The Coding Interview</title>
		<link>http://blog.palantirtech.com/2011/10/03/the-coding-interview/</link>
		<comments>http://blog.palantirtech.com/2011/10/03/the-coding-interview/#comments</comments>
		<pubDate>Mon, 03 Oct 2011 23:12:07 +0000</pubDate>
		<dc:creator>Allen Chang</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[interviewing]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[palantirtech]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[tips and tricks]]></category>

		<guid isPermaLink="false">https://wp-admin-techblog.yojoe.local/?p=1925</guid>
		<description><![CDATA[Note: this post is part two of our series on doing your best in interviews. Part one: &#8220;How to Rock an Algorithms Interview&#8221;. Here at Palantir algorithms are important, but code is our lifeblood. We live and die by the quality of the code we ship. It’s no surprise, then, that coding ability is what [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 10px; margin-bottom: 10px'><img src="/wp-content/uploads/2011/09/einstein_coding_interview.jpg" alt="Einstein Coding Interview Joke Image" title="einstein_coding_interview" width="300"/></div>
<p><span style='font-size: 0.7em'><em>Note: this post is part two of our series on doing your best in interviews.  Part one: <a href="/2011/09/26/how-to-rock-an-algorithms-interview/" title="How to Rock an Algorithms Interview" target="_blank">&#8220;How to Rock an Algorithms Interview&#8221;</a>.</em></span></p>
<p>Here at Palantir algorithms are important, but code is our lifeblood. We live and die by the quality of the code we ship. It’s no surprise, then, that coding ability is what we stress the most in our interview process. A candidate can get by with mediocre algorithm skills (depending on the role), but no one can skimp on coding.</p>
<p>Suppose you&#8217;re confident in your ability to write great software. Your task in a coding interview (of which there will be several) is to show the interviewers that you in fact do have the programming chops — that you&#8217;re an experienced coder who knows how to write solid, production-quality code.</p>
<p>This is easier said than done. After all, coding in your <a href="http://eclipse.org/">favorite IDE</a> from the comfort of <code>$familiar_place</code> is very different from coding on a whiteboard (on a problem you&#8217;re totally unfamiliar with) in a pressure-filled 45-minute interview. We realize that the interview environment is not the real world, and we adjust our expectations accordingly. Nonetheless, there are a number of things you can do to put your best foot forward during the interview.</p>
<p>First, though, we&#8217;d like to give you a sense for what we look for during a coding interview. Most important is the ability to write clean <strong>and</strong> correct code &mdash; it&#8217;s not enough just to be correct. A lot of people will be interacting with your code once you&#8217;re on the job, so it should be readable, maintainable, and extensible where appropriate. If your solution is clean and correct, and you produced it in a reasonable amount of time without a lot of help, you&#8217;re in good shape. But even if you stumble a bit, there are other ways to demonstrate your ability. As you work, we also watch for debugging ability, problem-solving and analytical skills, creativity, and an understanding of the ecosystem that surrounds production code.</p>
<p>With our evaluation criteria in mind, here are some suggestions we hope will help you perform at your very best.</p>
<p><span id="more-1925"></span></p>
<h2>Before you start coding</h2>
<ul>
<li><strong>Make sure you understand the problem.</strong> Don&#8217;t hesitate to ask questions. Specifically, if any of the problem requirements seem loosely defined or otherwise unclear, ask your interviewer to make things more concrete. There is no penalty for asking for clarifications, and you don&#8217;t want to miss a key requirement or proceed on unfounded assumptions.</li>
<li><strong>Work through simple examples.</strong> This can be useful both before you begin and after you&#8217;ve finished coding. Working through simple examples before coding can give you additional clarity on the nature of the problem — it may help you notice additional cases or patterns in the problem that you would otherwise have missed had you been thinking more abstractly.</li>
<li><strong>Make a plan.</strong> Be wary of jumping into code without thinking about your program&#8217;s high-level structure. You don&#8217;t have to work out every last detail (this can be difficult for more meaty problems), but you should give the matter sufficient thought. Without proper planning, you may be forced to waste your limited time reworking significant parts of your program.</li>
<li><strong>Choose a language.</strong> At Palantir, we don&#8217;t care what languages you know as long as you have a firm grasp on the fundamentals (decomposition, object-oriented design, etc.). That said, you need to be able to communicate with your interviewer, so choose something that both of you can understand. In general, it&#8217;s easier for us if you use Java or C++, but we&#8217;ll try to accommodate other languages. If all else fails, <a href="http://lolcode.com/">devise your own pseudo-code</a>. Just make sure it&#8217;s precise (i.e. not hand-wavy) and internally consistent, and explain your choices as you go.</li>
</ul>
<h2>While you&#8217;re coding</h2>
<ul>
<li><strong>Think out loud.</strong> Explain your thought process to your interviewer as you code. This helps you more fully communicate your solution, and gives your interviewer an opportunity to correct misconceptions or otherwise provide high-level guidance.</li>
<li><strong>Break the problem down and define abstractions.</strong> One crucial skill we look for is the ability to handle complexity by breaking problems into manageable sub-problems. For anything non-trivial, you&#8217;ll want to avoid writing one giant, monolithic function. Feel free to define helper functions, helper classes, and other abstractions to reach a working solution. You can leverage design patterns or other programming idioms as well. Ideally, your solution will be well-factored and as a result easy to read, understand, and prove correct.</li>
<li><strong>Delay the implementation of your helper functions.</strong> (This serves as a corollary to the previous point.) Write out the signature, and make sure you understand the contract your helper will enforce, but don&#8217;t implement it right away. This serves a number of purposes: (1) it shows that you&#8217;re familiar with abstractions (by treating the method as an API); (2) it allows you to maintain momentum towards the overall solution; (3) it results in fewer context-switches for your brain (you can reason about each level of the call stack separately); and (4) your interviewer may grant you the implementation for free, if he or she considers it trivial.</li>
<li><strong>Don&#8217;t get caught up in trivialities.</strong> At Palantir we are much more interested in your general problem solving and coding abilities than your recall of library function names or obscure language syntax. If you can&#8217;t remember exactly how to do something in your chosen language, make something up and just explain to your interviewer that you would look up the specifics in the documentation. Likewise, if you utilize an abstraction or programming idiom which admits a trivial implementation, don&#8217;t be afraid to just write out the interface and omit the implementation so you can concentrate on more important aspects of the problem (e.g., &#8220;I&#8217;m going to use a circular buffer here with the following interface without writing out the full implementation&#8221;).</li>
</ul>
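<p>As a rough illustration of breaking a problem down and delaying helper implementations, consider a made-up exercise: find the most frequent word in a piece of text. The problem, class name, and helper names below are our own invention for this sketch. The top-level method is written first against two helper contracts; the helper bodies are the part you would fill in last, or skip if the interviewer grants them.</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class MostFrequentWord {
    private MostFrequentWord() {}

    // Written first: the overall solution, relying only on the helper contracts.
    static String mostFrequentWord(String text) {
        Map<String, Integer> counts = countOccurrences(tokenize(text));
        String best = null;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Strict comparison keeps the earliest word when counts tie.
            if (best == null || entry.getValue() > counts.get(best)) {
                best = entry.getKey();
            }
        }
        return best; // null when the text contains no words
    }

    // Contract: returns the lower-cased words of text in order, never null.
    // Implemented last; in an interview this body might simply be granted.
    static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // Contract: maps each distinct word to its count, in first-seen order.
    static Map<String, Integer> countOccurrences(List<String> words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }
}
```

<p>Because each helper has a one-line contract, you can reason about <code>mostFrequentWord</code> without holding the tokenizing details in your head.</p>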
<h2>Once you have a solution</h2>
<ul>
<li><strong>Think about edge cases.</strong> Naturally, you should strive for a solution that&#8217;s correct in all observable aspects. Sometimes there will be a flaw in the core logic of your solution, but more often your only bugs will be in how you handle edge cases. (This is true of real-world engineering as well.) Make sure your solution works on all edge cases you can think of. One way you can search for edge-case bugs is to&#8230;</li>
<li><strong>Step through your code.</strong> One of the best ways to check your work is to simulate how your code executes against a sample input. Take one of your earlier examples and make sure your code produces the right result. Huge caveat here: when mentally simulating how your code behaves, your brain will be tempted to project what it wants to happen rather than what the code actually says will happen. Fight this tendency by being as literal as possible. For example, if you&#8217;re calculating a string index with code like <code>str.length()-suffix.length()</code>, don&#8217;t just assume you know where that index will land; actually do the math and make sure the value is what you were hoping for.</li>
<li><strong>Explain the shortcuts you took.</strong> If you skipped things for reasons of expedience that you would otherwise do in a &#8220;real world&#8221; scenario, please let us know what you did and why. For example, &#8220;If I were writing this for production use, I would check an invariant here.&#8221; Since whiteboard coding is an artificial environment, this gives us a sense for how you&#8217;ll treat code once you&#8217;re actually on the job.</li>
</ul>
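<p>Here is what doing the math on <code>str.length()-suffix.length()</code> can look like. The method below is a small sketch of our own (the class and method names are illustrative), with the trace for one ordinary input and one edge case written out in the comments.</p>

```java
final class SuffixCheck {
    private SuffixCheck() {}

    // Returns true when str ends with suffix, computed by hand instead of String.endsWith.
    static boolean hasSuffix(String str, String suffix) {
        // Trace with str = "report.txt" (length 10) and suffix = ".txt" (length 4):
        //   start = 10 - 4 = 6, and str.charAt(6) is '.', so substring(6) is ".txt".
        int start = str.length() - suffix.length();

        // Edge case: str = "txt" (length 3), suffix = ".txt" (length 4) gives start = -1.
        // Without this guard, substring(-1) would throw StringIndexOutOfBoundsException.
        if (start < 0) {
            return false;
        }
        return str.substring(start).equals(suffix);
    }
}
```

<p>Walking the edge case literally is what exposes the negative index; assuming the index &#8220;lands at the suffix&#8221; would have hidden it.</p>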
<p>As an addendum, here are a few suggestions for books we like about the art of software construction:</p>
<p><em><a href='http://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882'>Clean Code: A Handbook of Agile Software Craftsmanship</a></em> &#8211; Robert C. Martin<br />
<em><a href="http://www.cc2e.com/">Code Complete: A Practical Handbook of Software Construction</a></em> &#8211; Steve McConnell<br />
<em><a href="http://cm.bell-labs.com/cm/cs/tpop/">The Practice of Programming</a></em> &#8211; Brian Kernighan, Rob Pike<br />
<em><a href="https://secure.wikimedia.org/wikipedia/en/wiki/Design_Patterns">Design Patterns: Elements of Reusable Object-Oriented Software</a></em> &#8211; Erich Gamma, et al.<br />
<em><a href="http://java.sun.com/docs/books/effective/">Effective Java</a></em> &#8211; Joshua Bloch</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2011/10/03/the-coding-interview/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Tech Talk: the Hedgehog Programming Language</title>
		<link>http://blog.palantirtech.com/2011/06/06/tech-talk-the-hedgehog-programming-language/</link>
		<comments>http://blog.palantirtech.com/2011/06/06/tech-talk-the-hedgehog-programming-language/#comments</comments>
		<pubDate>Mon, 06 Jun 2011 20:53:38 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[distributed systems]]></category>
		<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[problemspace - finance]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[user interface]]></category>

		<guid isPermaLink="false">https://wp-admin-techblog.yojoe.local/?p=1844</guid>
		<description><![CDATA[A few months back, Kevin introduced us to the Hedgehog Programming language &#8211; (here&#8217;s the post if you missed it). The Palantir Finance programming language — Hedgehog as we know it — is an interpreted, statically typed, object-oriented language. With a syntax that’s based loosely on Java, it mixes roughly Java-style semantics and a few [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; width: 300px; margin-bottom: 15px; margin-left: 15px'><a target='new' href='http://www.pfinance.com/'><img src="http://blog.palantir.com/wp-content/uploads/2010/10/hedgehog.jpg" alt="" title="hedgehog" width="300" height="129" class="alignnone size-medium wp-image-1753" /></a></div>
<p>A few months back, Kevin introduced us to the Hedgehog Programming language &#8211; <a href="http://www.youtube.com/watch?v=54Vv3Os3Ep4">(here&#8217;s the post if you missed it)</a>.  </p>
<p>The Palantir Finance programming language — Hedgehog as we know it — is an interpreted, statically typed, object-oriented language. With a syntax that’s based loosely on Java, it mixes roughly Java-style semantics and a few idiosyncrasies that make it a really interesting case study in language design. It’s built to be extremely efficient for batch operations on time series, which is the heavy lifting in financial analysis.</p>
<p>In this video, Eugene and Dave, two of the engineers that work on the language and platform features needed to support it, give a talk that goes into a number of areas around the Hedgehog language, including why we needed to build a language, how it makes the platform more powerful, how we built dev tools into the UI to make debugging easier, and a bunch of the nitty-gritty features that go into the strange (but fitting) beast that is the Hedgehog Language.</p>
<p><iframe title="YouTube video player" width="640" height="510" src="http://www.youtube.com/embed/54Vv3Os3Ep4" frameborder="0" allowfullscreen></iframe></p>
<p>As a final note: this is one of the things that I love about working at Palantir Technologies.  We study a problem pretty hard before we decide that we need to re-invent the wheel &#8211; and then when we do, we go all out.  It&#8217;s one of the benefits of working with the incredibly talented and motivated folks here.  When someone says, &#8220;well, we need to build a programming language.  No, we&#8217;re sure,&#8221; we just roll up our sleeves and do it.  We can add it to the list of: <a href="http://blog.palantir.com/2009/02/23/palantir-monitoring-server-where-build-beats-buy/">JMX monitoring system</a>, <a href="http://blog.palantir.com/2009/08/13/palantir-search-with-a-twist-part-one-memory-efficiency/">refined Lucene search engine</a>, <a href="http://blog.palantir.com/2011/04/15/inside-horizon-interactive-analysis-at-cloud-scale/">speeding up Map-Reduce-like systems to interactive time</a>, and <a href="http://www.palantirtech.com/government/analysis-blog/isr">implementing our own GIS platform</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2011/06/06/tech-talk-the-hedgehog-programming-language/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Decorator Pattern: Implementing decorators using forwarding classes</title>
		<link>http://blog.palantirtech.com/2011/03/01/decorator-pattern-implementing-decorators-using-forwarding-classes/</link>
		<comments>http://blog.palantirtech.com/2011/03/01/decorator-pattern-implementing-decorators-using-forwarding-classes/#comments</comments>
		<pubDate>Tue, 01 Mar 2011 21:29:18 +0000</pubDate>
		<dc:creator>Allen Chang</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[tips and tricks]]></category>

		<guid isPermaLink="false">https://wp-admin-techblog.yojoe.local/?p=1821</guid>
		<description><![CDATA[A forwarding class is an abstract base class which makes it easier to implement decorators for a particular interface. A forwarding class simply forwards all calls it receives to some delegate; a decorator can then be implemented by extending the forwarding class and overriding the relevant methods. Here&#8217;s an example implementation of a forwarding class [...]]]></description>
			<content:encoded><![CDATA[<p>A forwarding class is an abstract base class which makes it easier to implement decorators for a particular interface. A forwarding class simply forwards all calls it receives to some delegate; a <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Decorator_pattern">decorator</a> can then be implemented by extending the forwarding class and overriding the relevant methods. Here&#8217;s an example implementation of a forwarding class for the <code>Collection&lt;E&gt;</code> interface (from <a href="https://code.google.com/p/google-collections/">Google Collections</a>):</p>
<pre class="brush: java; title: ; notranslate">
public abstract class ForwardingCollection&lt;E&gt; extends ForwardingObject implements Collection&lt;E&gt; {
  @Override protected abstract Collection&lt;E&gt; delegate();

  public boolean add(E element) {
    return delegate().add(element);
  }

  // ... (more overridden methods) ...
}
</pre>
<p>Things worth noting:</p>
<ul>
<li>This class overrides the <code>delegate()</code> method to return a <i>more specific</i> type.</li>
<li>The class contains no instance variables and <i>does not</i> have to be marked <a href="http://download.oracle.com/javase/1.5.0/docs/api/java/io/Serializable.html">Serializable</a>.</li>
</ul>
<p>Now, here&#8217;s an example of how to implement a decorator using a forwarding class (also from Google Collections):</p>
<pre class="brush: java; title: ; notranslate">
  public class ConstrainedCollection&lt;E&gt; extends ForwardingCollection&lt;E&gt; {
    private final Collection&lt;E&gt; delegate;
    private final Constraint&lt;? super E&gt; constraint;

    public ConstrainedCollection(Collection&lt;E&gt; delegate, Constraint&lt;? super E&gt; constraint) {
      this.delegate = checkNotNull(delegate);
      this.constraint = checkNotNull(constraint);
    }
    @Override protected Collection&lt;E&gt; delegate() {
      return delegate;
    }
    @Override public boolean add(E element) {
      constraint.checkElement(element);
      return super.add(element);
    }
    @Override public boolean addAll(Collection&lt;? extends E&gt; elements) {
      return super.addAll(checkElements(elements, constraint));
    }
  }
</pre>
<p>This class implements a collection that checks a constraint on all elements added to the collection. Note that writing the decorator is easy given the <a href="https://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/ForwardingObject.html">forwarding class</a> &#8212; only the methods relevant to the decorator need to be overridden.</p>
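<p>If you want to try the pattern without pulling in Google Collections, the sketch below rebuilds it on the standard library alone. <code>SimpleForwardingCollection</code> and <code>NonNullCollection</code> are names we made up for this example; they are stand-ins for, not copies of, the library classes shown above.</p>

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

// Stdlib-only stand-in for a forwarding class: every call goes straight to delegate().
abstract class SimpleForwardingCollection<E> implements Collection<E> {
    protected abstract Collection<E> delegate();

    @Override public int size() { return delegate().size(); }
    @Override public boolean isEmpty() { return delegate().isEmpty(); }
    @Override public boolean contains(Object o) { return delegate().contains(o); }
    @Override public Iterator<E> iterator() { return delegate().iterator(); }
    @Override public Object[] toArray() { return delegate().toArray(); }
    @Override public <T> T[] toArray(T[] a) { return delegate().toArray(a); }
    @Override public boolean add(E e) { return delegate().add(e); }
    @Override public boolean remove(Object o) { return delegate().remove(o); }
    @Override public boolean containsAll(Collection<?> c) { return delegate().containsAll(c); }
    @Override public boolean addAll(Collection<? extends E> c) { return delegate().addAll(c); }
    @Override public boolean removeAll(Collection<?> c) { return delegate().removeAll(c); }
    @Override public boolean retainAll(Collection<?> c) { return delegate().retainAll(c); }
    @Override public void clear() { delegate().clear(); }
}

// Decorator: rejects null elements and leaves everything else to the forwarding class.
final class NonNullCollection<E> extends SimpleForwardingCollection<E> {
    private final Collection<E> delegate;

    NonNullCollection(Collection<E> delegate) {
        this.delegate = Objects.requireNonNull(delegate);
    }

    @Override protected Collection<E> delegate() { return delegate; }

    @Override public boolean add(E element) {
        return super.add(Objects.requireNonNull(element, "null elements are not allowed"));
    }

    @Override public boolean addAll(Collection<? extends E> elements) {
        // Copy first so the elements checked are exactly the elements added.
        List<E> copy = new ArrayList<>(elements);
        for (E element : copy) {
            Objects.requireNonNull(element, "null elements are not allowed");
        }
        return super.addAll(copy);
    }
}
```

<p>Wrapping any collection, for example <code>new NonNullCollection&lt;String&gt;(new ArrayList&lt;String&gt;())</code>, gives a collection that throws <code>NullPointerException</code> on a null <code>add</code> and behaves like the delegate otherwise. Only the two adding methods had to be overridden.</p>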
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2011/03/01/decorator-pattern-implementing-decorators-using-forwarding-classes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Hedgehog Programming Language</title>
		<link>http://blog.palantirtech.com/2011/02/02/hhlang/</link>
		<comments>http://blog.palantirtech.com/2011/02/02/hhlang/#comments</comments>
		<pubDate>Wed, 02 Feb 2011 07:56:49 +0000</pubDate>
		<dc:creator>Kevin Simler</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[palantirtech]]></category>
		<category><![CDATA[problemspace - finance]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[user interface]]></category>

		<guid isPermaLink="false">https://wp-admin-techblog.yojoe.local/?p=1759</guid>
		<description><![CDATA[One thing about being a developer on the Palantir Finance product that doesn&#8217;t get nearly enough publicity is the fact that we have our own programming language. I&#8217;m pretty excited about it so let me repeat, with emphasis: we have our own programming language. Yeah, it&#8217;s awesome. All those late hours you spent in the [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; width: 300px; margin-bottom: 15px; margin-left: 15px'><a target='new' href='http://www.pfinance.com/'><img src="http://blog.palantir.com/wp-content/uploads/2010/10/hedgehog.jpg" alt="" title="hedgehog" width="300" height="129" class="alignnone size-medium wp-image-1753" /></a></div>
<p>One thing about being a developer on the Palantir Finance product that doesn&#8217;t get nearly enough publicity is the fact that we have our own programming language.  I&#8217;m pretty excited about it so let me repeat, with emphasis:  <em><strong>we have our own programming language</strong></em>.  Yeah, it&#8217;s awesome.  All those late hours you spent in the lab working on your final project in compilers:  turns out they&#8217;re actually good for something other than getting into grad school.</p>
<p>Building this language ourselves &#8212; as opposed to, say, using an existing language that already just works &#8212; wasn&#8217;t an easy decision.  In fact, it wasn&#8217;t even a single decision.  We racked our collective brain dozens of times trying to think of a better approach.  But every which way we sliced it, the problems we needed to solve always pointed to building our own language.  I still question this decision sometimes, but on the whole I&#8217;m very happy with how things have turned out.</p>
<p><span id="more-1759"></span></p>
<p>The Palantir Finance programming language &#8212; Hedgehog as we know it &#8212; is an interpreted, statically typed, object-oriented language. With a syntax that&#8217;s based loosely on Java, it mixes roughly Java-style semantics and a few idiosyncrasies that make it a really interesting case study in language design.  It&#8217;s built to be extremely efficient for batch operations on time series, which is the heavy lifting in financial analysis.  It also allows you to dynamically add methods to a class from outside the class itself (conceptually similar to <a href="http://juixe.com/techknow/index.php/2006/06/15/mixins-in-ruby/">Ruby&#8217;s Mixins</a>) &mdash; you define the function and its input type, and when you type the dot operator, your new method is auto-completed alongside all the &#8220;native&#8221; methods.  Hedgehog also has a vast number of effectively global constants: all the stocks, bonds, and other financial instruments that are essential to the user experience, but that make for quite a design challenge.</p>
<p>I&#8217;m not a language guy myself, so instead of continuing to geek out over the core language features, I want to geek out about an emergent property that&#8217;s truly unique to the Hedgehog language.  But first I&#8217;m going to back up and talk about something else that&#8217;s really important to us at Palantir:  user experience. (I&#8217;ll get back to languages I promise.)</p>
<p>There&#8217;s a UX principle that says your interface should be &#8220;low threshold, high ceiling&#8221;. That is, it should be easy for the user to get started, but also able to do powerful things.  This is actually a corollary of a more general principle:  that your interface should strive for the <strong>optimal learning curve</strong>.  My first CS professor explained this with a set of three diagrams, each representing one of the major OS families.  I don&#8217;t remember exactly how he drew these diagrams at the time, but an updated version of them might look like this:</p>
<div style='text-align: center'><img style='margin-auto' src="http://blog.palantir.com/wp-content/uploads/2010/10/learning_curves.png" alt="" title="learning_curves" class="alignnone size-medium wp-image-1748" /></div>
<p>The x-axis of each curve represents &#8220;wizardry,&#8221; a measure of the user&#8217;s technical sophistication.  The y-axis represents the power of the system &#8212; how much the user can accomplish at a given level of wizardry.</p>
<p>The best of the three curves, my prof argued, was the third curve.  The first learning curve is great for providing incentives to learn.  Each unit of effort spent to increase your wizardry yields an appropriate amount of reward or power.  The drawback is that it&#8217;s hard for new users to do anything useful; its reward threshold is too high.  The middle curve has a lower threshold and is better for novice users, but will frustrate an intermediate user because of the great plateau in the middle.  (This might represent a place where the GUI isn&#8217;t powerful enough for the tasks you want to accomplish but scripting is still too difficult, leaving no way to express your commands.)  The third curve, however, is the best of both worlds:  a low threshold and a smooth trajectory to the top.</p>
<p>Now let&#8217;s apply this back to our topic at hand, programming languages.  Specifically, what does the learning curve look like for learning a first language?  (Once you&#8217;ve learned one, of course, the rest come pretty easily.)</p>
<div style='float: right; width: 469px; margin-bottom: 15px; margin-left: 15px'><img src="http://blog.palantir.com/wp-content/uploads/2010/10/learning_curves_languages.png" alt="" title="learning_curves_languages" class="alignnone size-medium wp-image-1749" /></div>
<p>If your experience of learning to program was anything like mine, the first few projects in your first language were <em>painful</em>.  You could sense the power further up the curve &#8212; it&#8217;s what convinced you to stick with CS &#8212; but simple tasks took a lot more effort than they should have, at the beginning.</p>
<p>Hedgehog on the other hand &#8212; our little homebrew that will someday have its own Wikipedia page &#8212; has the smoothest learning curve I&#8217;ve ever seen in a programming language.  That&#8217;s the emergent property I wanted to talk about, because it&#8217;s a thing of beauty.  You can get started with Hedgehog right away and accomplish quite a bit without even knowing that you&#8217;re &#8220;programming,&#8221; and the slope on the curve stays relatively constant throughout your trajectory.</p>
<p>We didn&#8217;t realize it at the time, but we were probably destined to create a low-threshold, high-ceiling language with a smooth learning curve, due to the nature of our user base.  Financial analysts are impatient, and they still need to perform many kinds of complicated analysis.  They definitely don&#8217;t have the time or inclination to spend a semester learning how to program.  The solution to their problem is Hedgehog.</p>
<p>Allow me to illustrate with one of the earliest things a user might type into the expression bar:</p>
<div style='text-align: center'><img style='margin-auto' src="http://blog.palantir.com/wp-content/uploads/2010/10/ibm.png" alt="" title="ibm" class="alignnone size-medium wp-image-1746" /></div>
<p>And that&#8217;s it.  The user types a ticker symbol and he gets a chart of IBM&#8217;s stock price.  At no point did he have to wonder about variables or types or #includes.  This experience is so <a href="http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/">frictionless</a> he probably doesn&#8217;t even realize he&#8217;s writing code in a programming language.  He just starts with what he knows, and the system gives him what he wants.</p>
<p>It starts to get interesting as you move further up the curve.  Take this user input:</p>
<div style='text-align: center'><img style='margin-auto' src="http://blog.palantir.com/wp-content/uploads/2010/10/ibm_volume.png" alt="" title="ibm_volume" class="alignnone size-medium wp-image-1747" /></div>
<p>Of course that innocent dot between &#8220;IBM&#8221; and &#8220;volume&#8221; means a method invocation to anyone who&#8217;s familiar with C++ or Java.  But to a new Palantir Finance user it simply means, &#8220;Let me access all the types of data associated with IBM.&#8221;  Conceptually painless.</p>
<p>Or how about this one?</p>
<div style='text-align: center'><img style='margin-auto' src="http://blog.palantir.com/wp-content/uploads/2010/10/histogram.png" alt="" title="histogram" class="alignnone size-medium wp-image-1745" /></div>
<p>The <code>volume/1000</code> expression is an anonymous method acting in the scope of a Stock object; it&#8217;s syntactic sugar for <code>return this.volume()/1000;</code>.  But by allowing the user to strip away all the unnecessary syntax, we make learning the language that much easier.</p>
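<p>For readers who think in Java, here is a rough sketch of what that sugar stands for. The <code>Stock</code> class, its fields, and the volume figure are hypothetical stand-ins for illustration only, not Hedgehog or Palantir Finance code.</p>

```java
import java.util.function.ToDoubleFunction;

// Hypothetical stand-in for a Stock object; not Palantir's actual class.
final class Stock {
    private final String ticker;
    private final double volume;

    Stock(String ticker, double volume) {
        this.ticker = ticker;
        this.volume = volume;
    }

    String ticker() { return ticker; }

    double volume() { return volume; }

    // The explicit form: a named method with "return" and "this" spelled out.
    double volumeInThousands() {
        return this.volume() / 1000;
    }
}

final class ImplicitScopeSketch {
    // The closest Java gets to the bare expression: an anonymous function
    // that receives the Stock it is evaluated against.
    static final ToDoubleFunction<Stock> VOLUME_IN_THOUSANDS = s -> s.volume() / 1000;

    public static void main(String[] args) {
        Stock ibm = new Stock("IBM", 5_250_000); // made-up volume
        System.out.println(VOLUME_IN_THOUSANDS.applyAsDouble(ibm));
        System.out.println(ibm.volumeInThousands());
    }
}
```

<p>Even the lambda form still asks the author to name a parameter and call a method on it; the Hedgehog expression drops both.</p>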
<p>I could go on tracing the curve here (I&#8217;ve only scratched the surface), but I hope I&#8217;ve made my point: we coax new users into writing code by making it look as much as possible like performing operations that they already intuitively understand.  This is one of the benefits of creating a domain-specific language &mdash; we got the richness of the domain for free, and all the understanding that comes with it &mdash; and then we went above and beyond the simplification of a traditional <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Domain-specific_language">DSL</a> to really pare down the complexity of the language for novice users.</p>
<p>From simple beginnings like the ones I&#8217;ve shown here, it doesn&#8217;t take our users long at all to cross the threshold to more intermediate-level work, such as chaining function calls together or creating their own methods.  As far as the high ceiling goes, we&#8217;re still working on it, but the language is currently capable of producing not only a <a href="http://en.wikipedia.org/wiki/Quine_(computing)">quine</a>, as one of our candidates showed us (yes, we ended up hiring him), but also code that can generate studies like the one below:</p>
<p><a href="http://blog.palantir.com/wp-content/uploads/2010/10/dashboard.png"><img src="http://blog.palantir.com/wp-content/uploads/2010/10/dashboard.png" alt="" title="dashboard" width="100%" class="alignnone size-medium wp-image-1744" /></a></p>
<p>So Hedgehog has a low threshold and a smooth learning curve, and the ceiling is high enough that our users can do some really serious information processing with it &#8212; tasks that would make their other tools break down and cry.  But there&#8217;s still a lot of interesting work for us to do, especially in pushing the language&#8217;s ceiling higher (developing better interactive debugging; working with large objects efficiently) &mdash; and as always, making it <em>faster</em>.</p>
<p><em>If you&#8217;d like to see the Hedgehog Programming Language in action, you can sign up for an account at <a href='http://joyride.pfinance.com/'>Palantir JoyRide</a>, the <a href="http://www.pfinance.com/">Palantir Finance</a> public demo.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2011/02/02/hhlang/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>A rigorous friction model for human-computer symbiosis</title>
		<link>http://blog.palantirtech.com/2010/06/02/a-rigorous-friction-model-for-human-computer-symbiosis/</link>
		<comments>http://blog.palantirtech.com/2010/06/02/a-rigorous-friction-model-for-human-computer-symbiosis/#comments</comments>
		<pubDate>Thu, 03 Jun 2010 03:18:52 +0000</pubDate>
		<dc:creator>Asher Sinensky</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[javatech]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[problemspace - finance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[softwarephilosophy]]></category>
		<category><![CDATA[user interface]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1344</guid>
		<description><![CDATA[This is a response to Ari&#8217;s awesome post on human-computer symbiosis. Ari and I were chatting about the equation he developed and I was wondering if there were some further refinements that are possible&#8230; let&#8217;s take a look: We are attempting to understand the total analytic capability for a given task a of a human-computer [...]]]></description>
			<content:encoded><![CDATA[<div style='text-align: center; float: right; margin-left: 15px; margin-right: 15px'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/graph.png" alt="" width="300"/>
</div>
<p>This is a response to <a href="http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/">Ari&#8217;s awesome post on human-computer symbiosis</a>. Ari and I were chatting about the equation he developed and I was wondering if there were some further refinements that are possible&#8230; let&#8217;s take a look:</p>
<p>We are attempting to understand the total analytic capability for a given task <strong><em>a</em></strong> of a human-computer team. Analytic capability in this case probably means:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq1.png" alt="eq1"/>(1)
</div>
<p>Where <strong><em>A</em></strong> is the answer to the analytic problem in question and <strong><em>t<sub>A</sub></em></strong> is the time needed to arrive at the answer based on the inputs available. In the case of chess, <strong><em>A</em></strong> could be the optimum next move given all previous information and <strong><em>t<sub>A</sub></em></strong> would be how long it takes to decide on this move.</p>
<p>Read on for a look at how this generalizes in human-computer symbiotic systems.<br />
<span id="more-1344"></span></p>
<p>In the case of the human-computer team, we know that <strong><em>a </em></strong>is going to be a function of both the human&#8217;s analytical capability <strong><em>h</em></strong> and the computer&#8217;s analytical capability <strong><em>c</em></strong> (where both <strong><em>h</em></strong> and <strong><em>c</em></strong> have units of answers/time). In the limit case we know that:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq2.png" alt="eq2"/>(2)
</div>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq3.png" alt="eq3"/>(3)
</div>
<p>Or in plain English, if there is no human present, the total analytic capability is simply the analytic capability of the computer. So the naïve solution would be that:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq4.png" alt="eq4"/>(4)
</div>
<p>(4) clearly meets the limiting cases described in (2) and (3). Kasparov noticed a mixing function where the ability of the human and computer to work together becomes the dominant term &mdash; we might call this the mixing capability for the given task or <strong><em>m</em></strong>. Including this phenomenon, the total analytic capability (4) would be re-defined as:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq5.png" alt="eq5"/>(5)
</div>
<p>where <strong><em>m</em></strong> has the property that:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq6.png" alt="eq6"/>(6)
</div>
<p>Thus maintaining the limits expressed in (2) and (3) and adhering to the observation that if there is no human or computer component then there will be no mixing advantage. A naïve solution to this constraint would be simple linear mixing:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq7.png" alt="eq7"/>  (7)
</div>
<p>where <strong><em>M</em></strong> (units of time per answer) is the mixing efficiency and will be primarily based on the type of task being solved &mdash; some analytical tasks lend themselves to a combined process more than others (for example, multiplying 20 digit numbers does not really benefit from the intuition of a human so the ability of a human and computer to perform this task is merely their additive ability). </p>
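<p>To make the limiting behavior concrete, here is a small numeric sketch. It assumes the forms that the prose implies for the equations so far: (4) is the plain sum of <em>h</em> and <em>c</em>, (5) adds the mixing term <em>m</em>, and (7) sets <em>m</em> to the product of <em>M</em>, <em>h</em> and <em>c</em>. The class name, method names, and sample values are illustrative only.</p>

```java
// Sketch of equations (4), (5) and (7) as reconstructed from the prose.
final class MixingModelSketch {
    // Equation (4): the naive, purely additive capability (answers per unit time).
    static double additive(double h, double c) {
        return h + c;
    }

    // Equation (7): linear mixing. bigM has units of time per answer,
    // so bigM * h * c comes out in answers per unit time.
    static double mixing(double bigM, double h, double c) {
        return bigM * h * c;
    }

    // Equation (5): total capability = additive part + mixing term.
    static double total(double bigM, double h, double c) {
        return additive(h, c) + mixing(bigM, h, c);
    }

    public static void main(String[] args) {
        double h = 2.0, c = 10.0, bigM = 0.5; // illustrative values
        System.out.println(total(bigM, h, c));
        System.out.println(total(bigM, 0.0, c)); // no human: reduces to c
        System.out.println(total(0.0, h, c));    // no synergy: purely additive
    }
}
```

<p>With either <em>h</em> or <em>c</em> set to zero the mixing term vanishes, which is the property required by (6); with <em>M</em> set to zero the model falls back to the additive case described for tasks like multiplying large numbers.</p>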
<p>What Kasparov noticed is that the mixing was primarily based on the quality of the process rather than the analytical power of either the human or computer separately. This seems to imply that we must somehow account for the fact that the quality of the human-computer interface is responsible for the quality of the mixing. This can be modeled as a unitless friction of interaction <strong><em>f<sub>i</sub></em></strong> that impedes the ability of the human and computer to work together. </p>
<p>Equation (7) can thus be re-written as:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq8.png" alt="eq8"/>(8)
</div>
<p>In this case, the maximum value for the mixing capability is realized when the friction of interaction goes to zero. This mixing capability is the same as the equation Ari developed (less the coefficient which is necessary to maintain consistent units throughout).</p>
<p>We can now re-write our analytic capability in (5) as:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq9.png" alt="eq9"/>(9)
</div>
<p>Below, see a plot of this function over a range of values for <strong><em>h</em></strong>, <strong><em>c</em></strong> and <strong><em>f<sub>i</sub></em></strong>:</p>
<div style='text-align: center; margin: auto; margin-bottom: 1em;'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/graph.png" alt=""/>
</div>
<p>As can clearly be seen from this functional plot (note the vertical scale), the effect of interface friction dominates over the other terms whenever both the human and computer can make important contributions to the task at hand. The conclusion can be drawn that the most effective way to solve analytical problems is to minimize the friction of the human-computer interface; or to put it another way: optimal analytical systems are those that are built specifically to maximize the ability of the human to leverage the ability of the computer.</p>
<p>I am certain there is still the possibility for further refinement, for example:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq10a.png" alt="eq10a"/>(10)
</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2010/06/02/a-rigorous-friction-model-for-human-computer-symbiosis/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Fun with jMock</title>
		<link>http://blog.palantirtech.com/2009/11/22/fun-with-jmock/</link>
		<comments>http://blog.palantirtech.com/2009/11/22/fun-with-jmock/#comments</comments>
		<pubDate>Sun, 22 Nov 2009 21:15:08 +0000</pubDate>
		<dc:creator>Steve Downing</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[development process]]></category>
		<category><![CDATA[javatech]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[tips and tricks]]></category>
		<category><![CDATA[unit testing]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1274</guid>
		<description><![CDATA[Here at Palantir, a lot of our automatic tests are full-chain tests. A backend server is fired up, client code runs against it, and everything runs much like a production environment. This makes intuitive sense because it’s a faithful approximation of how the system will run in the field. However, there are some disadvantages to [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; width: 175px;'><a href='http://www.jmock.org/'><img src='http://www.jmock.org/logo.png' style='background-color: #000066; padding: 10px'/></a></div>
<p>Here at Palantir, a lot of our automatic tests are full-chain tests. A backend server is fired up, client code runs against it, and everything runs much like a production environment. This makes intuitive sense because it’s a faithful approximation of how the system will run in the field.</p>
<p>However, there are some disadvantages to this:</p>
<ul>
<li>Full-chain tests don’t always localize the problem. Tests on a client class might fail even if it was the service that behaved incorrectly.
</li>
<li>These full-chain tests are relatively slow. Client code is running against an actual remote service. If a client is being tested, the server code still has to do work, sometimes a lot of work, even if that isn’t the focus of the test.</li>
<li>The constraints of the test are loose. Full-chain tests can mostly only see whether the operation finished correctly. It’s much harder to figure out whether the operation was done efficiently and without making unnecessary service calls.</li>
<li>There’s very little setup flexibility. If you want an RPC to return a specific value, you have little choice but to have your test get the service into a state where it can return that value. This is easy in some cases, but prohibitively difficult in others.</li>
<li>Client tests are forced to share any non-determinism leaked from the service. For example, under real conditions, a request to call A might respond before call B, and sometimes the other way around. This can result in flaky tests or tests that don’t always simulate the conditions you want to exercise.</li>
</ul>
<p>What’s to be done? Fortunately, there’s an option that handles these cases elegantly. We also test with <a href="http://www.jmock.org/">jMock</a>, a library that dynamically generates mock objects from arbitrary interfaces. These mock objects can be configured to check that particular methods are called with particular inputs a particular number of times, and then give prescribed responses.</p>
<p>Hit the link to see a concrete example of jMock in action.<br />
<span id="more-1274"></span></p>
<h2>jMock in action</h2>
<p>Let&#8217;s say I want to test my object viewer page in Palantir Web, but I don’t want to fire up a dispatch server at all. First, I create my mock service object.</p>
<pre class="brush: java; title: ; notranslate">
Mockery context = new Mockery();
final PalantirService service = context.mock(PalantirService.class);
</pre>
<p>Then, I set the expectations of my mock object. In this case, I want to tell my mock object to expect a call to PalantirService.getObject() and PalantirService.getDataSources(). getObject() will return a specific object. Any call made to the service apart from these will make the test fail.</p>
<pre class="brush: java; title: ; notranslate">
context.checking(new Expectations() {{
        oneOf(service).getObject(realm.getId(), myObject.getId());
        will(returnValue(myObject));
        oneOf(service).getDataSources(myObject.getDataSources());
}});
</pre>
<p>Now, I create the object I want to test and inject the service.</p>
<pre class="brush: java; title: ; notranslate">
ObjectViewController controller = new ObjectViewController();
controller.setService(service);
</pre>
<p>And then we fire away.</p>
<pre class="brush: java; title: ; notranslate">
ModelMap model = new ModelMap();
controller.doGet(myObject.getId(), model);
</pre>
<p>Now that the controller (the class we’re exercising) has gone off and populated the model, we check to see that the model is populated correctly. Just like we would in any other test.</p>
<pre class="brush: java; title: ; notranslate">
assertEquals(myObject.getName(), model.get(&quot;objectName&quot;));
assertEquals(myObject, model.get(&quot;object&quot;));
</pre>
<p>But in addition, we also assert that the expectations specified above were satisfied.</p>
<pre class="brush: java; title: ; notranslate">
context.assertIsSatisfied();
</pre>
<p>Not only can we be sure that the right calls were made with the right parameters, but we can also be sure that no calls besides the expected calls were made. So the next time you want more speed or control over your tests, take a look at jMock or another framework like it. It’s a powerful tool in the effort to test your best!</p>
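<p>If you are curious how a library can generate a mock from an arbitrary interface at runtime, the sketch below shows the general idea using the JDK class <code>java.lang.reflect.Proxy</code>. This is not jMock and makes no claim about how jMock is implemented; <code>TinyMockery</code> and <code>ObjectService</code> are hypothetical names invented for this illustration, and the matching rules (method name plus exact arguments, each call allowed once) are deliberately simplistic.</p>

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TinyMockery {
    // One expected call: method name, exact arguments, and the canned return value.
    private static final class Expectation {
        final String method;
        final Object[] args;
        final Object result;
        boolean satisfied;

        Expectation(String method, Object[] args, Object result) {
            this.method = method;
            this.args = args;
            this.result = result;
        }
    }

    private final List<Expectation> expectations = new ArrayList<>();

    // Register a call that must happen exactly once.
    void expectOnce(String method, Object result, Object... args) {
        expectations.add(new Expectation(method, args, result));
    }

    // Build a mock for any interface using a JDK dynamic proxy.
    <T> T mock(Class<T> type) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object[] actual = (args == null) ? new Object[0] : args;
            for (Expectation e : expectations) {
                if (!e.satisfied && e.method.equals(method.getName())
                        && Arrays.equals(e.args, actual)) {
                    e.satisfied = true;
                    return e.result;
                }
            }
            // Anything not explicitly expected fails the test immediately.
            throw new AssertionError("unexpected call: " + method.getName()
                    + Arrays.toString(actual));
        };
        return type.cast(Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler));
    }

    // Fail if any registered expectation was never exercised.
    void assertIsSatisfied() {
        for (Expectation e : expectations) {
            if (!e.satisfied) {
                throw new AssertionError("expected call never happened: " + e.method);
            }
        }
    }
}

// Hypothetical service interface used only for this sketch.
interface ObjectService {
    String getObjectName(long realmId, long objectId);
}
```

<p>A test would call <code>expectOnce("getObjectName", "Widget", 1L, 42L)</code>, hand the result of <code>mock(ObjectService.class)</code> to the class under test, and finish with <code>assertIsSatisfied()</code>. A real framework adds typed expectations, argument matchers, and call counts on top of this basic mechanism.</p>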
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/11/22/fun-with-jmock/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Palantir: search with a twist (part two: realtime indexing and security)</title>
		<link>http://blog.palantirtech.com/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/</link>
		<comments>http://blog.palantirtech.com/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 07:01:01 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1260</guid>
		<description><![CDATA[[A number of weeks ago, we published a post on the search technology used by Palantir. That post covered raising the memory efficiency of a couple of operations. This is part two of that series.] The most familiar use of search engines is to index documents made available on the Internet via the hypertext transfer [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 15px; margin-bottom: 15px'><img src='/wp-content/uploads/2009/08/200px-magnifying_glass_icon.png' alt='magnifying glass'/></div>
<p><em>[A number of weeks ago, we published <a href="http://blog.palantirtech.com/2009/08/13/palantir-search-with-a-twist-part-one-memory-efficiency/">a post on the search technology</a> used by Palantir.  That post covered raising the memory efficiency of a couple of operations.  This is part two of that series.]</em></p>
<p>The most familiar use of search engines is to index documents made available on the Internet via the <a href="http://www.ietf.org/rfc/rfc2616.txt">hypertext transfer protocol</a>. Forgotten names like <a href="http://en.wikipedia.org/wiki/AltaVista">AltaVista</a>, names not-yet-really-learned like <a href="http://web.archive.org/web/20040828134017/http://www.bing.com/">Bing</a>, and, of course, <a href="http://infolab.stanford.edu/~backrub/google.html">Google</a> come to mind.</p>
<p>This one, massive use case has a couple of properties that I&#8217;d like to highlight:</p>
<ul>
<li>Asynchronous indexing and querying &#8211; web search engines tend to use crawlers and indexers to build up an index of the web.  After each crawl is finished, the new index is brought online for use by the query engine.</li>
<li>Lack of access controls &#8211; all the data in the index is available to any query.  In fact, most queries are (from the standpoint of the index) completely anonymous.</li>
</ul>
<h3>Palantir: not a web search engine</h3>
<p>Search technology is just one part of what makes up a Palantir system.  For us, it&#8217;s a way to quickly retrieve Palantir objects in a Palantir system; it&#8217;s not the whole of the application.</p>
<p>I&#8217;d like to highlight a couple of differences from the <a href="http://en.wikipedia.org/wiki/Web_search_engine">web search engine</a> case.  A Palantir system needs the following properties:</p>
<ul>
<li>Realtime indexing and querying &#8211; we need information to be available immediately as it changes in the system.</li>
<li>Leak-proof access controls &#8211; we need the search engine to help us make sure that we don&#8217;t have information leaking across access control boundaries.</li>
</ul>
<p>Hit the link to read more about these topics.<br />
<span id="more-1260"></span></p>
<h2>Realtime indexing</h2>
<p>The Palantir platforms implement realtime indexing: as soon as an analyst changes an object in the system, it needs to be available to query. This could be a change to data in the object or a change to the security tags on the object.</p>
<p>From a programming perspective, this is pretty straightforward: a Palantir transaction will not commit until the search engine is finished indexing the new data.</p>
<p>From a search engine operational perspective, this induces some challenges.  Asynchronous indexing allows the search engines to bring online a highly optimized static form of the index.  Contrast that with realtime indexing, where every cycle spent optimizing the index is removing cycles from serving other queries and there is likely a human waiting for the optimizing process to finish.</p>
<p>When using the static index, a query only accesses one, optimized index file which then points to the documents containing the results.  However, as changes and additions are indexed into the system, there is a lot of overhead to merging them into the master index.</p>
<p>Instead of merging and optimizing on every change, Lucene can keep around a number of smaller indexes that hold all the fresh entries.  These are fixed-size append-only segments that are much cheaper to write to than the optimized and merged form of the index. So basically, these &#8216;dynamic&#8217; indexes are linear lists of single-document indexes.  When the search engine goes to run a query it has to follow this simple (yet expensive) algorithm:</p>
<ul>
<li>Query the static, merged index, accumulating results. <i>(this part is reasonably fast)</i></li>
<li>For each of the dynamic indexes:
<ul>
<li>Open the file, incurring IO overhead.</li>
<li>Query each single-document index and look for additional records or newer records that supersede one of the existing found results.</li>
</ul>
</li>
</ul>
<p>You can see how the overhead of this can quickly get pretty large as the number of dynamic indexes grows: it grows linearly with the number of newly indexed records.  Compare that with the optimized index, which should be close to constant time for any given query.</p>
<p>To get around this, the indexer will only allow a certain number of these dynamic indexes to accumulate before it kicks off a background merge job.  During the merge job, we take a noticeable performance hit, but by batching up the merge run we amortize the overhead away for an overall performance win.  This hybrid mode didn&#8217;t require us to write any new code, but just to tune Lucene to give us the performance profile we wanted.</p>
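<p>The trade-off is easier to see in a toy model. The sketch below is not Lucene code and does not use Lucene APIs; <code>HybridIndexSketch</code>, its fields, and the whitespace-style tokenizer are all assumptions made for illustration. It keeps a merged term-to-document map (the cheap lookup), an append-only list of freshly indexed single-document entries (the linear scan), and merges once the pending list exceeds a configurable threshold.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class HybridIndexSketch {
    // A freshly indexed document: its id and the set of terms it contains.
    private static final class Doc {
        final int id;
        final Set<String> terms;

        Doc(int id, String text) {
            this.id = id;
            this.terms = new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+")));
        }
    }

    private final int maxPending;
    private final Map<String, Set<Integer>> merged = new HashMap<>(); // term -> doc ids
    private final List<Doc> pending = new ArrayList<>();              // append-only
    private int mergeCount = 0;

    HybridIndexSketch(int maxPending) {
        this.maxPending = maxPending;
    }

    // Cheap write path: append, and merge only when too many entries pile up.
    void index(int id, String text) {
        pending.add(new Doc(id, text));
        if (pending.size() > maxPending) {
            merge();
        }
    }

    // Query the merged index first, then scan every pending entry in order.
    // A newer pending version of a document supersedes the merged result.
    Set<Integer> search(String term) {
        String t = term.toLowerCase();
        Set<Integer> hits = new TreeSet<>(merged.getOrDefault(t, Collections.emptySet()));
        for (Doc d : pending) {
            if (d.terms.contains(t)) {
                hits.add(d.id);
            } else {
                hits.remove(d.id);
            }
        }
        return hits;
    }

    // Expensive batch step: fold all pending entries into the merged index.
    private void merge() {
        for (Doc d : pending) {
            for (Set<Integer> postings : merged.values()) {
                postings.remove(d.id); // drop the superseded version
            }
            for (String term : d.terms) {
                merged.computeIfAbsent(term, k -> new HashSet<>()).add(d.id);
            }
        }
        pending.clear();
        mergeCount++;
    }

    int pendingCount() { return pending.size(); }

    int mergeCount() { return mergeCount; }
}
```

<p>Raising <code>maxPending</code> makes writes cheaper and merges rarer, but every query pays for a longer linear scan; lowering it does the opposite. Tuning that one number is the toy-model equivalent of the tuning described above.</p>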
<h2>Preventing Information Leaks</h2>
<p>The Palantir data platform has a fairly sophisticated security model baked in (see <em><a href="http://www.palantirtech.com/government/videos/whitevideos">The White Videos</a></em> for a more in-depth look at the security model).  One of the features that we have implemented is the ability to show a narrower view of an object based on the user&#8217;s permissions: the user only sees the slice of the data that they have been granted access to.  Part of the complexity in implementing this was that we can&#8217;t even hint that the other, hidden data exists at all.</p>
<p>Search engines rank their results by relevance, showing first the matches that they believe to be most relevant to the query.  One common way to make these relevance calculations is by comparing the length of the search term or phrase to the length of the term that it matched.  Consider the search term &#8216;king&#8217;: it will match the following phrases:</p>
<ul>
<li>&#8220;I&#8217;m the king of the world!&#8221;</li>
<li>&#8220;King salmon are often found in the Pacific Northwest and are also known as Chinook salmon.&#8221;</li>
<li>&#8220;Yes, my king.&#8221;</li>
</ul>
<p>Using a length-computed relevance, the phrase, &#8220;Yes, my king.&#8221; is the most relevant.</p>
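<p>Here is a minimal sketch of that kind of scoring, assuming the simplest possible metric: the length of the search term divided by the length of the phrase it matched. The class and method names are hypothetical, and the case-insensitive substring match is a simplification (it would also match &#8220;thinking&#8221;); real engines combine several signals.</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LengthRelevanceSketch {
    // Score = length of the search term / length of the matched phrase.
    // Returns 0 when the phrase does not contain the term at all.
    static double score(String term, String phrase) {
        if (term.isEmpty() || !phrase.toLowerCase().contains(term.toLowerCase())) {
            return 0.0;
        }
        return (double) term.length() / phrase.length();
    }

    // Matching phrases, most relevant (highest score) first.
    static List<String> rank(String term, List<String> phrases) {
        List<String> matches = new ArrayList<>();
        for (String p : phrases) {
            if (score(term, p) > 0.0) {
                matches.add(p);
            }
        }
        matches.sort(Comparator.comparingDouble((String p) -> score(term, p)).reversed());
        return matches;
    }
}
```

<p>Ranking the three phrases above for &#8216;king&#8217; puts the 13-character &#8220;Yes, my king.&#8221; first, because the term covers the largest fraction of it, and the long sentence about salmon last.</p>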
<p>Getting back to the Palantir object model: for each distinct set of permissions that an object has, we compute a different object label based on the properties that are visible to that particular slice.  These multiple titles all go into the search engine.  If we were to compute relevance based on the length of the phrase that matched, and the shortest match on the object is shorter than the match that is actually visible to us, we could return the object with a higher-than-obvious relevance.  If we were to do that, we&#8217;d be leaking information, namely that there&#8217;s data on this object that the user making the query is not privy to. (Note that filtering of objects that aren&#8217;t at all visible to the user is done in a higher layer  after the results have been accumulated and ranked by the search engine.)</p>
<p>Given this problem, there are two approaches one can take:</p>
<ol>
<li>Store all the information needed to decide which labels are visible to the user running the query and then use only the visible labels when calculating the relevance of a match. Note that this is a pretty expensive operation.</li>
<li>Don&#8217;t use the length of match to compute relevance. Note that skipping a relevance calculation is, obviously, a very cheap thing to do.</li>
</ol>
<p>Which do we do?  Both.</p>
<p>When matching against object labels, the length metric actually lets us discern between better and worse matches. So in that case, we incur the cost of this calculation in order to return higher quality results.</p>
<p>However, when matching against things like document bodies, the ratio of the size of the match to the size of the search term starts to have less meaning but still has the possibility of leaking information in the query results.  For fields like this, we turn off the relevance calculations based on length of match. The upshot is that we don&#8217;t have to store the permissions information in the index nor incur the cost of the permissions/views calculation for these fields.</p>
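<p>To illustrate the first approach, here is a small sketch of scoring an object against only the labels the querying user may see. The data model is a deliberate simplification and an assumption on my part: each label is guarded by a single permission tag, and relevance is just term length divided by label length. None of the names below come from the Palantir code base.</p>

```java
import java.util.List;
import java.util.Set;

final class VisibleLabelRelevanceSketch {
    // Hypothetical model: each label on an object is guarded by one permission tag.
    static final class Label {
        final String text;
        final String requiredPermission;

        Label(String text, String requiredPermission) {
            this.text = text;
            this.requiredPermission = requiredPermission;
        }
    }

    // Length-based relevance of a single label; 0 if the label does not match.
    static double lengthScore(String term, String label) {
        if (term.isEmpty() || !label.toLowerCase().contains(term.toLowerCase())) {
            return 0.0;
        }
        return (double) term.length() / label.length();
    }

    // Leak-prone: the best score over every label, visible or not.
    static double scoreAllLabels(String term, List<Label> labels) {
        double best = 0.0;
        for (Label l : labels) {
            best = Math.max(best, lengthScore(term, l.text));
        }
        return best;
    }

    // Leak-free: the best score over only the labels this user is allowed to see.
    static double scoreVisibleLabels(String term, List<Label> labels, Set<String> userPermissions) {
        double best = 0.0;
        for (Label l : labels) {
            if (userPermissions.contains(l.requiredPermission)) {
                best = Math.max(best, lengthScore(term, l.text));
            }
        }
        return best;
    }
}
```

<p>Suppose an object carries a restricted label &#8220;King&#8221; and a public label &#8220;King salmon fishing report&#8221;. Scoring against all labels gives a perfect match for &#8216;king&#8217;, which would float the object to the top for a user who can only see the longer label and thereby hint that something shorter is hidden. Scoring against visible labels only gives that user the lower score they would expect. The extra permission lookup per label is where the cost of this approach comes from.</p>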
<h2>A heartfelt thank you</h2>
<p>To be clear, this post highlights the ways in which our search code diverges from the main <a href="http://lucene.apache.org/java/docs/">Lucene</a> code base.  We&#8217;re huge fans of Lucene and have great respect for the developers that built and maintain what is probably the world&#8217;s greatest open-source search engine.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Palantir: search with a twist (part one: memory efficiency)</title>
		<link>http://blog.palantirtech.com/2009/08/13/palantir-search-with-a-twist-part-one-memory-efficiency/</link>
		<comments>http://blog.palantirtech.com/2009/08/13/palantir-search-with-a-twist-part-one-memory-efficiency/#comments</comments>
		<pubDate>Fri, 14 Aug 2009 07:53:59 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[software engineering]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1088</guid>
		<description><![CDATA[A Palantir cluster seamlessly integrates many pieces of proven technology. One of them is our customized version of the venerable Java search engine, Lucene. Search engine technology tends to be optimized for the common use case of indexing web documents (or similar information architectures) where you have a few search terms in each query and [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 15px; margin-bottom: 15px'><img src='/wp-content/uploads/2009/08/200px-magnifying_glass_icon.png' alt='magnifying glass'/></div>
<p>A Palantir cluster seamlessly integrates many pieces of proven technology.  One of them is our customized version of the venerable Java search engine, <a href="http://lucene.apache.org/java/docs/">Lucene</a>. Search engine technology tends to be optimized for the common use case of indexing web documents (or similar information architectures) where you have a few search terms in each query and many, many documents as results. We want to leverage the <a href="http://en.wikipedia.org/wiki/Inverted_index">inverted index</a> capabilities of Lucene, but our data access patterns are a bit different than the typical use case:  we need things like pervasive range-querying, different types of relevance, and dynamic views of the data based on security constraints. So in building our data platform, we&#8217;ve run into some interesting challenges that are pretty unique in the information retrieval realm, specifically:</p>
<ol>
<li>Raising memory efficiency</li>
<li>Real-time indexing</li>
<li>Preventing information leaks across access boundaries in an efficient manner</li>
</ol>
<p>I&#8217;ll cover (1) in this post and (2) and (3) in a <a href="https://wp-admin-techblog.yojoe.local/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/">later post</a>, due out in about two weeks. <i>(Note: part 2 is available <a href="https://wp-admin-techblog.yojoe.local/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/">here</a>)</i></p>
<p>Hit the link and we&#8217;ll delve into this topic.<br />
<span id="more-1088"></span></p>
<h2>Raising memory efficiency</h2>
<p>We&#8217;ve addressed the issue of resource constraints, generally, in our earlier post: <a href="http://blog.palantirtech.com/2009/05/22/bandwidth-isnt-cheap-disk-isnt-cheap-cpu-isnt-cheap/"><em>Bandwidth isn’t cheap. Disk isn’t cheap. CPU isn’t cheap.</em></a> In that post, we posited &#8220;RAM to the rescue&#8221;:</p>
<blockquote><p>
On the other hand, some things in a SCIF are comparatively cheap. We never use boxes with less than 32GB of memory, and, in fact, lots of sites use 128GB of memory. RAM requires negligible power and cooling, and compared to disk, it’s relatively simple to install. It’s also easy to reconfigure the setup to use the additional memory.</p></blockquote>
<p>While this is true, no matter how much RAM you buy, your users will find a way to use it all &#8212; search is no exception.  In many of our environments, the search processes share hardware with other processes in the Palantir cluster, so while the OS may have 128 GB of RAM available, the search process&#8217;s VM has substantially less available to it. Compare this to a cluster of dedicated search nodes, where each node will have indexes sized to fit specifically into the memory available.</p>
<p>The upshot is that we needed to modify parts of <a href="http://lucene.apache.org/java/docs/index.html">Lucene</a> to deal with tighter memory constraints than it was designed for.</p>
<h3>Priority queue results accumulation</h3>
<p>Most systems that implement search include some notion of paging through the results.  We use a multi-level paging system, with the search server maintaining a server-side page for each query and serving smaller client-facing pages from it.</p>
<p>Vanilla Lucene uses the following algorithm for accumulating search results:</p>
<ol>
<li>Load all matching results.</li>
<li>Sort by some relevance metric(s).</li>
<li>Return the top <i>n</i> results.</li>
</ol>
<p>The results are cached as a server-side page in case the client wants to load more than the first <em>n</em> results. You can see where this could run into trouble: if the total number of matching documents is high, that&#8217;s a lot of wasted RAM while we winnow it down to the size of the server page. So we use the following algorithm:</p>
<ol>
<li>Construct a <a href="http://en.wikipedia.org/wiki/Priority_queue">priority queue</a> of constrained size with priority computed using the chosen relevance metric</li>
<li>Stream through the results, inserting into the queue</li>
<li>Return the set of results in the priority queue</li>
</ol>
<p>Now we never need more RAM than the size of a server-side page to serve results.  The downside is that if the client wants more than one server-side page, we have to run the search &mdash; in its entirety &mdash; twice (ouch). To skip the results we already returned, the second run adjusts the priority queue to reject every result that ranked within the first page according to the relevance metric.</p>
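<p>The steps above can be sketched with <code>java.util.PriorityQueue</code>. This is a minimal illustration, not Lucene&#8217;s or Palantir&#8217;s actual code: the class name <code>BoundedTopK</code>, the method <code>topPage</code>, and the use of a bare score per result are all assumptions, and it assumes scores are distinct so that a score threshold is enough to skip an earlier page.</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch: keep only the best pageSize scores while streaming. */
final class BoundedTopK {

    /**
     * Streams through scores and returns the best pageSize of them, highest first.
     * Scores at or above skipAtOrAbove are ignored, which is how a later page
     * skips results served earlier (assumes distinct scores). Pass
     * Double.POSITIVE_INFINITY for the first page.
     */
    static List<Double> topPage(Iterable<Double> scores, int pageSize, double skipAtOrAbove) {
        if (pageSize <= 0) {
            return new ArrayList<Double>();
        }
        // Min-heap: the weakest retained result sits at the head, ready to be evicted.
        PriorityQueue<Double> heap = new PriorityQueue<Double>();
        for (double score : scores) {
            if (score >= skipAtOrAbove) {
                continue; // already served on an earlier page
            }
            if (heap.size() < pageSize) {
                heap.add(score);
            } else if (score > heap.peek()) {
                heap.poll();     // evict the current weakest result
                heap.add(score);
            }
        }
        // The heap never holds more than pageSize entries; sort them for display.
        List<Double> page = new ArrayList<Double>(heap);
        Collections.sort(page, Collections.reverseOrder());
        return page;
    }
}
```

<p>In this sketch, the second server-side page is fetched by calling <code>topPage</code> again over the full result stream with <code>skipAtOrAbove</code> set to the lowest score on the first page, which is the re-run of the search described above.</p>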
<h3>Using bitsets to optimize range queries</h3>
<p>A range query can return a result set of very high cardinality &ndash; a range is a very compact way of describing a large set of matching terms (even if they are discrete values, like dates).  One way to think about a range query of, say, <em>10 &lt;= age &lt;= 15</em>, is that it expands to <em>age = 10 OR age = 11 OR age = 12 OR age = 13 OR age = 14 OR age = 15</em>.  Rather than treat range queries in any special way, Lucene just does this expansion of the range and runs the query like a normal query.</p>
<div style='float: right; text-align: right; width: 315px; margin-top: 10px; margin-bottom: 10px;'><img src='/wp-content/uploads/2009/08/searchindexes1.png'/></div>
<p>Internally, for each term Lucene stores a list of metadata nodes, one per document that matches the term, ordered by document id.  The algorithm goes something like this:</p>
<ol>
<li>Open the document id lists for all matching terms</li>
<li>Walk the list pointers for each potential match such that you accumulate all the metadata for a given document.</li>
<li>Pass all this metadata up to the query processor, which:
<ol>
<li>Decides whether this document matches the overall query (remember that terms can be inverted).</li>
<li>Uses the term frequency taken from the metadata to calculate the relevance.</li>
</ol>
</li>
</ol>
<p>This structure and attendant algorithm has some nice properties:</p>
<ul>
<li>All documents are processed in a set order.</li>
<li>Everything is known about a document all at once.</li>
<li>It terminates in a single linear scan.</li>
</ul>
<p>&#8230; and has one very nasty property:</p>
<ul>
<li>All of the term value buckets that match the range must be open simultaneously.</li>
</ul>
<p>This is not a big deal for most English language queries, which contain only a few terms.  However, for large ranges and the like, there can be thousands or even millions of terms.</p>
<p>The semantics of range queries have an interesting feature: a document that matches the range twice is not more relevant than one that matches once. (Contrast this with a simple term query: multiple matches <b>do</b> indicate higher relevance). Being able to discard the accounting of how many times we match the range leads to a huge win:</p>
<ol>
<li>We only need a single bit to represent a match</li>
<li>We can process a single term value bucket at a time instead of holding all buckets open in memory.</li>
</ol>
<p>Our search engine accumulates range queries into bitset objects, allowing for a very compact representation of results. We need much less memory than we did before since we only load one term value bucket at a time.  And the algorithm is simpler: no more walking pointers or <em>O(n)</em> check before figuring out which pointer moves next.</p>
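<p>A minimal sketch of the bitset idea, using <code>java.util.BitSet</code>. The index here is a hypothetical in-memory stand-in (a sorted map from term value to the document ids that contain it), not Lucene&#8217;s on-disk structures, and the names <code>RangeBitsetSketch</code> and <code>matchRange</code> are made up for illustration.</p>

```java
import java.util.BitSet;
import java.util.Map;
import java.util.NavigableMap;

/** Hypothetical in-memory stand-in for an inverted index on one numeric field. */
final class RangeBitsetSketch {

    /**
     * ORs the postings of every term value in [low, high] into one BitSet,
     * visiting one term value bucket at a time. Bit i set means document i matches.
     */
    static BitSet matchRange(NavigableMap<Integer, int[]> postingsByValue, int low, int high) {
        BitSet matches = new BitSet();
        // subMap(low, true, high, true) is the inclusive view of the range.
        for (Map.Entry<Integer, int[]> bucket
                : postingsByValue.subMap(low, true, high, true).entrySet()) {
            for (int docId : bucket.getValue()) {
                matches.set(docId); // a second match for the same document changes nothing
            }
        }
        return matches;
    }
}
```

<p>Each bucket is read and then dropped before the next one is touched, so only the bitset itself (one bit per document) has to stay in memory for the whole scan.</p>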
<h2>The next episode</h2>
<p>Tune in for <em>Palantir: search with a twist (part two)</em> in a few weeks.  I&#8217;ll cover the following topics:</p>
<ul>
<li>Real-time indexing</li>
<li>Preventing information leaks across access boundaries in an efficient manner. (See Jason&#8217;s <a href='http://www.palantirtech.com/government/analysis-blog/mls'>Multi-Level Security</a> post over on the <a href="http://www.palantirtech.com/government/analysis-blog/">Palantir Government Analysis Blog</a> for a high-level look at why these features are important, and check out <a href="http://www.palantirtech.com/government/videos/whitevideos">Bob McGrew&#8217;s &#8220;Access Control Model&#8221; White Video</a> for an in-depth look at how we apply security to our object model.)
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/08/13/palantir-search-with-a-twist-part-one-memory-efficiency/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>JavaInvoke allows you to spawn additional Java VMs during testing</title>
		<link>http://blog.palantirtech.com/2009/07/28/javainvoke/</link>
		<comments>http://blog.palantirtech.com/2009/07/28/javainvoke/#comments</comments>
		<pubDate>Tue, 28 Jul 2009 22:00:30 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[enterprise software]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[tips and tricks]]></category>
		<category><![CDATA[unit testing]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=209</guid>
		<description><![CDATA[Here at Palantir we use test-driven development (or TDD for short). Integrated tools like Eclipse and JUnit simplify writing and running unit tests. However, once you need to test a broader swath of functionality, it&#8217;s time to write functional, integration, and system tests. While technically not &#8216;unit testing&#8217;, the testing framework that JUnit provides is [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; text-align: right; width: 298px'><img src="/wp-content/uploads/2009/07/junit.png" alt="junit success" width="288" height="194" /></div>
<p>Here at Palantir we use <a href="http://en.wikipedia.org/wiki/Test-driven_development">test-driven development (or TDD for short)</a>.  Integrated tools like <a href="http://www.eclipse.org/">Eclipse</a> and <a href="http://junit.org/">JUnit</a> simplify <a href="http://open.ncsu.edu/se/tutorials/junit/">writing and running unit tests</a>.  However, once you need to test a broader swath of functionality, it&#8217;s time to write <a href="http://www.ibm.com/developerworks/library/j-test.html#h1">functional</a>, <a href='http://en.wikipedia.org/wiki/Integration_testing'>integration</a>, and <a href='http://en.wikipedia.org/wiki/System_testing'>system</a> tests.  While technically not &#8216;unit testing&#8217;, the testing framework that JUnit provides is basically the same infrastructure that you want to leverage for writing these more involved types of tests.</p>
<p>When you&#8217;re developing enterprise software, functional testing often means getting your clients to talk to your servers.  For the main <a href="http://www.palantirtech.com/government">Palantir Government</a> product, we integrate the process of bringing the server up and down with the Ant scripts that run our automated unit tests: our testing tasks bring up the server, <a href="http://ant.apache.org/manual/OptionalTasks/junit.html">run the test suite</a>, and then kill the server. This works great and produces nice results.</p>
<p>When I started working on our authentication server, the pattern that we had used before didn&#8217;t work for me.  While the Palantir Government tests ran with a single, static configuration file, I needed to run the authentication server with multiple configurations in the course of running through all the different functional tests.  I determined that I needed a way to programmatically bring the server up and down for testing. In JUnit parlance, I needed a way to programmatically launch the server component as part of the setUp() method for my unit tests and stop it in my tearDown().</p>
<p>With my itch-to-scratch firmly in hand (or some other mixed metaphor), I set out to figure out how to invoke new Java processes from inside a unit test.  The solution I came up with (with source code and examples) is after the jump.<br />
<span id="more-209"></span></p>
<h2>The Six Ingredients</h2>
<p>So there are six ingredients that go into spawning a new VM:</p>
<ul>
<li>The classpath to use for the new VM</li>
<li>The name of the class to run</li>
<li>The directory to be used as the current directory for the process</li>
<li>The command line arguments to pass to the process</li>
<li>The set of Java system properties to use for this process</li>
<li>The environment to pass to the process</li>
</ul>
<p>Let&#8217;s look at each item individually.</p>
<h3>Classpath</h3>
<p>The classpath will tell the spawned VM where to load classes from.  In JavaInvoke, we use the existing classpath (from the spawning VM) as a starting point and then prepend any new entries to allow overriding the classpath for the spawned VM.</p>
<p>This takes a lot of the tedium out of having to figure out what to put in the classpath.  Most likely, you want something similar to what you already have, if not completely identical.</p>
<p>We get the classpath from <code>System.getProperty("java.class.path")</code> and add new entries by prepending them, using the value of  <code>File.pathSeparatorChar</code> as the entry delimiter.  Using <code>File.pathSeparatorChar</code> makes the code cross-platform friendly, since the path separator is &#8216;;&#8217; on Windows and &#8216;:&#8217; on Unix (Linux, Solaris, OS X, etc.).</p>
<p>Caveat: if you change the working directory and your original classpath was constructed using relative paths, you&#8217;ll probably have trouble getting anything to run (since your classpath will no longer point to the right locations).</p>
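<p>The prepending step might look roughly like the following. This is a sketch under assumptions, not the code from JavaInvoke.java: the class <code>ClasspathSketch</code> and both method names are hypothetical; only <code>System.getProperty("java.class.path")</code> and <code>File.pathSeparatorChar</code> come from the description above.</p>

```java
import java.io.File;

/** Hypothetical helper, not the actual JavaInvoke code. */
final class ClasspathSketch {

    /** Puts extra entries in front of an existing classpath so they are searched first. */
    static String prepend(String existingClasspath, String... extraEntries) {
        StringBuilder sb = new StringBuilder();
        for (String entry : extraEntries) {
            // ';' on Windows, ':' on Unix
            sb.append(entry).append(File.pathSeparatorChar);
        }
        return sb.append(existingClasspath).toString();
    }

    /** Same, starting from the classpath of the running (spawning) VM. */
    static String prependToCurrent(String... extraEntries) {
        return prepend(System.getProperty("java.class.path"), extraEntries);
    }
}
```

<p>Because the VM searches classpath entries in order, a class found in a prepended entry shadows the same class further down the inherited classpath.</p>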
<h3>Class name</h3>
<p>Pretty simple: what do you want to run in the spawned VM?  The class must have a <code>public static void main(String[] args)</code> defined, and it must be available for loading via the classpath.</p>
<h3>Working Directory</h3>
<p>If the spawned process should run somewhere other than the current working directory (CWD) of the running process, set the working directory and JavaInvoke will start the new process there.</p>
<h3>Command line arguments</h3>
<p>If the process needs any command line arguments, including VM options, specify them in a string array.  Note that not all of these arguments will necessarily make it to your main method, since the VM executable parses them first and removes the VM arguments, passing through only the program arguments.</p>
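<p>To make the split between VM options and program arguments concrete, here is a hedged sketch of how a command line for a new VM could be assembled. The class <code>JavaCommandLine</code> and its <code>build</code> method are hypothetical, and keeping VM options and program arguments in separate lists is a choice made for this illustration rather than a description of JavaInvoke&#8217;s interface. It also assumes the launcher lives at <code>bin/java</code> under the <code>java.home</code> system property.</p>

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of assembling the command line for a spawned VM. */
final class JavaCommandLine {

    static List<String> build(List<String> vmOptions, String mainClass, List<String> programArgs) {
        List<String> command = new ArrayList<String>();
        // java.home points at the runtime of the current VM; reuse its launcher.
        command.add(System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java");
        command.addAll(vmOptions);   // e.g. -Xmx256m; consumed by the VM itself
        command.add(mainClass);      // fully qualified name of the class with main()
        command.addAll(programArgs); // everything after the class name reaches main(String[] args)
        return command;
    }
}
```

<p>The resulting list can be handed directly to <code>new ProcessBuilder(command)</code>.</p>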
<h3>Java System Properties</h3>
<p>System properties can be used to control many aspects of how a VM runs.  You can set them programmatically in your code or you can set them on the command line by passing <em>-Dkey=value</em>.  Our JavaInvoke implementation will take a <code>Map&lt;String,String&gt;</code> of properties as a convenience argument; all it does is rewrite the map into the command line.</p>
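<p>Rewriting a property map into command line arguments could be as small as the following sketch. The names <code>SystemPropertyArgs</code> and <code>toVmArgs</code> are hypothetical and this is not the JavaInvoke source.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: turn a property map into -Dkey=value VM arguments. */
final class SystemPropertyArgs {

    static List<String> toVmArgs(Map<String, String> properties) {
        List<String> args = new ArrayList<String>();
        for (Map.Entry<String, String> property : properties.entrySet()) {
            // One list element per property; the order follows the map's iteration order.
            args.add("-D" + property.getKey() + "=" + property.getValue());
        }
        return args;
    }
}
```

<p>These arguments belong before the class name on the command line, alongside any other VM options.</p>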
<h3>Process environment</h3>
<p>This is an operating-system level construct.  This is the set of environment variables, also in a <code>Map&lt;String,String&gt;</code>, that you would like merged with the current environment.  This would be the place that you set things like LD_LIBRARY_PATH on Unix.</p>
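<p>With <code>ProcessBuilder</code>, merging is straightforward because <code>ProcessBuilder.environment()</code> returns a modifiable map that starts out as a copy of the current process environment. A minimal sketch follows; the class <code>EnvironmentSketch</code> and method <code>withEnvironment</code> are hypothetical names, not part of the example code.</p>

```java
import java.util.Map;

/** Hypothetical sketch: merge extra variables into a spawned process's environment. */
final class EnvironmentSketch {

    static ProcessBuilder withEnvironment(ProcessBuilder builder, Map<String, String> overrides) {
        // environment() is pre-populated from the current process, so putAll()
        // adds new variables and replaces any that already exist.
        builder.environment().putAll(overrides);
        return builder;
    }
}
```

<p>Changes made this way affect only processes started from that builder; the environment of the spawning VM is untouched.</p>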
<h2>Dealing with input and output</h2>
<p>So you might ask the question, &#8220;where does the output from the process go?&#8221;  Or more troubling, &#8220;How do I send the process some input?&#8221;  The Java <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Process.html">Process</a> object has methods to deal with this, allowing you to get streams that give you access to the input, output, and error streams of the spawned process. That API is straightforward to deal with, just like any other use of the java.io streams.</p>
<p>However, we want to make the typical case really easy: pulling the output from the spawned process back to the parent that spawned it.  To that end, we add into the mix a class called OutputPiper.  It fires up a thread that reads all the output of the spawned process, tags it with an identifier, and then writes it to the spawner&#8217;s stdout/stderr.</p>
<h3>OutputPiper</h3>
<p>(as extracted from <a href='/wp-content/uploads/vmspawner_html/com/palantir/blog/processspawner/ProcessSpawner.java.html'>ProcessSpawner.java</a>)</p>
<pre class="brush: java; title: ; notranslate">
	public static class OutputPiper extends Thread  {
		InputStream in;
		PrintStream out;
		String tag = null;

		public OutputPiper(String tag, InputStream in,PrintStream out) {
			this.in = in;
			this.out = out;
			this.tag = tag;
			// make sure that we don't keep the VM alive
			this.setDaemon(true);
			this.setName(&quot;OutputPiper-&quot; + tag);
			out.println(&quot;Starting output piper for tag: &quot; + tag);
			this.start();
		}

		@Override
		public void run() {
			try {
				BufferedReader reader = new BufferedReader(new InputStreamReader(in));
				String line = null;
				do {
					line = reader.readLine();
					if(line != null) {
						out.println(tag + &quot;: &quot; + line);
					}
				}while(line != null);
			}
			catch (Exception e) {
				// A read error means the stream is gone; fall through and let the piper exit.
			}
			out.println(&quot;Output piper exiting for tag: &quot; + tag);
		}

		public static OutputPiper createOutputPiper(String tag, InputStream in, PrintStream out) {
			OutputPiper rc = new OutputPiper(tag, in,out);
			return rc;
		}
	}
</pre>
<p>OutputPiper extends <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Thread.html">Thread</a> so that all the output will arrive back to the controlling process in a timely manner.  For each given process, we spawn off two OutputPipers, one for stdout and one for stderr, corresponding to the <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Process.html#getInputStream()">Process.getInputStream()</a> and the <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Process.html#getErrorStream()">Process.getErrorStream()</a>.</p>
<h2>ProcessSpawner &#038; JavaInvoke</h2>
<p>There are two key classes in the example:</p>
<ul>
<li><a href='/wp-content/uploads/vmspawner_html/com/palantir/blog/processspawner/ProcessSpawner.java.html'>ProcessSpawner.java</a> &#8211; Essentially a wrapper around <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/ProcessBuilder.html">ProcessBuilder</a>, a generic process spawner that makes it simple to invoke processes that use OutputPipers to forward their output back to their parent. This class allows you to specify the working directory, process environment, and command line for the process to be invoked.</li>
<li><a href='/wp-content/uploads/vmspawner_html/com/palantir/blog/processspawner/JavaInvoke.java.html'>JavaInvoke.java</a> &#8211; a specialized subclass of ProcessSpawner, this class makes spawning new VMs a piece of cake, doing the necessary translation for Java system properties, setting the proper classpath environment variable with potential overrides, and filling in the fully qualified class name to run.</li>
</ul>
<h2>The Example &#038; Source Code</h2>
<p>I&#8217;ve put together a running example that implements a trivial client and server in a JUnit test.  The setUp() method spawns the server and then the tests run the client code against the server, tearing it down after each test.  It&#8217;s available in the <a href='/wp-content/uploads/2009/07/PalantirVMSpawnerExample.zip'>PalantirVMSpawnerExample.zip</a> zip file.  Unzip it and run the <i>run.sh</i> or <i>run.bat</i> script as appropriate.  It should generate output that looks like this:</p>
<pre class="console">
-----------------------------------------------------
Starting test testAck
INFO [main] JavaInvoke - CLASSPATH=./lib/devblog-vmspawner.jar
INFO [main] ProcessSpawner - Build process spawner for the following command line:
INFO [main] ProcessSpawner - /home/pteng/java/i586/jdk1.5.0_14/jre/bin/java com.palantir.blog.processspawner.Server
Starting output piper for tag: server-stdout
Starting output piper for tag: server-stderr
server-stdout: Waiting for connection
server-stdout: Spawning socket handler
server-stdout: Waiting for connection
server-stdout: Spawning socket handler
server-stdout: Waiting for connection
server-stdout: [Socket Handler2]: Got message: some message
server-stdout: Spawning socket handler
server-stdout: Waiting for connection
server-stdout: [Socket Handler3]: Got message: SHUTDOWN
Output piper exiting for tag: server-stdout
Output piper exiting for tag: server-stderr
Finished test testAck
-----------------------------------------------------
-----------------------------------------------------
Starting test testShutdown
INFO [main] JavaInvoke - CLASSPATH=./lib/devblog-vmspawner.jar
INFO [main] ProcessSpawner - Build process spawner for the following command line:
INFO [main] ProcessSpawner - /home/pteng/java/i586/jdk1.5.0_14/jre/bin/java com.palantir.blog.processspawner.Server
Starting output piper for tag: server-stdout
Starting output piper for tag: server-stderr
server-stdout: Waiting for connection
server-stdout: Spawning socket handler
server-stdout: Waiting for connection
server-stdout: Spawning socket handler
server-stdout: Waiting for connection
server-stdout: Spawning socket handler
server-stdout: Waiting for connection
server-stdout: [Socket Handler3]: Got message: SHUTDOWN
Output piper exiting for tag: server-stdout
Output piper exiting for tag: server-stderr
Took 3 ms to send shutdown.
Took 335 ms for process to die.
Finished test testShutdown
-----------------------------------------------------
SUCCESS: all 2 tests passed
</pre>
<p>The source is included in the zip file, but if you want to look at it or link to it on the web, here are the classes involved:</p>
<ul>
<li><a href='/wp-content/uploads/vmspawner_html/com/palantir/blog/processspawner/Client.java.html'>Client.java</a></li>
<li><a href='/wp-content/uploads/vmspawner_html/com/palantir/blog/processspawner/Example.java.html'>Example.java</a></li>
<li><a href='/wp-content/uploads/vmspawner_html/com/palantir/blog/processspawner/JavaInvoke.java.html'>JavaInvoke.java</a></li>
<li><a href='/wp-content/uploads/vmspawner_html/com/palantir/blog/processspawner/ProcessSpawner.java.html'>ProcessSpawner.java</a></li>
<li><a href='/wp-content/uploads/vmspawner_html/com/palantir/blog/processspawner/Server.java.html'>Server.java</a></li>
<li><a href='/wp-content/uploads/vmspawner_html/com/palantir/blog/processspawner/ServerSpawningTest.java.html'>ServerSpawningTest.java</a></li>
</ul>
<p>And as an added bonus, there&#8217;s an Ant <i>build.xml</i> that will let you tweak and rebuild the demo yourself.</p>
<p>Comments and questions welcome.  Enjoy.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/07/28/javainvoke/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

