<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Palantir Technologies &#187; problemspace-government</title>
	<atom:link href="http:///category/problemspace-government/feed/" rel="self" type="application/rss+xml" />
	<link></link>
	<description>Articles from the Engineering Group at Palantir Technologies</description>
	<lastBuildDate>Wed, 14 Dec 2011 17:48:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Inside Horizon: interactive analysis at cloud scale</title>
		<link>http://blog.palantirtech.com/2011/04/15/inside-horizon-interactive-analysis-at-cloud-scale/</link>
		<comments>http://blog.palantirtech.com/2011/04/15/inside-horizon-interactive-analysis-at-cloud-scale/#comments</comments>
		<pubDate>Fri, 15 Apr 2011 19:04:46 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[distributed systems]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[enterprise software]]></category>
		<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">https://wp-admin-techblog.yojoe.local/?p=1837</guid>
		<description><![CDATA[Late last year, we were honored to be invited to talk at Reflections&#124;Projections, ACM@UIUC&#8217;s annual student-run computing conference. We decided to bring a talk about Horizon, our system for doing aggregate analysis and filtering across very large amounts of data. The video of the talk was posted a few weeks back on the conference website. [...]]]></description>
			<content:encoded><![CDATA[<div style='width: 250px; margin-left: 10px; margin-bottom: 10px; float: right;'><a href="http://www.acm.uiuc.edu/conference/2010/"><img src="http://blog.palantir.com/wp-content/uploads/2011/03/reflectionsprojections.png" alt="" title="reflectionsprojections" width="250" height="215"/></a></div>
<p>Late last year, we were honored to be invited to talk at Reflections|Projections, ACM@UIUC&#8217;s annual student-run computing conference.  We decided to bring a talk about Horizon, our system for doing aggregate analysis and filtering across very large amounts of data.  The video of the talk was posted a few weeks back on <a href="http://www.acm.uiuc.edu/Conferenceware/Schedule/Videos">the conference website</a>.</p>
<p>Horizon started as a research project / technology demonstrator built as part of Palantir&#8217;s Hack Week &#8211; a periodic innovation sprint that our engineering team uses to build brand new ideas from whole cloth.  It was then used by the Center For Public Integrity in their <a href="http://www.publicintegrity.org/investigations/economic_meltdown/">Who&#8217;s Behind The Subprime Meltdown</a> report.  We produced a short video on the subject, <a href="http://www.palantirtech.com/government/analysis-blog/horizon">Beyond the Cloud: Project Horizon</a>, released on our analysis blog.  Subsequently, it was folded into our product offering, under the name <a href="http://www.palantirtech.com/labs/object-explorer">Object Explorer</a>.</p>
<p>In this hour-long talk, two of the engineers who built this technology tell the story of how Horizon came to be, explain how it works, and show a live demo of analysis on hundreds of millions of records in interactive time.</p>
<p><iframe title="YouTube video player" width="640" height="510" src="http://www.youtube.com/embed/9dOpDeRMTMc" frameborder="0" allowfullscreen></iframe></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2011/04/15/inside-horizon-interactive-analysis-at-cloud-scale/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Help! Is there a doctor in the network???</title>
		<link>http://blog.palantirtech.com/2010/07/23/help-is-there-a-doctor-in-the-network/</link>
		<comments>http://blog.palantirtech.com/2010/07/23/help-is-there-a-doctor-in-the-network/#comments</comments>
		<pubDate>Fri, 23 Jul 2010 23:33:01 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[palantirtech]]></category>
		<category><![CDATA[problemspace-government]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1427</guid>
		<description><![CDATA[Cyber security is a hot topic, especially in national security circles. The world has witnessed a number of high-profile incidents in the past two years that have been notable for sharing three very important aspects: they were targeted attacks, carried out against specific institutions they were politically motivated, and, inconclusively, appear to be state-sponsored they [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; width: 250px; margin-left: 15px; margin-bottom: 15px;'>
<img src='http://upload.wikimedia.org/wikipedia/commons/thumb/c/c6/Botnet.svg/500px-Botnet.svg.png' width='250'/>
</div>
<p>Cyber security is a hot topic, especially in national security circles.  The world has witnessed a number of high-profile incidents in the past two years that have been notable for sharing three very important aspects: </p>
<ul>
<li>they were targeted attacks, carried out against specific institutions
</li>
<li>they were politically motivated, and, inconclusively, appear to be state-sponsored
</li>
<li>they used multi-step, multi-vector attacks and managed to evade existing security countermeasures
</li>
</ul>
<p>This deviates from the types of attacks that IT-centric approaches have sought to defend networks against.  Traditional approaches neutralize the perceived threats against a network with a host of countermeasures: firewalls, malware scanners, automated network vulnerability scanning, patch policies, and intrusion detection systems.  The network defenses can learn new tricks when the administrators update the signatures, or, for certain types of data, employ a <a href="http://en.wikipedia.org/wiki/Bayesian_inference">Bayesian inference</a> strategy (<a href="http://www.paulgraham.com/spam.html">as has been employed to fight spam</a>).  This approach does a good job of protecting against untargeted attacks as well as weak targeted attacks.  </p>
<p>Full network defense requires human analysts looking at anomalies at a level above the automated countermeasures.  Check out the rest of this post to take a look at how <a href="http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/">human-driven, computer-aided analysis is a game changer</a> in cyber security.</p>
<p><span id="more-1427"></span></p>
<h2>A classic doctrine: the immune system</h2>
<p>If you&#8217;ve worked in network security, you&#8217;re undoubtedly familiar with most  (if not all) of the countermeasure systems listed above.  The question we don&#8217;t often ask is: </p>
<blockquote><p>What is the defensive doctrine being employed by this security architecture?</p></blockquote>
<p>Classic network security can be summed up as this philosophy: </p>
<blockquote><p>Become unattractive as a target-of-opportunity to the legions of script kiddies and somewhat more sophisticated opportunists who search for network defenses they can easily breach.  </p></blockquote>
<p>The goal of the IT-based approach is to be a tougher nut to crack than the network next door. Attackers throw themselves against the defenses, find no exploitable vulnerabilities and move on to the next target-of-opportunity. </p>
<p>As the old joke goes: when a tiger attacks your safari group, you don’t have to run faster than the tiger, you just need to run faster than your friends. We might rewrite that today as: <em><a href="http://en.wikipedia.org/wiki/Leet">when the &#8216;l33t h4cker comes a&#8217;knocking in your network neighborhood, just make sure that you&#8217;re less of a n00b than the next guy and you&#8217;ll probably avoid getting pwned too hard</a>.</em></p>
<p>And so we&#8217;re faced with this reality: today&#8217;s state-of-the-art network defense is a patchwork system of automated countermeasures designed to stop dumb, undirected, automated attacks. This architecture is not unique to cyber security &mdash; it has a close analog in biology. </p>
<p>The human immune system produces antibodies that recognize and defend against specific attacks; it learns over time through successful defense of the organism and, more recently, vaccinations. <a href="http://www.nytimes.com/2010/07/13/science/13micro.html?_r=1&#038;pagewanted=all">Millions of bacteria and viruses are foiled every day by immune systems</a>. We can observe this same pattern in cyberspace: hijacked systems tirelessly scour the Internet&#8217;s address space, looking for hapless networks ripe for takeover. <a href='http://blogs.forbes.com/firewall/2010/06/04/just-how-big-is-the-cyber-threat-to-dod/'>The Pentagon is probed something like 250,000 times a day</a>.</p>
<p>It would be insanity to connect a network to the modern Internet without security countermeasures in place to defend against these sorts of attacks.  However, while they are necessary to the task of securing a network, they are certainly not sufficient.</p>
<h2>Targeted attacks: slipping past the immune system</h2>
<div style='text-align: center; float: right; width: 250px; margin-left: 15px; margin-bottom: 15px;'>
<a href='http://www.dpd.cdc.gov/DPDx/HTML/Hookworm.htm'><img src='http://www.dpd.cdc.gov/DPDx/images/ParasiteImages/G-L/Hookworm/Hookworm_LifeCycle.gif' width='250'/><br/><br />
<span style='font-size: 0.8em; text-align: center; font-style: italic'>The Lifecycle of Hookworm</span></a>
</div>
<p>The countermeasures discussed thus far are essential but not infallible and can be bypassed by things like never-before-seen viruses or carefully crafted penetration attempts.  In the biological domain a targeted attack might come in the form of <a href="http://www.ncbi.nlm.nih.gov/pubmed/20208540">HIV</a> (evolved to slip past the immune defenses), a toxin (non-biological, nothing the immune system can do), or a parasite.</p>
<h3>The original crafty adversary</h3>
<p>A parasite can survive and thrive inside its host while <a href="http://jbiol.com/content/8/7/62">evading or suppressing the normal immune response to invaders</a>. Parasites take up comfortable residence inside the body of their host, using it as a source of food and protection; finally, they use the host as a place to reproduce and spread to other individuals in the host species.  Parasites don&#8217;t generally kill or gravely harm their hosts (or at least they don&#8217;t do it quickly), as it&#8217;s in their own self-interest to have the host continue living.</p>
<h3>Targeted parasite networks: GhostNet and the Shadow network</h3>
<p>Cyber analog?  You betcha: <a href='http://www.google.com/corporate/execs.html#vint'>Vint Cerf</a> was quoted just last week, <a href='http://voices.washingtonpost.com/fasterforward/2010/07/vint_cerf_at_palantir_night_li.html'>&#8220;The hackers don&#8217;t want to destroy the network. They want to keep it running, so they can keep making money from it.&#8221;</a></p>
<p><a href="http://citizenlab.org/">The Citizen Lab</a>, a University of Toronto-based non-profit that does in-depth, hands-on, technical research in the cyber security domain, had this to say:</p>
<blockquote><p>Crime and espionage form a dark underworld of cyberspace. Whereas crime is usually the first to seek out new opportunities and methods, espionage usually follows in its wake, borrowing techniques and tradecraft.
</p></blockquote>
<p>That&#8217;s in the foreword from their recent report, &#8220;<a href='http://www.scribd.com/doc/29435784/SHADOWS-IN-THE-CLOUD-Investigating-Cyber-Espionage-2-0'>Shadows in the Cloud: Investigating Cyber Espionage 2.0</a>&#8221;.  The report details their experiences tracking down the size, scope, and tradecraft behind a massive cyber-espionage botnet, dubbed <a href="http://en.wikipedia.org/wiki/GhostNet">GhostNet</a>:</p>
<blockquote style='text-align: justify;'><p><a href='http://www.scribd.com/doc/13731776/Tracking-GhostNet-Investigating-a-Cyber-Espionage-Network'>Tracking GhostNet: Investigating a Cyber Espionage Network</a> <em>[their first report on this botnet]</em> was the product of a ten-month investigation and analysis focused on allegations of Chinese cyber espionage against the Tibetan community. The research entailed field-based investigations in India, Europe and North America working directly with affected Tibetan organizations, including the Private Office of the Dalai Lama, the Tibetan Government-in-Exile, and several Tibetan NGOs in Europe and North America. The fieldwork generated extensive data that allowed us to examine Tibetan information security practices, as well as capture evidence of malware that had penetrated Tibetan computer systems. We also engaged in extensive data analysis and technical investigation of web-based interfaces to command and control servers that were used by attackers to send instructions to, and receive data from compromised computers.</p>
<p>The report documented a wide ranging network of compromised computers, including at least 1,295 spread across 103 countries, 30 percent of which we identified and determined to be &#8220;high-value&#8221; targets, including ministries of foreign affairs, embassies, international organizations, news organizations, and a computer located at NATO headquarters.</p></blockquote>
<p>These attacks used carefully forged email attacks, known as <a href='http://www.fbi.gov/page2/april09/spearphishing_040109.html'>spearphishing</a>, to entice their targets to unknowingly infect themselves with remote control software. The infections allowed the attackers to exfiltrate data from compromised machines and use them as springboards to attack other systems using similar targeted attacks.  <a href="http://www.dpd.cdc.gov/dpdx/html/hookworm.htm">Sound familiar?</a></p>
<h2>A New Doctrine: The Doctor</h2>
<p>Without an immune system, we&#8217;d be dead within hours; our immune system is absolutely necessary but, again,  not sufficient to keep us healthy.  For those things that the immune system can&#8217;t take care of, we use doctors.  Doctors are adaptive adversaries to disease: they can run tests, they can talk to the patient, they can apply insights learned from other patients or diseases.  Most importantly, a doctor has a much more <a href='http://jokesareawesome.com/joke/932/what_s_the_difference_between_god_and_a...'>omniscient view of the patient</a> than the immune system.</p>
<h3>Network Security &#8211; a 10,000 ft. discipline</h3>
<p>Applying this approach to the network enables security responses that can actually counter targeted attacks. A security officer (our network&#8217;s &#8220;doctor&#8221;) starts an investigation with some sort of anomalous event: an unexpected IP address in a log, or an alert from an intrusion detection system.</p>
<p>Remember that a runny nose or flagged packet is not an illness or a network compromise; it&#8217;s a symptom.  Symptoms suggest causes, but are only clues. Taken in isolation, they seldom offer conclusive information on the health of the patient. In fact, finding the root cause of a symptom (a <a href="http://en.wikipedia.org/wiki/Diagnosis">diagnosis</a>) requires the synthesis of multiple sources of data into a complete, coherent picture of the network or patient.  This often includes things that you can&#8217;t see in the blood or packet stream, like where the patient or user has travelled, what environmental factors might be present in their home, existing allergies, open wireless networks, insecure web apps, drug use, etc.</p>
<h3>Node health vs. network health</h3>
<p>A node gets an infection on your network? <a href="http://www.thinkgeek.com/tshirts-apparel/unisex/frustrations/ad98/">Re-image it, the symptoms go away</a>. In the domain of human medicine, re-imaging of humans when they get sick has not yet gained FDA approval &ndash; something doctors have been uttering oaths about since way before the days of Hippocrates.</p>
<p>But it&#8217;s not the symptoms we&#8217;re after; it&#8217;s the root cause.  Couple that with how easy it is to treat the symptoms via re-imaging, and security officers are more akin to public health officials, more concerned about the overall health of the network than the health of a single node.  This broader concern manifests as an immediate list of questions raised by any security anomaly on the network:</p>
<ul>
<li>How did this happen?  Was it a machine (network exploit) or human vector (somebody clicked on something they shouldn&#8217;t have)?</li>
<li>What is the extent of this infection?  Is it limited to a single node?  <a href="http://www.youtube.com/watch?v=EVekNsgUqn4">Why does this small moon appear to have a tractor beam locked on to our ship?</a></li>
<li>Is this part of a larger attack?  What is the true target of this attack? <a href="http://www.youtube.com/watch?v=dddAi8FF3F4">Is this a trap?</a></li>
<li>Do the tracks lead out of or deeper into my network? Was this an inside job?  Did I find an intermediary node in a multi-node penetration?</li>
<li>Who is behind this attack and why do they want in? Can I match this modus operandi with any other known attacks on this or other networks?</li>
<li>How do I prevent this sort of attack in the future?  Do I need to deploy new countermeasures, re-architect parts of the network, and/or teach my people to be more careful?</li>
</ul>
<p>The answer to any of these questions does not appear in a single log file on your network, any more than any single antibody can tell you that the H1N1 flu you&#8217;re now infected with came from the grocery clerk who got it from her boyfriend who, in turn, acquired it on his recent trip to Mexico.</p>
<p>The trees don&#8217;t know how big the forest is.</p>
<h3>Cyber security doctors</h3>
<div style='text-align: center; float: right; width: 250px; margin-left: 15px; margin-bottom: 15px;'>
<a href='http://home.uchicago.edu/~bleakley/graphical_summaries/hookworm_paper_in_graphs.html'><img src='http://upload.wikimedia.org/wikipedia/commons/c/c6/Hookworm_Examination.jpg' width='250'/><br/><br />
<span style='font-size: 0.8em; text-align: center; font-style: italic'>A doctor examines a boy looking for hookworm.</span></a>
</div>
<p>The way to find the answers to these questions is to <em><strong>give a skilled, experienced analyst powerful tools to use against all the data about the attack on all of the systems on your network mashed up with relevant data about the messy meatspace that contains the computers, users, and attackers in question</strong></em>.  </p>
<p>You need firewall logs, intrusion detection system logs, malware detection logs, badge logs to determine who had physical access to the network, travel records of where you expect your employees to be logging into the VPN from, and a dozen other sources of data that are unique to this network.</p>
<p>The data alone is not enough: it must be accessible in a way that enables expedient analysis. In most shops, many of the aforementioned data sources exist, but accessing and cross-referencing them requires a high level of technical fluency in the storage systems themselves, <em>even for a user that has a strong grasp of the story that the data is telling</em>.  Some combination of SQL, shell, grep, awk, sed, perl, and <a href="http://en.wikipedia.org/wiki/Visual_inspection">Mk I Eyeball</a> is used to suss out answers from the data.  It&#8217;s a slow, fragile, error-prone game, and the bar is high to even begin playing.</p>
<p>Whenever computers are recording information about the activities of other computers, the data gets big and it gets big fast. For example, grep is a very powerful and flexible tool, but its linear search through data starts to falter as the data size exceeds about 10 GB on rotational media.</p>
<p>In order to address and solve these sorts of problems, the world needs a platform with the following properties:</p>
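<p>As a rough illustration of why a linear search stops scaling, the sketch below contrasts a grep-style scan with a simple inverted index. It is a minimal Python sketch, not a description of how any particular product works; the log format, the sample lines, and the function names are all hypothetical.</p>

```python
from collections import defaultdict

# Hypothetical firewall log lines: "timestamp src_ip dst_ip action"
LOG_LINES = [
    "2010-07-01T10:00:00 10.0.0.5 203.0.113.9 ALLOW",
    "2010-07-01T10:00:02 10.0.0.7 198.51.100.4 DENY",
    "2010-07-01T10:00:05 10.0.0.5 198.51.100.4 ALLOW",
    "2010-07-01T10:00:09 10.0.0.9 203.0.113.9 ALLOW",
]


def linear_scan(lines, ip):
    """grep-style search: every query touches every line."""
    return [n for n, line in enumerate(lines) if ip in line.split()]


def build_index(lines):
    """One pass over the data; maps each IP to the line numbers that mention it."""
    index = defaultdict(list)
    for n, line in enumerate(lines):
        _, src, dst, _ = line.split()
        index[src].append(n)
        if dst != src:
            index[dst].append(n)
    return index


def indexed_lookup(index, ip):
    """Dictionary lookup: cost depends on the number of hits, not on the log size."""
    return index.get(ip, [])


INDEX = build_index(LOG_LINES)
```

<p>Both approaches return the same line numbers for a given address; the difference is that the scan pays for the whole log on every question, while the index pays once up front and then answers each follow-up question quickly, which is what interactive analysis needs.</p>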
<ul>
<li>Has access to all known information about a given incident</li>
<li>Makes querying and exploring relationships conceptual and interactive</li>
<li>Scales to handle large data sizes</li>
</ul>
<p>It probably looks something like this:</p>
<div style='text-align: center;'>
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="640" height="480" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="id" value="banner" /><param name="quality" value="high" /><param name="bgcolor" value="#000000" /><param name="src" value="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/cyber/cyber1.flv" /><embed src="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/cyber/cyber1.flv" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="480" movieurl="http://media.palantirtech.com/government/videos/cyber/cyber1.flv"/></object> </p>
<p><a href="http://media.palantirtech.com/government/videos/cyber/cyber1.wmv">Download</a> the WMV (50 MB) | <a href="http://media.palantirtech.com/government/videos/cyber/cyber1.asx">Streaming Windows Media</a></p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2010/07/23/help-is-there-a-doctor-in-the-network/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A rigorous friction model for human-computer symbiosis</title>
		<link>http://blog.palantirtech.com/2010/06/02/a-rigorous-friction-model-for-human-computer-symbiosis/</link>
		<comments>http://blog.palantirtech.com/2010/06/02/a-rigorous-friction-model-for-human-computer-symbiosis/#comments</comments>
		<pubDate>Thu, 03 Jun 2010 03:18:52 +0000</pubDate>
		<dc:creator>Asher Sinensky</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[javatech]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[problemspace - finance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[softwarephilosophy]]></category>
		<category><![CDATA[user interface]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1344</guid>
		<description><![CDATA[This is a response to Ari&#8217;s awesome post on human-computer symbiosis. Ari and I were chatting about the equation he developed and I was wondering if there were some further refinements that are possible&#8230; let&#8217;s take a look: We are attempting to understand the total analytic capability for a given task a of a human-computer [...]]]></description>
			<content:encoded><![CDATA[<div style='text-align: center; float: right; margin-left: 15px; margin-right: 15px'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/graph.png" alt="" width="300"/>
</div>
<p>This is a response to <a href="http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/">Ari&#8217;s awesome post on human-computer symbiosis</a>. Ari and I were chatting about the equation he developed and I was wondering if there were some further refinements that are possible&#8230; let&#8217;s take a look:</p>
<p>We are attempting to understand the total analytic capability for a given task <strong><em>a</em></strong> of a human-computer team. Analytic capability in this case probably means:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq1.png" alt="eq1"/>(1)
</div>
<p>Where <strong><em>A</em></strong> is the answer to the analytic problem in question and <strong><em>t<sub>A</sub></em></strong> is the time needed to arrive at the answer based on the inputs available. In the case of chess, <strong><em>A</em></strong> could be the optimum next move given all previous information and <strong><em>t<sub>A</sub></em></strong> would be how long it takes to decide on this move.</p>
<p>Read on for a look at how this generalizes in human-computer symbiotic systems.<br />
<span id="more-1344"></span></p>
<p>In the case of the human-computer team, we know that <strong><em>a </em></strong>is going to be a function of both the human&#8217;s analytical capability <strong><em>h</em></strong> and the computer&#8217;s analytical capability <strong><em>c</em></strong> (where both <strong><em>h</em></strong> and <strong><em>c</em></strong> have units of answers/time). In the limit case we know that:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq2.png" alt="eq2"/>(2)
</div>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq3.png" alt="eq3"/>(3)
</div>
<p>Or in plain English, if there is no human present, the total analytic capability is simply the analytic capability of the computer. So the naïve solution would be that:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq4.png" alt="eq4"/>(4)
</div>
<p>(4) clearly meets the limiting cases described in (2) and (3). Kasparov noticed a mixing function where the ability of the human and computer to work together becomes the dominant term &mdash; we might call this the mixing capability for the given task or <strong><em>m</em></strong>. Including this phenomenon, the total analytic capability (4) would be re-defined as:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq5.png" alt="eq5"/>(5)
</div>
<p>where <strong><em>m</em></strong> has the property that:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq6.png" alt="eq6"/>(6)
</div>
<p>Thus maintaining the limits expressed in (2) and (3) and adhering to the observation that if there is no human or computer component then there will be no mixing advantage. A naïve solution to this constraint would be simple linear mixing:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq7.png" alt="eq7"/>  (7)
</div>
<p>where <strong><em>M</em></strong> (units of time per answer) is the mixing efficiency and will depend primarily on the type of task being solved: some analytical tasks lend themselves to a combined process more than others (for example, multiplying 20-digit numbers does not really benefit from the intuition of a human, so the ability of a human and computer to perform this task is merely their additive ability).</p>
<p>What Kasparov noticed is that the mixing was primarily based on the quality of the process rather than the analytical power of either the human or computer separately. This seems to imply that we must somehow account for the fact that the quality of the human-computer interface is responsible for the quality of the mixing. This can be modeled as a unitless friction of interaction <strong><em>f<sub>i</sub></em></strong> that impedes the ability of the human and computer to work together. </p>
<p>Equation (7) can thus be re-written as:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq8.png" alt="eq8"/>(8)
</div>
<p>In this case, the maximum value for the mixing capability is realized when the friction of interaction goes to zero. This mixing capability is the same as the equation Ari developed (less the coefficient which is necessary to maintain consistent units throughout).</p>
<p>We can now re-write our analytic capability in (5) as:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq9.png" alt="eq9"/>(9)
</div>
<p>Below, see a plot of this function over a range of values for <strong><em>h</em></strong>, <strong><em>c</em></strong> and <strong><em>f<sub>i</sub></em></strong>:</p>
<div style='text-align: center; margin: auto; margin-bottom: 1em;'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/graph.png" alt=""/>
</div>
<p>As can clearly be seen from this functional plot (note the vertical scale), the effect of interface friction dominates over the other terms whenever both the human and computer can make important contributions to the task at hand. The conclusion can be drawn that the most effective way to solve analytical problems is to minimize the friction of the human-computer interface; or to put it another way: optimal analytical systems are those that are built specifically to maximize the ability of the human to leverage the ability of the computer.</p>
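<p>The equations in this post are images, so the following is only a sketch under assumptions inferred from the surrounding prose: that the mixing term in (8) has the form <em>m = M &middot; h &middot; c / f<sub>i</sub></em>, and that (9) is <em>a = h + c + m</em>. The function and parameter names are hypothetical.</p>

```python
def mixing_capability(h, c, friction, mixing_efficiency=1.0):
    """Assumed form of (8): m = M * h * c / f_i.

    The term vanishes when either the human (h) or the computer (c)
    contributes nothing, matching the constraint described for (6).
    """
    if friction <= 0:
        raise ValueError("friction must be positive")
    return mixing_efficiency * h * c / friction


def analytic_capability(h, c, friction, mixing_efficiency=1.0):
    """Assumed form of (9): a = h + c + m."""
    return h + c + mixing_capability(h, c, friction, mixing_efficiency)
```

<p>Under these assumptions, with <em>h</em> = 2 and <em>c</em> = 3 (and <em>M</em> = 1), a friction of 1.0 gives a total of 11, while halving the friction to 0.5 gives 17: the additive part stays at 5 and all of the gain comes from the mixing term. Setting either <em>h</em> or <em>c</em> to zero recovers the limits in (2) and (3).</p>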
<p>I am certain there is still the possibility for further refinement, for example:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq10a.png" alt="eq10a"/>(10)
</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2010/06/02/a-rigorous-friction-model-for-human-computer-symbiosis/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Haiti: effective recovery through analysis</title>
		<link>http://blog.palantirtech.com/2010/04/05/haiti-effective-recovery-through-analysis/</link>
		<comments>http://blog.palantirtech.com/2010/04/05/haiti-effective-recovery-through-analysis/#comments</comments>
		<pubDate>Mon, 05 Apr 2010 21:58:56 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1336</guid>
		<description><![CDATA[Visualizing SMS hotspots in days following the earthquake in Palantir. Screenshot courtesy of Palantir Technologies [Editor's Note: an edited version of this post first appeared on O'Reilly's Radar blog.] The prologue was an earthquake of unexpected magnitude and location that left 250,000 dead. As computer scientists and technologists, we&#8217;re used to dealing with large numbers [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left:15px; margin-bottom: 15px; text-align: center; width: 300px'><a href="/wp-content/uploads/2010/03/haiti.png"><img src="/wp-content/uploads/2010/03/haiti-thumb.png" alt="" title="haiti" width="300" height="210" class="alignnone size-medium wp-image-1468" /></p>
<div style='text-align: center; font-size: 0.8em'>Visualizing SMS hotspots in days following the earthquake in Palantir.</div>
<div style='text-align: center; font-size: 0.6em'>Screenshot courtesy of Palantir Technologies</div>
<p></a></div>
<p><em>[Editor's Note: an edited version of this post <a href="http://radar.oreilly.com/2010/04/good-data-cuts-through-the-cha.html">first appeared on O'Reilly's Radar blog</a>.]</em></p>
<p>The prologue was an earthquake of unexpected magnitude and location that left 250,000 dead.</p>
<p>As computer scientists and technologists, we&#8217;re used to dealing with large numbers in the abstract. Expressed in human terms, the mind-boggling numbers of 250,000 dead, 300,000 injured and over 1 million people left homeless are hard to comprehend. </p>
<p>Hit the link to read more about how effective data management and analysis are crucial to recovery efforts and see specific examples of data about the situation in Haiti modeled in Palantir Government.<br />
<span id="more-1336"></span></p>
<h2>Chapter One: Rescue</h2>
<p>There was one glimmer of hope in this sea of tragedy: the world&#8217;s reaction. In the early hours and days after the quake, the focus was on pinpointing, triaging, and rescuing those in grave danger. Since those first harrowing hours, <a href="http://en.wikipedia.org/wiki/Humanitarian_response_to_the_2010_Haiti_earthquake">the world has made plain its willingness to help the people of Haiti</a>.  Supplies of money, food, medicine, fresh water, and volunteers have been pouring into Haiti, and fundraising efforts are ongoing around the world.</p>
<p>Technology also played an early, crucial role, with <a href="http://www.mission4636.org/">Mission 4636</a>, <a href="http://instedd.org/">InSTEDD</a> and <a href="http://haiti.ushahidi.com/reports/submit">Ushahidi</a> reacting lightning-fast to create a data collection system that enabled people in trouble to quickly communicate their urgent needs to rescuers and relief workers. If you haven&#8217;t already read it, Lukas Biewald&#8217;s piece, <a href="http://radar.oreilly.com/2010/03/how-crowdsourcing-helped-haiti.html">How crowdsourcing helped Haiti&#8217;s relief efforts</a>, is a great look at those first, and most urgent, efforts to collect data and synthesize information about the situation on the ground.</p>
<h2>Chapters Two Through Many: Recovery</h2>
<div style='float: right; margin-left:15px; margin-right: 15px'>
<a href="/wp-content/uploads/2010/03/p-hti0366.jpg"><img src="/wp-content/uploads/2010/03/p-hti0366-thumb.jpg" alt="" title="p-hti0366" width="300" height="200" class="alignnone size-medium wp-image-1475" /></a><br/></p>
<div style='text-align: center; font-size: 0.8em'>The extent of the devastation in Haiti.</div>
<div style='text-align: center; font-size: 0.6em'>Photo courtesy of Marko Kokic / ICRC / American Red Cross</div>
</div>
<p>Unfortunately, even partial recovery in Haiti will take years at the bare minimum. <a href="http://www.miamiherald.com/2010/01/17/1429872/vice-president-joe-biden-stresses.html">U.S. Vice President Joe Biden stated on 16 January</a> that President Obama &#8220;does not view this as a humanitarian mission with a life cycle of a month. This will still be on our radar screen long after it&#8217;s off the crawler at CNN. This is going to be a long slog.&#8221;</p>
<h3>Building the Deep, Big Picture</h3>
<p>The recovery from a disaster of this magnitude presents some important tasks in the sphere of information technology: coordination of effort, triaging those most in need, and getting good data into the hands of decision makers and aid workers.</p>
<p>Here&#8217;s a partial list of aid, relief, and rescue organizations currently in Haiti, gleaned from <a href="http://en.wikipedia.org/wiki/2010_Haiti_earthquake#Rescue_and_relief_efforts">Wikipedia</a>: </p>
<ul>
<li>An Argentine military field hospital</li>
<li>The Red Cross/Crescent, in various forms</li>
<li>The US military</li>
<li>Multiple UN agencies</li>
<li>Remnants of the Haitian government</li>
<li>The French navy</li>
<li>Sri Lankan relief workers</li>
<li>At least 2000 rescuers from 43 different groups (along with 161 search dogs)</li>
</ul>
<p>A wealth of collaborators like this presents some unique challenges around information fusion: unlike business competitors or opposing sides of a war, the different groups <em>want</em> to share as much information as possible to achieve their common goal.  A unified organization, like a single national military, will have pre-existing methods to model and share its <a href="http://en.wikipedia.org/wiki/Situational_awareness">situational awareness</a>. That is not the case in Haiti: this is a collection of groups coming together to form an ad-hoc relief force. Differences in everything from human languages and database schemas to collection methodology and problem domain make most of the datasets seemingly disjoint from the others.</p>
<p>However, each organization has produced a fairly detailed picture of the parts of Haiti that it is interacting with.  Each organization also wants to consume every other organization&#8217;s detailed knowledge of the situation.  To act effectively, they need to integrate that knowledge into a common operating picture that accurately models the situation on the ground yesterday, today, and tomorrow.</p>
<h3>Analyzing the Haiti situation using Palantir Government</h3>
<p>Our reaction to the earthquake was to try to help in the best way we knew how.  We set up a <a href="http://haiti.paas.palantirtech.com/">publicly available instance of our Palantir Government product</a>, already loaded with relevant data, for use by aid workers and organizations working in Haiti.  Using relevant, open-source data, we&#8217;ve started modeling a picture of what&#8217;s going on in Haiti.  </p>
<p>Our first cut was to include the locations and names of collapsed buildings, Internally Displaced People (IDP) camps, and Mission 4636 SMS messages, among others.  We also added in map layers that let us see what administrative zone any point on the map is located in.</p>
<p>Having mapped the data into this model, users have access to it through a suite of visualization, analysis, querying, and collaboration tools that allow them to get useful answers to practical questions.  Here are some examples:</p>
<ul>
<li>Which administrative sectors have had the most SMS requests for food in the past 24 hours?</li>
<li>Which collapsed buildings may have contained hazardous materials that will require special cleanup?</li>
<li>Are any IDP camps near enough to these hazmat sites to warrant special precautions or moving the residents?</li>
</ul>
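<p>To make the first of these questions concrete, here is a minimal sketch in Python of the kind of aggregation it implies. This is not how Palantir Government implements the query; the record layout, the sector names, and the <code>top_sectors</code> helper are all hypothetical, chosen only to illustrate counting categorized messages per sector inside a time window.</p>

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical records: (timestamp, administrative sector, request category).
# Real message data would carry far more detail than this.
messages = [
    (datetime(2010, 3, 1, 9, 0), "Sector A", "food"),
    (datetime(2010, 3, 1, 15, 30), "Sector B", "food"),
    (datetime(2010, 3, 1, 20, 0), "Sector A", "food"),
    (datetime(2010, 3, 1, 21, 0), "Sector A", "water"),
    (datetime(2010, 2, 27, 8, 0), "Sector B", "food"),  # outside the window
]


def top_sectors(records, category, now, window=timedelta(hours=24)):
    """Count requests of one category per sector within the time window."""
    cutoff = now - window
    counts = Counter(
        sector
        for timestamp, sector, kind in records
        if kind == category and timestamp >= cutoff
    )
    # most_common() returns (sector, count) pairs, highest count first.
    return counts.most_common()


now = datetime(2010, 3, 2, 8, 0)
ranking = top_sectors(messages, "food", now)
print(ranking)
```

<p>The point of the sketch is that the question becomes a short filter-and-count once the messages already carry a sector and a category; getting the data into that shape is where the integration effort goes.</p>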
<p>We&#8217;ve created a video showing all the pieces put together into a seamless whole, using live data in our publicly available Haiti instance:</p>
<div style='text-align: center;'>
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="640" height="480" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="id" value="banner" /><param name="quality" value="high" /><param name="bgcolor" value="#000000" /><param name="src" value="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/haiti/haiti2.flv" /><embed src="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/haiti/haiti2.flv" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="480" movieurl="http://media.palantirtech.com/government/videos/haiti/haiti2.flv"/></object>
</div>
<h2>The Next Chapter: Flooding</h2>
<div style='float: right; margin-left:15px; margin-bottom: 15px; width: 300px; text-align: center;'>
<a href="/wp-content/uploads/2010/03/p-hti0508.jpg"><img src="/wp-content/uploads/2010/03/p-hti0508-thumb.jpg" alt="" title="p-hti0508" width="300" height="199" class="alignnone size-medium wp-image-1472" /></a><br/></p>
<div style='text-align: center; font-size: 0.8em'>A view of the water point at the Citee Renault camp in Port-au-Prince, Haiti</div>
<div style='text-align: center; font-size: 0.6em'>Photo courtesy of Joe Lowry / IFRC / American Red Cross</div>
</div>
<p>From the <a href="http://www.redcross.org/portal/site/en/menuitem.1a019a978f421296e81ec89e43181aa0/?vgnextoid=0fe6e0b8da8b6210VgnVCM10000089f0870aRCRD">Red Cross website</a>: </p>
<blockquote><p>      “We’re racing against the clock with hurricane season just around the corner,” said Jean Pierre Taschereau, a Red Cross disaster expert just back from Haiti. “Getting semi-permanent structures in place as well as trenches for sanitation latrines will be critically important.”</p></blockquote>
<p>From <a href='http://www.abc.net.au/lateline/content/2010/s2844832.htm'>&#8220;Quake-ravaged Haiti faces flooding&#8221;</a>:</p>
<blockquote><p>
The UN wants to move 200,000 people out of overcrowded camps like this one. The Haitian government is trying to find land. It&#8217;s identified five sites outside of the Haitian capital, but those five sites are about 200 hectares and by the UN&#8217;s estimates 600 hectares will be needed to house the people it plans to move safely to have proper drainage when the rainy season finally arrives.
</p></blockquote>
<p>Haiti&#8217;s rainy season is notorious for causing flooding.  Now, with the infrastructure of the country destroyed, flood season will be more dangerous than usual.  Not only are the normal structures that protect people from the waters gone, but people have moved out of the ruins of Port-au-Prince to hastily constructed IDP camps, some of which are sitting in the flood plains of Haiti&#8217;s waterways.</p>
<p>The essential question facing relief workers: <em>Which of the approximately 2500 IDP camps are most at risk from flooding?</em></p>
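<p>As a rough illustration of how a question like this can be turned into a computation, the Python sketch below ranks camps by their distance to the nearest known waterway point. It is not the analysis shown in the video: the coordinates, camp names, and the <code>rank_by_waterway_distance</code> helper are hypothetical, and proximity to water is only a crude stand-in for real flood risk.</p>

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical camp locations and sampled waterway points (lat, lon).
camps = {
    "Camp 1": (18.55, -72.30),
    "Camp 2": (18.60, -72.35),
    "Camp 3": (18.50, -72.40),
}
waterway_points = [(18.551, -72.301), (18.58, -72.20)]


def rank_by_waterway_distance(camps, waterway_points):
    """Return (camp, km to nearest waterway point) pairs, closest first."""
    distances = {
        name: min(haversine_km(lat, lon, wlat, wlon)
                  for wlat, wlon in waterway_points)
        for name, (lat, lon) in camps.items()
    }
    return sorted(distances.items(), key=lambda item: item[1])


ranking = rank_by_waterway_distance(camps, waterway_points)
for name, km in ranking:
    print(f"{name}: {km:.1f} km from nearest waterway point")
```

<p>A real assessment would also need elevation, drainage, and rainfall data, which is why the source datasets described below matter so much.</p>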
<p>In a place like the United States, an earthquake response and recovery team could engage the services and expertise of the US Geological Survey,  which maintains the <a href="http://waterdata.usgs.gov/nwis">National Water Information System</a>, a warehouse of detailed information about all things water in this country. No such luck in Haiti, where the closest thing to the USGS is the <a href="http://www.cnigs.ht/">Centre National de l&#8217;Information Géo-Spatiale</a>.  A quick look at their website shows that they didn&#8217;t really make it through the earthquake. (In the video, we feature a picture of what&#8217;s left of their facility &mdash; it&#8217;s not pretty).</p>
<p>Since we were starting from square one, we put together data from the <a href="http://www.agc.army.mil/Haiti/index.html">Army Geospatial Center</a>, <a href="http://ochaonline.un.org/tabid/6412/language/en-US/Default.aspx">the UN</a>, <a href="http://www.noaanews.noaa.gov/stories2010/20100119_haiti.html">NOAA</a>, Haiti-based NGOs, a number of academic papers, and even <a href="http://www.flickr.com/photos/tags/earthquake/map?&#038;fLat=18.5873&#038;fLon=-72.3666&#038;zl=6&#038;order_by=recent">geo-tagged photos from Flickr</a>.  The time it took to integrate this data? About six hours.  Time it took to do the analysis?  About seven minutes.  Amount of that work that is reusable?  All of it.</p>
<p>Check out this video for a walk-through of the analysis:</p>
<div style='text-align: center;'>
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="640" height="480" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="id" value="banner" /><param name="quality" value="high" /><param name="bgcolor" value="#000000" /><param name="src" value="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/haiti/haiti_flooding.flv" /><embed src="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/haiti/haiti_flooding.flv" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="480" movieurl="http://media.palantirtech.com/government/videos/haiti/haiti_flooding.flv"/></object>
</div>
<p>The best way to improve this analysis will be to add more detailed information about flooding, gathered from the field.  We&#8217;re looking into getting new conduits of information into the Haiti instance to make this a reality as the rains really pick up.</p>
<h2>A Call To Action</h2>
<p>If you&#8217;d like to help us, we&#8217;re accepting new data sources, analyses, and contact with relief organizations.</p>
<p>Volunteers, supplies, and goodwill are only the raw ingredients to recovery; it&#8217;s the efficient and timely application of those resources to Haiti&#8217;s most pressing problems that will change lives and make recovery a reality instead of just a good intention.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2010/04/05/haiti-effective-recovery-through-analysis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Friction in Human-Computer Symbiosis: Kasparov on Chess</title>
		<link>http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/</link>
		<comments>http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/#comments</comments>
		<pubDate>Mon, 08 Mar 2010 19:32:06 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[problemspace - finance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[softwarephilosophy]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1302</guid>
		<description><![CDATA[As we build our platforms and applications following a human-computer symbiosis approach, we keep an ear to the ground for interesting examples that illuminate new techniques or validate our approach in some empirical way. One of the areas that we&#8217;re interested is in the overall friction of analysis systems. The systems that we build are [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 15px; margin-bottom: 15px;'>
<img src='/wp-content/uploads/2010/03/fools-mate.gif'/>
</div>
<p>As we build our <a href="http://blog.palantirtech.com/2009/11/06/palantir-like-an-operating-system-for-data-analysis/">platforms</a> and <a href="http://blog.palantirtech.com/2009/09/29/the-palantir-technologies-demo-reel-screenshots-round-3/">applications</a> following a <a href="http://en.wikipedia.org/wiki/Intelligence_amplification">human-computer symbiosis</a> approach, we keep an ear to the ground for interesting examples that illuminate new techniques or validate our approach in some empirical way.</p>
<p>One of the areas that we&#8217;re interested in is the overall friction of analysis systems.  The systems that we build are built on commodity hardware &mdash; we&#8217;re not building faster computers and yet we can deliver orders-of-magnitude better performance on analysis tasks than existing solutions.  How do we do this?  By building software in such a way that it reduces the friction experienced at the boundaries between the computing power, the analyst, and the source data.</p>
<h2>Chess as analysis laboratory</h2>
<p>Chess is, at its heart, a predictive venture.  The player attempts to anticipate their opponent&#8217;s moves, planning their own moves accordingly, with the straightforward goal of finding a sequence of piece moves that force checkmate. </p>
<p>This game is, in its ideal form, analysis. (The moves made are the logical extension of the analysis.)  The data are clean, the problem is well-defined and everyone plays by the same rules.  There are even <a href="http://en.wikipedia.org/wiki/Elo_rating_system">well-defined metrics for ranking chess players by skill</a> &mdash; a better chess player is a better chess-game analyst.  </p>
<p>For evaluating analysis systems, this is about as good as it gets in terms of designing controlled experiments to study their relative strengths.</p>
<p><a href="http://en.wikipedia.org/wiki/Garry_Kasparov">Garry Kasparov</a>, widely considered to be the greatest chess player of all time,  recently wrote <a href="http://www.nybooks.com/articles/23592">a review of Diego Rasskin Gutman&#8217;s book</a>, <a href="http://www.amazon.com/Chess-Metaphors-Artificial-Intelligence-Human/dp/026218267X"><u>Chess Metaphors: Artificial Intelligence and the Human Mind</u>.</a></p>
<p>The review is excellent and covers a lot of ground.  However, one particular anecdote stood out as a very interesting example of human-computer symbiosis (emphasis added):</p>
<blockquote><p>In 2005, the online chess-playing site Playchess.com hosted what it called a &#8220;freestyle&#8221; chess tournament in which anyone could compete in teams with other players or computers. Normally, &#8220;anti-cheating&#8221; algorithms are employed by online sites to prevent, or at least discourage, players from cheating with computer assistance. (I wonder if these detection algorithms, which employ diagnostic analysis of moves and calculate probabilities, are any less &#8220;intelligent&#8221; than the playing programs they detect.)</p>
<p>Lured by the substantial prize money, several groups of strong grandmasters working with several computers at the same time entered the competition. At first, the results seemed predictable. The teams of human plus machine dominated even the strongest computers. The chess machine Hydra, which is a chess-specific supercomputer like Deep Blue, was no match for a strong human player using a relatively weak laptop. Human strategic guidance combined with the tactical acuity of a computer was overwhelming.</p>
<p>The surprise came at the conclusion of the event. <em>The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time.</em> Their skill at manipulating and &#8220;coaching&#8221; their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. <em>Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.</em></p></blockquote>
<p>After the jump, we look at this finding in a more generalized way and map it onto the Palantir approach.<br />
<span id="more-1302"></span></p>
<h2>The cyborg Grandmaster: a fearsome opponent</h2>
<p>The tournament Kasparov recalls was a showcase of chess talent, human-computer symbiosis, and raw computing power.  Among those entered  in the tournament were a purpose-made chess machine (similar to <a href="http://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)">Deep Blue</a>) named <a href="http://en.wikipedia.org/wiki/Hydra_(chess)">Hydra</a> and a team of <a href="http://en.wikipedia.org/wiki/Grandmaster_(chess)">Grandmasters</a> assisted by computer programs.</p>
<p>One losing participant had this to say about the computer-aided Grandmasters:</p>
<blockquote><p>
Secondly, I have learned that a <a href="http://en.wikipedia.org/wiki/Grandmaster_(chess)">Grandmaster</a> armed with a chess engine is a killer combination against a plain Engine. Engines see everything via brute force, Grandmasters use their intuition and are able to see &#8220;obvious&#8221; moves at once. So the two of them together are a mighty force.
</p></blockquote>
<p>This is just as Licklider predicted 50 years ago &#8212; quoting <a href="http://blog.palantirtech.com/man-computer-symbiosis/">Man-Computer Symbiosis</a> (if I could put it better, I would):</p>
<blockquote><p>
Men will set the goals and supply the motivations, of course, at least in the early years. They will formulate hypotheses. They will ask questions&#8230; In general, they will make approximate and fallible, but leading, contributions, and they will define criteria and serve as evaluators, judging the contributions of the equipment and guiding the general line of thought.</p>
<p>&#8230;</p>
<p>In addition, the computer will serve as a statistical-inference, decision-theory, or game-theory machine to make elementary evaluations of suggested courses of action whenever there is enough basis to support a formal statistical analysis. Finally, it will do as much diagnosis, pattern-matching, and relevance-recognizing as it profitably can, but it will accept a clearly secondary status in those areas.
</p></blockquote>
<p>So in classic intelligence amplification fashion, having computer programs that can quickly evaluate a move&#8217;s likelihood of success can <em>amplify the power of the Grandmaster</em>.</p>
<p>While empirically true, it does raise the question: how <em>much</em> does it amplify the power of the Grandmaster?</p>
<p>One approximation might be to model this as a simple linear amplification.  Let&#8217;s imagine a function, <em>a(h,c)</em>, in which the analytic power (<em>a</em>) is the product of the power of the human (<em>h</em>) and the computing power of the chess engine being used (<em>c</em>).  This gives us the equation:</p>
<div style='text-align: center'>
<img src='/wp-content/uploads/2010/03/hcs-eq-simple.png'/>
</div>
<h2>One term to dominate them all: friction-of-interface</h2>
<p>Does this simple approximation hold up?  It does not. The team that won the <a href="http://www.chessbase.com/newsdetail.asp?newsid=2461">PAL/CSS Freestyle Tournament in 2005</a> was composed of two amateur chess players that were able to best a computer-assisted Grandmaster.</p>
<p>How did they accomplish this feat?  It was not through superior compute power.  Instead, they did so by more effectively feeding insights to their three chess engines. They played so well that a large number of people assumed that it was actually Kasparov himself playing:</p>
<blockquote><p>
Many speculated that it might be Garry Kasparov, who was the initiator of this kind of computer assisted chess matches. When we asked him Kasparov confirmed that was not the case. But he reminded us that it doesn&#8217;t really matter. The guiding principle of Freestyle Chess: anything is allowed. &#8220;Even if they were assisted by the devil, that would probably be covered by the rules,&#8221; he joked. &#8220;Only the moves they played count.&#8221;
</p></blockquote>
<p>What does this mean for our simple equation? Well, it looks like it&#8217;s missing a term, one we&#8217;ll call <em>f</em>, that describes the efficiency or <strong>friction</strong> of the interface between human and computer.</p>
<p>Quoting Kasparov again:</p>
<blockquote><p>
<em>Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.</em>
</p></blockquote>
<p>The implication being that the equation actually looks like this:</p>
<div style='text-align: center'>
<img src='/wp-content/uploads/2010/03/hcs-eq-variable-h.png'>
</div>
<p>So as the friction of the interface goes to zero, the full amplification of the chess engine is brought to bear.  A quick gut-check in the opposite direction agrees: one can imagine the world&#8217;s most powerful chess engine with the world&#8217;s worst interface; spending the time it would take to express commands to this theoretically awful program would actually be worse than playing without it.</p>
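<p>To see how a friction term can reproduce the tournament result, here is a toy model in Python. The functional form <code>h * c * (1 - f)</code>, with <em>f</em> between 0 and 1, is an assumption made for this sketch and is not necessarily the equation pictured above; the numeric values for the two teams are likewise invented for illustration.</p>

```python
def analytic_power(h, c, f=0.0):
    """Toy model: human skill h times engine power c, discounted by friction f.

    Assumption for this sketch: f lies in [0, 1], where 0 is a frictionless
    interface and 1 is an interface so poor that nothing gets through.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    return h * c * (1.0 - f)


# Invented numbers: both teams have the same engine power, the amateurs
# have half the chess skill but a much lower-friction process.
amateurs = analytic_power(h=1.0, c=3.0, f=0.1)
grandmaster = analytic_power(h=2.0, c=3.0, f=0.6)

print(f"amateurs:    {amateurs:.2f}")
print(f"grandmaster: {grandmaster:.2f}")
print("amateurs win" if amateurs > grandmaster else "grandmaster wins")
```

<p>With zero friction the model reduces to the simple product of <em>h</em> and <em>c</em>; as <em>f</em> approaches 1 the result drops toward zero, matching the gut-check that a powerful engine behind an awful interface is of little use.</p>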
<h2>Palantir: a low-friction interface to data</h2>
<p>As analysis problems go, chess resembles <a href="http://en.wikipedia.org/wiki/Spherical_cow">a spherical cow in a vacuum</a>.  Analysis problems in the real world are orders of magnitude messier.</p>
<p>Let&#8217;s reframe the terms of our equation above into a more general approach to analysis:</p>
<ul>
<li><em>H</em> &#8211; this is the power of the analyst.  In chess, the value of this term varies widely between players; in designing real-world data analysis systems, this is more or less a constant (which is why <em>h</em> above becomes <em>H</em> below).  Of course there are differing levels of expertise, training, and raw ability amongst the user population, but when we design systems, it&#8217;s with the average case in mind.</li>
<li><em>c</em> &#8211; computing power. How fast are the machines?  How well do they scale?  How efficiently do they perform the data tasks at hand? Palantir spends significant engineering effort on optimizing the <em>c</em> term, but most of the growth in this term comes from the layers we depend on, built by companies like Intel, Sun, Oracle, etc.</li>
<li><em>f</em> &#8211; friction.  How easy is it to bring <em>c</em> to bear on the problem? Note that when we talk about <em>friction of interface</em>, this is not exclusively referring to user interface.  More generally, friction can be present at any interface between two systems: data-software, software-software, human-software, etc. The <em>f</em> that we consider in this simple model is sum total system friction.</li>
</ul>
<p>So our final formulation is just in terms of <em>c</em> and <em>f</em> (holding <em>H</em> as a constant): </p>
<div style='text-align: center'>
<img src='/wp-content/uploads/2010/03/hcs-eq-final.png'>
</div>
<p>When we discuss friction in real-world analysis systems, the friction actually exists at multiple levels:</p>
<ol>
<li>Creating an analysis model that will enable answering the questions that need to be explored</li>
<li>Integrating the data into a single coherent view of the problem</li>
<li>Enabling analysis tools to efficiently query and load the data</li>
<li>Exposing APIs that allow developers to develop custom solutions quickly and efficiently for modeling and analysis tasks not covered by general tools</li>
<li>User interface that makes the tools easy, enjoyable, and quick to use</li>
</ol>
<h3>Minimizing <em>f</em>: Haiti Flooding Predictions</h3>
<p>If this is starting to sound very similar to Palantir&#8217;s marketing information, this is no accident. While some of our backend engineers are concerned with things like scaling and speed-of-querying, the overall innovation that we&#8217;re bringing to the field is not simply about faster data processing systems (even if they are) but reducing the friction at every interface inside a complex human-computer symbiotic system.</p>
<p>You want an example that ties it all together?  It starts with a simple question: which of the many displaced-person camps in Haiti are most at risk for flooding as the rainy season approaches?  Easy to ask, but not so simple to answer. </p>
<p>The original introduction to this video: </p>
<blockquote><p>As we enter the beginning of the rainy season in Haiti, one of the biggest problems facing relief organizations today is the spectre of flooding and mudslides destroying Internally Displaced Persons (IDP) Camps. In this video, we integrate data from many sources to determine high risk aid locations.
</p></blockquote>
<p>The data integration for this video took about six hours, using sources of data that had never before been fused.  The analysis itself takes a few minutes and quickly comes to an actionable answer to the original question.</p>
<div style='text-align: center;'>
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="640" height="480" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="id" value="banner" /><param name="quality" value="high" /><param name="bgcolor" value="#000000" /><param name="src" value="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/haiti/haiti_flooding.flv" /><embed src="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/haiti/haiti_flooding.flv" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="480" movieurl="http://media.palantirtech.com/government/videos/haiti/haiti_flooding.flv"/></object>
</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Palantir: like an operating system for data analysis</title>
		<link>http://blog.palantirtech.com/2009/11/06/palantir-like-an-operating-system-for-data-analysis/</link>
		<comments>http://blog.palantirtech.com/2009/11/06/palantir-like-an-operating-system-for-data-analysis/#comments</comments>
		<pubDate>Sat, 07 Nov 2009 03:21:44 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[problemspace - finance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[softwarephilosophy]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1198</guid>
		<description><![CDATA[If you&#8217;ve taken the time to peruse the Palantir Government analysis blog, you&#8217;ve seen numerous examples of Palantir Government as applied to interesting problems; they are recorded screen captures of our analysis desktop client. It&#8217;s a showcase of useful, meaningful, and compelling visual and semantic tools being used to do analysis on a wide range [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='http://en.wikipedia.org/wiki/VisiCalc'><img src='/wp-content/uploads/2009/11/visicalc.png' width='250'/></a>
</div>
<p>If you&#8217;ve taken the time to peruse the Palantir Government <a href='http://www.palantirtech.com/government/analysis-blog'>analysis blog</a>, you&#8217;ve seen numerous examples of Palantir Government as applied to interesting problems; they are recorded screen captures of our analysis desktop client.  It&#8217;s a showcase of useful, meaningful, and compelling visual and semantic tools being used to do analysis on a wide range of datasets.</p>
<p>What enabled this analysis? Aside from the <a href="http://blog.palantirtech.com/2009/09/29/the-palantir-technologies-demo-reel-screenshots-round-3/">obvious hard work of our UI and analysis tools teams</a>, it&#8217;s the flexibility and power of the Palantir data platform.  More than just a scalable datastore, the Palantir data platforms act as robust and clean abstractions on top of data.</p>
<p>One of the early architecture decisions that we made when building both <a href="http://www.palantirtech.com/government">Palantir Government</a> and <a href="http://www.palantirfinance.com/">Palantir Finance</a> was to separate the respective data platforms from the end-user applications used to actually perform analysis.  More than just following the client-server model, this separation made the data servers in both products into generic intelligence infrastructure for analytic problems, with our clients acting as analysis applications on top of those platforms.</p>
<p>And so, one way to look at our data platform is as an operating system for analytic applications.  In this post we&#8217;ll explore the history of operating systems, understand why they&#8217;re so important and see how the Palantir data servers deliver the same potential to revolutionize the writing of analysis software that operating systems did to the writing of general programs for computers.</p>
<p><span id="more-1198"></span></p>
<h2>The OS: abstraction that begat a paradigm</h2>
<p>In the early days of computing, when a programmer wanted to write a program, they had to understand the inner workings of the machine. Writing a program required understanding things like the bus interface of a specific model of hard drive when all that was needed by the program was the clean abstraction of a filesystem. The upshot of this is that much of the time and effort put into a given task was spent writing code to interface with the &#8220;physical&#8221; minutiae of the machine rather than implementing the solution to the problem that the programmer was trying to solve with their software.</p>
<p>This pattern was observed by <a href="http://en.wikipedia.org/wiki/J._C._R._Licklider">J.C.R. Licklider</a> and noted in his influential paper, <i><a href="http://blog.palantirtech.com/man-computer-symbiosis/">Man-Computer Symbiosis</a></i> (emphasis added):</p>
<blockquote><p>
<b>About 85 per cent of my “thinking” time was spent getting into a position to think, to make a decision, to learn something I needed to know. Much more time went into finding or obtaining information than into digesting it.</b> Hours went into the plotting of graphs, and other hours into instructing an assistant how to plot. When the graphs were finished, the relations were obvious at once, but the plotting had to be done in order to make them so.<br />
…<br />
<b>Throughout the period I examined, in short, my “thinking” time was devoted mainly to activities that were essentially clerical or mechanical</b>: searching, calculating, plotting, transforming, determining the logical or dynamic consequences of a set of assumptions or hypotheses, preparing the way for a decision or an insight. <b>Moreover, my choices of what to attempt and what not to attempt were determined to an embarrassingly great extent by considerations of clerical feasibility, not intellectual capability.</b>
</p></blockquote>
<p>This description of his time as a researcher was echoed in the work of the early programmers: they spent much of their programming time re-inventing the wheel and writing routines that did essentially clerical or mechanistic work related to the functioning of the hardware rather than the core functions of their programs.</p>
<p>The operating system changed all that: suddenly (and by that I mean: with years of hard work, research, and incremental change) that noisy, inconsistent pile of hardware was transformed into a set of clean abstractions. The programmer was finally freed to spend time and energy on the problem they were really trying to solve.</p>
<p>And so we come to the modern era: dealing with the messy details of hardware has been replaced by the clean and robust abstraction of the operating system.</p>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='http://en.wikipedia.org/wiki/Operating_system'><img src='/wp-content/uploads/2009/11/250px-operating_system_placementsvg.png' width='250'/></a>
</div>
<p>Three important properties of modern operating systems:</p>
<ul>
<li><b>Hard boundaries between OS functions and process functions</b> &#8211; in modern operating systems, this is usually accomplished with system calls.  The process places the inputs to the system call in a known location and then asks the OS to perform some operation, like writing to a file or making a network connection.  The OS may or may not perform the function, based on things like permissions, availability of resources, etc.
<p>The most important feature here is that the process never has direct access to the true resources of the machine &mdash; instead, all access to the machine&#8217;s resources is brokered by the OS.
</li>
<li><b>Extensions of the abstraction in every direction</b> &#8211; An OS like Linux is really, at its core, a kernel that does process scheduling and lifecycle, manages memory, and services system calls. Everything else is handled by some sort of driver.  A driver might also be called, more generically, a plugin or extension.  Drivers exist for everything from block devices (like hard drives), network cards, and filesystems to input devices and displays.</li>
<li><b>Designed as a general purpose framework</b> &#8211; the operating system <i>doesn&#8217;t actually do any computing</i>; rather, it&#8217;s a set of services to facilitate processes using the resources of the computer.  To that end, they&#8217;re not designed with a specific process in mind, but rather to serve a large class of programs, each designed and written to accomplish a different task using a similar set of resources.</li>
</ul>
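<p>To make the first property concrete, here is a minimal Python sketch of a process asking the OS to do file I/O on its behalf. The <code>os</code> functions used here are thin wrappers over the corresponding system calls; the <code>roundtrip</code> helper is a name invented for this illustration.</p>

```python
import os
import tempfile


def roundtrip(message: bytes) -> bytes:
    """Write bytes to a scratch file and read them back via OS-brokered calls."""
    # The process never touches the disk itself; it only holds a descriptor
    # that the OS handed back and may refuse to honor.
    fd, path = tempfile.mkstemp()
    try:
        try:
            os.write(fd, message)            # request: write
        finally:
            os.close(fd)
        fd = os.open(path, os.O_RDONLY)      # request: open, subject to permission checks
        try:
            return os.read(fd, len(message)) # request: read
        finally:
            os.close(fd)
    finally:
        os.remove(path)


print(roundtrip(b"hello"))
```

<p>Every step is a request that the OS can reject (for example with a permissions error), which is exactly the hard boundary described above.</p>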
<h2>Analysis: the modern computing task</h2>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='http://en.wikipedia.org/wiki/ENIAC'><img src='http://upload.wikimedia.org/wikipedia/commons/archive/4/4e/20050923152626!Eniac.jpg' width='250'/></a></div>
<p>The first computer, <a href="http://en.wikipedia.org/wiki/ENIAC">ENIAC</a>, was conceived to do calculation of ballistics tables for artillery pieces &mdash; it was a glorified calculator. Lacking anything even resembling an operating system, it would just run its program. Its compiler? A group of six women who would configure the machine by hand with the program logic.  The input for its first test run, a calculation related to the hydrogen bomb project, was approximately <i>one million punch cards</i>.</p>
<p>Times have changed: 40 or so years of the unrelenting march of Moore&#8217;s Law in computing power has given us something like an <b><a href="http://upload.wikimedia.org/wikipedia/commons/thumb/c/c5/PPTMooresLawai.jpg/596px-PPTMooresLawai.jpg">eight-order-of-magnitude increase</a></b> in the amount of computing power available per unit cost.  Coupled with similar, <a href="http://www.kk.org/thetechnium/archives/2009/07/was_moores_law.php">more recent gains in storage capacity and network bandwidth</a>, this has produced a world awash in data, <a href='http://blog.palantirtech.com/2008/03/18/why-hal-varian-thinks-palantir-is-a-great-idea/'>crying out for analysis.</a></p>
<p>So the situation today is that we now expect to bring these considerable computing resources to bear on larger, more complex problems in the world.  I&#8217;m talking about things like the <a href="http://www.palantirtech.com/government/analysis-blog/traceback">spread of food-borne illnesses</a>, understanding the connection between genes and protein expression, <a href="http://www.palantirtech.com/government/analysis-blog/sinjar">understanding terrorist networks</a>, <a href="http://www.palantirtech.com/government/analysis-blog/uncovering-a-bot-net-exploring-router-data-using-palantir">finding botnets in network traffic logs</a>, and <a href="http://www.palantirtech.com/government/analysis-blog/transparency">exploring influence networks in government</a>.</p>
<p>These problems, while spanning widely disparate areas of analysis, share some common traits:</p>
<h3>The data is spread out</h3>
<p>They are described by multiple data sources. Just to make things more interesting: the data sources don&#8217;t agree on their native representations of the real-world data. And finally, the real-world objects that the data are describing are actually described in multiple data sources, with no single source giving a complete and accurate representation.</p>
<h3>The data schema are not human-conceptual</h3>
<p>Rather than representing the data in some schema that maps easily into how the experts on a given problem think about said problem, the data stores in question tend to model data in whatever way was convenient for the creators of that particular data store. Put another way: people don&#8217;t think in tables, rows, columns, and XML snippets.  These first-class data storage elements don&#8217;t usually map to real-world objects.</p>
<h3>The data is sensitive</h3>
<p>Whether it&#8217;s patient information, <a href="http://www.palantirtech.com/government/analysis-blog/horizon">mortgage data</a>, a law enforcement investigation, or sensitive foreign intelligence, there is often the need for <a href="http://www.palantirtech.com/government/analysis-blog/mls">foolproof access controls on the data</a>.</p>
<h2>Palantir: an operating system-class abstraction for analysis</h2>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'><img src='http://blog.palantirtech.com/wp-content/uploads/2009/01/shot0016.png' width='250'/></div>
<p>A Palantir data server provides a class of services similar to those of an operating system, but focused on the specific needs of analytic tasks.  Here I&#8217;ll focus on the model used by Palantir Government; Palantir Finance takes a related but significantly different approach to delivering these services.</p>
<p>As you might imagine, however, they both start at a somewhat higher level than punch cards.</p>
<h3>It starts with an ontology</h3>
<p>The Palantir approach to analysis begins with a task-specific ontology: essentially, a human-conceptual description of the real-world problem that&#8217;s being analyzed.</p>
<p>It&#8217;s roughly composed of three pieces:</p>
<ul>
<li>A hierarchical type system of the real-world objects that human experts use to think about this problem. We call these <i>PTObjects</i>, short for &#8220;Palantir Objects&#8221;.</li>
<li>A type system of properties that will contain the data describing these PTObjects.  PTObjects are essentially typed containers for properties. This is where most of the detail of the ontology lies.</li>
<li>A type system of possible relationships between different types of PTObjects.</li>
</ul>
<p>Within the ontology, there are numerous extension points that allow the customization of how data is imported, retrieved, and displayed (following the principle of <i>extending the abstraction in all directions</i>).</p>
<p>The data server takes the ontology as input and is agnostic to its content. This is where the principle of <i>building a general purpose framework</i> comes into play.</p>
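<p>The three pieces of an ontology can be sketched as plain data. This is only an illustration under assumptions: the class names, fields, and the <code>is_a</code> helper below are hypothetical and are not the actual Palantir schema or API.</p>

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ObjectType:
    name: str
    parent: Optional[str] = None   # hierarchical: a type may extend a parent type


@dataclass(frozen=True)
class PropertyType:
    name: str
    value_kind: str                # e.g. "string" or "date" (illustrative)


@dataclass(frozen=True)
class RelationshipType:
    name: str
    source: str                    # object type on one end of the link
    target: str                    # object type on the other end


@dataclass
class Ontology:
    object_types: dict = field(default_factory=dict)
    property_types: dict = field(default_factory=dict)
    relationship_types: dict = field(default_factory=dict)

    def add_object_type(self, name, parent=None):
        if parent is not None and parent not in self.object_types:
            raise ValueError("unknown parent type: " + parent)
        self.object_types[name] = ObjectType(name, parent)

    def is_a(self, name, ancestor):
        """Walk up the type hierarchy looking for `ancestor`."""
        current = name
        while current is not None:
            if current == ancestor:
                return True
            current = self.object_types[current].parent
        return False


# A tiny task-specific ontology; a generic server would only ever see it as input.
ontology = Ontology()
ontology.add_object_type("Entity")
ontology.add_object_type("Person", parent="Entity")
ontology.add_object_type("Organization", parent="Entity")
ontology.property_types["name"] = PropertyType("name", "string")
ontology.relationship_types["employs"] = RelationshipType("employs", "Organization", "Person")
```

<p>Because the ontology is just data handed to the server, swapping in a different problem domain means supplying different types rather than changing server code.</p>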
<h3>The data sources are mapped into the ontology</h3>
<p>This part of the Palantir data server is a pattern that is very similar to an operating system&#8217;s notion of block device drivers. The difference? Instead of low-level storage systems like hard drives, we&#8217;re dealing with complex databases describing the problem at hand.</p>
<p>In an operating system, every block device can read and write blocks of data.  In the Palantir data server, everything becomes a source of PTObjects.</p>
<p>Our data importer plugins, by analogy, fulfill the same role as a block device driver: we build glue code to map the data source&#8217;s schema into the ontology, along with connectors that surface the data itself wrapped up in PTObjects.</p>
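<p>As a rough sketch of the driver analogy, an importer can be thought of as a small adapter that turns one source&#8217;s native schema into objects described by the ontology. Everything here is assumed for illustration: the <code>CsvPersonImporter</code> class, the column names, and this simplified <code>PTObject</code> are not the real plugin interface.</p>

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class PTObject:
    object_type: str
    properties: dict = field(default_factory=dict)


class CsvPersonImporter:
    """Hypothetical importer: maps one CSV source's columns onto ontology properties."""

    # Glue code: the source's native column names on the left,
    # ontology property names on the right.
    COLUMN_TO_PROPERTY = {"full_nm": "name", "dob": "date_of_birth"}

    def __init__(self, text):
        self.text = text

    def objects(self):
        """Surface every row of the source as a PTObject."""
        for row in csv.DictReader(io.StringIO(self.text)):
            props = {
                prop: row[col]
                for col, prop in self.COLUMN_TO_PROPERTY.items()
                if row.get(col)
            }
            yield PTObject("Person", props)


source = "full_nm,dob\nAda Lovelace,1815-12-10\n"
people = list(CsvPersonImporter(source).objects())
```

<p>Code above the importer only ever sees PTObjects, much as code above a block device driver only ever sees blocks.</p>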
<h3>The data are composed into real-world objects.</h3>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='/wp-content/uploads/2009/11/pg-object-model.jpg'><img src='/wp-content/uploads/2009/11/pg-object-model.jpg' width='250'/></a>
</div>
<p>Part of this mapping is composing real-world objects into composite PTObjects by resolving PTObjects together.</p>
<p>The operation of resolving is pretty straightforward: we basically union the properties of the two PTObjects into a new PTObject. The end result is a single PTObject that completely represents all the data about something in the real-world from all the available data sources.</p>
<p>As we do this composition, we keep track of where each property came from, down to the record level, in each of its original sources.  (Note that most composed PTObjects will have at least one property that comes from two sources.)  Preserving the original identity of every atom of data allows us to later decompose these PTObjects into their constituent parts or, more importantly, censor a client&#8217;s view based on what permissions they have for each of the original data sources.</p>
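<p>Resolution with provenance can be sketched in a few lines. This is a simplified model under assumptions, not the production data structure: the <code>PropertyValue</code> fields and the <code>resolve</code>, <code>censor</code>, and <code>decompose</code> helpers are names invented for this example.</p>

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PropertyValue:
    name: str
    value: str
    source: str      # which data source this value came from
    record_id: str   # which record inside that source


@dataclass
class PTObject:
    properties: frozenset = field(default_factory=frozenset)


def resolve(a, b):
    """Union the properties of two PTObjects; provenance rides along on each value."""
    return PTObject(a.properties | b.properties)


def censor(obj, allowed_sources):
    """Project an object down to the properties a client is permitted to see."""
    return PTObject(frozenset(p for p in obj.properties if p.source in allowed_sources))


def decompose(obj):
    """Split a resolved object back into one object per original source."""
    by_source = {}
    for p in obj.properties:
        by_source.setdefault(p.source, set()).add(p)
    return {source: PTObject(frozenset(props)) for source, props in by_source.items()}


dmv = PTObject(frozenset({PropertyValue("name", "J. Smith", "dmv", "rec-17")}))
police = PTObject(frozenset({
    PropertyValue("name", "John Smith", "police", "case-4"),
    PropertyValue("alias", "Smitty", "police", "case-4"),
}))
person = resolve(dmv, police)
public_view = censor(person, {"dmv"})
```

<p>Because every value keeps its source and record, the same name reported by two sources stays as two distinct entries, which is what makes both decomposition and per-source censorship possible.</p>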
<p>This is a fundamental operation in our system that doesn&#8217;t have an exact analog in operating systems &#8212; it&#8217;s sort of similar to taking multiple filesystems and mounting them inside a virtual filesystem tree, like Unix does.  However, if each data source is like a filesystem, what we&#8217;re doing is essentially composing individual files from their fragments stored on multiple block devices.</p>
<p>Another analogy: at a level below the block device in the OS, this is also sort of similar to what a <a href="http://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0">RAID0</a> device does, the difference being that our composition is based on the contents of the data itself rather than some previously applied, content-agnostic, decomposition function.  The other difference being motivation: a RAID0 does it for performance, while Palantir is composing data to make it correspond to the real-world objects it represents.</p>
<h3>The server exposes Palantir &#8220;system calls&#8221;</h3>
<p>The interface that the Palantir data server exposes can be boiled down to two essential operations:</p>
<ul>
<li>The client can download copies of PTObjects from the server.  It may request them by id or perform some sort of search/query to specify a set of PTObjects.  This is roughly analogous to the <b><a href="http://en.wikipedia.org/wiki/Open_%28system_call%29">open()</a></b> and <b><a href="http://comsci.liu.edu/~murali/unix/read.htm">read()</a></b> system calls on Unix.
<p>Note that each client only sees the subset of properties for a given PTObject that it is authorized to see.  This censorship of full PTObjects into projected slices is something done by the server on every load of PTObjects.</li>
<li>The client can send new or updated PTObjects to the data server for storage. This is roughly analogous to the <b><a href="http://www.freebsd.org/cgi/man.cgi?query=write&#038;sektion=2&#038;manpath=FreeBSD+7.2-RELEASE">write()</a></b> system call in Unix. It, of course, entails a check as to whether the given client has permission to write to the given PTObject.</li>
</ul>
<p>The server&#8217;s responsibility is the same as the operating system&#8217;s: only let the client do what it has been granted permission to do.  In an operating system, the OS uses hardware features like <a href="http://en.wikipedia.org/wiki/Protected_mode">protected mode</a> to keep lower-privileged processes from accessing machine resources. Palantir uses network calls to achieve the same separation, by placing the client and server on different logical machines.  The effect is the same: the client basically requests (rather than commands) that certain operations be performed by the server.  The server uses its own rules to decide if the access or change is allowed and responds accordingly. And so the principle of <i>hard boundaries</i> is implemented.</p>
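<p>The two &#8220;system calls&#8221; and the server-side permission check can be sketched as follows. This is a toy in-process model under assumptions: a real deployment would sit behind a network boundary, and the <code>DataServer</code>, <code>grant</code>, <code>load</code>, and <code>store</code> names are hypothetical rather than the actual Palantir interface.</p>

```python
class PermissionDenied(Exception):
    """Raised when a client asks for a write it has not been granted."""


class DataServer:
    def __init__(self):
        self._objects = {}   # object id -> {property name: (value, source)}
        self._grants = {}    # client -> {"read": set of sources, "write": set of sources}

    def grant(self, client, read=(), write=()):
        self._grants[client] = {"read": set(read), "write": set(write)}

    def load(self, client, object_id):
        """Analogous to open()/read(): return only the slice this client may see."""
        readable = self._grants.get(client, {}).get("read", set())
        props = self._objects.get(object_id, {})
        return {
            name: value
            for name, (value, source) in props.items()
            if source in readable
        }

    def store(self, client, object_id, name, value, source):
        """Analogous to write(): the server decides whether to honor the request."""
        writable = self._grants.get(client, {}).get("write", set())
        if source not in writable:
            raise PermissionDenied(client + " may not write to " + source)
        self._objects.setdefault(object_id, {})[name] = (value, source)


server = DataServer()
server.grant("alice", read={"dmv", "police"}, write={"dmv", "police"})
server.grant("bob", read={"dmv"})
server.store("alice", "person-1", "name", "J. Smith", "dmv")
server.store("alice", "person-1", "case", "case-4", "police")
```

<p>With this setup, <code>server.load("bob", "person-1")</code> returns only the property sourced from <code>dmv</code>, and a <code>store</code> call from <code>bob</code> raises <code>PermissionDenied</code>: the client requests, and the server decides.</p>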
<h3>The clients do the analysis</h3>
<p>When an operating system yields to a process, that&#8217;s the time when the true processing begins.  By the same token, in Palantir, it&#8217;s not until a client connects and starts searching, visualizing, and manipulating PTObjects that analysis actually starts taking place (even if the server is doing a lot of the heavy lifting).</p>
<h2>The wide open future</h2>
<p>So why is this exciting?  I&#8217;m glad you asked!</p>
<h3>It&#8217;s about taking analysis to the next level.</h3>
<p>Let&#8217;s say you&#8217;re someone who wants to write an analytic task. Let me ask you a series of rhetorical questions:</p>
<ul>
<li>Do you want to start with three disparate sources of data or with the data already mapped into a Palantir data server?</li>
<li>Which one is a better use of your time as a programmer?</li>
<li>Which one allows you to not repeat mistakes that other programmers have already made and fixed?</li>
<li>Which one is more like writing a program than an operating system?</li>
</ul>
<p>Operating systems took us to a new level of expressiveness when it came to writing computing processes to run on computing hardware. They inverted that 85/15 ratio that Licklider talked about so that programmers spent more time writing the code that did the thing they were trying to create and less time mucking around with hardware.</p>
<p>More programmer time == better analytic tasks.</p>
<h3>It&#8217;s about making machine learning easier.</h3>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='http://en.wikipedia.org/wiki/Skynet_%28Terminator%29'><img src='http://images1.wikia.nocookie.net/terminator/images/8/8a/Cyberdyne_logo.jpg' width='250'/></a>
</div>
<p>Now consider machine learning as a field.  Pretty much every machine learning task could benefit from starting with its data in something that looks like a Palantir data server.  I&#8217;ve taken an informal survey of machine learning researchers and they agree: the 85/15 ratio still holds for machine learning.</p>
<p>Simply put: <b>most of the time and effort in machine learning is spent getting the data into a form that you can actually apply an algorithm to!</b> Now imagine if the starting point for that was a Palantir data server &mdash; now the machine learning implementer has a world of expressiveness open to them and time and energy are spent on the task at hand instead of the overhead of messing with the data.</p>
<p>Now, we don&#8217;t think that we&#8217;re building Skynet.  Quite the contrary: we believe that platforms like the one we&#8217;ve built will allow machine learning techniques to be put in the hands of experts to augment their ability to look at the world and come to conclusions about complex real-world problems by asking questions of the data we&#8217;ve collected. It&#8217;s about <a href="http://en.wikipedia.org/wiki/Intelligence_amplification">Intelligence Augmentation</a>, which can use machine learning techniques and algorithms to build better tools, not creating <a href="http://en.wikipedia.org/wiki/Strong_AI">Strong AI</a>.</p>
<h3>It&#8217;s about creating new markets</h3>
<p>Let&#8217;s go back to the well of operating systems and look back at the history of MS-DOS: the first &#8220;killer&#8221; application on MS-DOS was <a href="http://en.wikipedia.org/wiki/VisiCalc">VisiCalc</a> (that screenshot at the top of this post), a text-based spreadsheet.  As you know, VisiCalc was not the end of the story but just the introduction. MS-DOS, which evolved into Windows, allowed application writers an (arguably) clean abstraction on top of commodity hardware in order to build the applications that users actually wanted. Today, we have things like web browsers, multimedia authoring software, virtual machines, and IDEs built on top of what is, essentially, the same set of abstractions that VisiCalc was built on.</p>
<p>However, the most important thing to note is that VisiCalc is credited with creating the market for commercial operating systems &#8212; businesses needed VisiCalc so they paid Microsoft for MS-DOS (and IBM for a PC).  Without VisiCalc, there was no market for MS-DOS (most people, unsurprisingly, didn&#8217;t want to buy a <a href="http://en.wikipedia.org/wiki/Microsoft_BASIC">BASIC interpreter</a>).</p>
<p>We&#8217;re in the business of selling software and we agree with our customers: the Palantir approach has tremendous value.  We&#8217;ve just started tapping the potential of this market.  Think about what Oracle looked like in 1979, think what Microsoft looked like in 1980 &mdash; that&#8217;s Palantir in 2009.</p>
<h3>It&#8217;s about the start of the analysis age</h3>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='http://en.wikipedia.org/wiki/Information_Age'><img src='http://upload.wikimedia.org/wikipedia/commons/thumb/d/d2/Internet_map_1024.jpg/600px-Internet_map_1024.jpg' width='250'/></a>
</div>
<p>It can be argued that the operating system is the innovation that ushered in the &#8220;<a href="http://en.wikipedia.org/wiki/Information_Age">information age</a>&#8221;.  Without the operating system, there is no software explosion, which allows computing technology to actually be used on data in the world.</p>
<p>We think that we&#8217;re on the cusp of the analysis age, as imagined by <a href="http://en.wikipedia.org/wiki/Vernor_Vinge">Vernor Vinge</a> in <u><a href="http://books.google.com/books?id=SrLwPdBJodMC&#038;dq=rainbow%27s+end&#038;printsec=frontcover&#038;source=bn&#038;hl=en&#038;ei=TdX0Sui9HsTh8AbGlc3zCQ&#038;sa=X&#038;oi=book_result&#038;ct=result&#038;resnum=5&#038;ved=0CBsQ6AEwBA#v=onepage&#038;q=&#038;f=false">Rainbows End</a></u>.  It was something foreseen by Licklider in 1960, albeit with a timeline that was off by at least a few decades:</p>
<blockquote><p>
“…it seems worthwhile to avoid argument with (other) enthusiasts for artificial intelligence by conceding dominance in the distant future of cerebration to machines alone. There will nevertheless be a fairly long interim during which the main intellectual advances will be made by men and computers working together in intimate association. A multidisciplinary study group, examining future research and development problems of the Air Force, estimated that it would be 1980 before developments in artificial intelligence make it possible for machines alone to do much thinking or problem solving of military significance. That would leave, say, five years to develop man-computer symbiosis and 15 years to use it. The 15 may be 10 or 500, but those years should be intellectually the most creative and exciting in the history of mankind.”
</p></blockquote>
<p>It&#8217;s a golden age of analysis and we&#8217;re just getting started: we&#8217;ve got a lot of work to do, so if this sort of thing excites you, please <a href='http://www.palantirtech.com/careers/culture'>come and join us.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/11/06/palantir-like-an-operating-system-for-data-analysis/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Palantir: search with a twist (part two: realtime indexing and security)</title>
		<link>http://blog.palantirtech.com/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/</link>
		<comments>http://blog.palantirtech.com/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 07:01:01 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1260</guid>
		<description><![CDATA[[A number of weeks ago, we published a post on the search technology used by Palantir. That post covered raising the memory efficiency of a couple of operations. This is part two of that series.] The most familiar use of search engines is to index documents made available on the Internet via the hypertext transfer [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 15px; margin-bottom: 15px'><img src='/wp-content/uploads/2009/08/200px-magnifying_glass_icon.png' alt='magnifying glass'/></div>
<p><em>[A number of weeks ago, we published <a href="http://blog.palantirtech.com/2009/08/13/palantir-search-with-a-twist-part-one-memory-efficiency/">a post on the search technology</a> used by Palantir.  That post covered raising the memory efficiency of a couple of operations.  This is part two of that series.]</em></p>
<p>The most familiar use of search engines is to index documents made available on the Internet via the <a href="http://www.ietf.org/rfc/rfc2616.txt">hypertext transfer protocol</a>. Forgotten names like <a href="http://en.wikipedia.org/wiki/AltaVista">AltaVista</a>, names not-yet-really-learned like <a href="http://web.archive.org/web/20040828134017/http://www.bing.com/">Bing</a>, and, of course, <a href="http://infolab.stanford.edu/~backrub/google.html">Google</a> come to mind.</p>
<p>This one, massive use case has a couple of properties that I&#8217;d like to highlight:</p>
<ul>
<li>Asynchronous indexing and querying &#8211; web search engines tend to use crawlers and indexers to build up an index of the web.  After each crawl is finished, the new index is brought online for use by the query engine.</li>
<li>Lack of access controls &#8211; all the data in the index is available to any query.  In fact, most queries are (from the standpoint of the index) completely anonymous.</li>
</ul>
<h3>Palantir: not a web search engine</h3>
<p>Search technology is just one part of what makes up a Palantir system.  For us, it&#8217;s a way to quickly retrieve Palantir objects in a Palantir system; it&#8217;s not the whole of the application.</p>
<p>I&#8217;d like to highlight a couple of differences from the <a href="http://en.wikipedia.org/wiki/Web_search_engine">web search engine</a> case.  A Palantir system needs the following properties:</p>
<ul>
<li>Realtime indexing and querying &#8211; we need information to be available immediately as it changes in the system.</li>
<li>Leak-proof access controls &#8211; we need the search engine to help us make sure that we don&#8217;t have information leaking across access control boundaries.</li>
</ul>
<p><span id="more-1260"></span></p>
<h2>Realtime indexing</h2>
<p>The Palantir platforms implement realtime indexing: as soon as an analyst changes an object in the system, it needs to be available to query. This could be a change to data in the object or a change to the security tags on the object.</p>
<p>From a programming perspective, this is pretty straightforward: a Palantir transaction will not commit until the search engine is finished indexing the new data.</p>
<p>From a search engine operational perspective, this induces some challenges.  Asynchronous indexing allows the search engines to bring online a highly optimized static form of the index.  Contrast that with realtime indexing, where every cycle spent optimizing the index is removing cycles from serving other queries and there is likely a human waiting for the optimizing process to finish.</p>
<p>When using the static index, a query only accesses one, optimized index file which then points to the documents containing the results.  However, as changes and additions are indexed into the system, there is a lot of overhead to merging them into the master index.</p>
<p>Instead of merging and optimizing on every change, Lucene can keep around a number of smaller indexes that hold all the fresh entries.  These are fixed-size append-only segments that are much cheaper to write to than the optimized and merged form of the index. So basically, these &#8216;dynamic&#8217; indexes are linear lists of single-document indexes.  When the search engine goes to run a query it has to follow this simple (yet expensive) algorithm:</p>
<ul>
<li>Query the static, merged index, accumulating results. <i>(this part is reasonably fast)</i></li>
<li>For each of the dynamic indexes:
<ul>
<li>Open the file, incurring IO overhead.</li>
<li>Query each single-document index and look for additional records or newer records that supersede one of the existing found results.</li>
</ul>
</li>
</ul>
<p>You can see how the overhead of this can quickly get pretty large as the number of dynamic indexes grows: it grows linearly with the number of newly indexed records.  Compare that with the optimized index, which should be close to constant time for any given query.</p>
<p>To get around this, the indexer will only allow a certain number of these dynamic indexes to accumulate before it kicks off a background merge job.  During the merge job, we take a noticeable performance hit, but by batching up the merge run we amortize the overhead away for an overall performance win.  This hybrid mode didn&#8217;t require us to write any new code, but just to tune Lucene to give us the performance profile we wanted.</p>
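<p>The query algorithm and the batched merge can be sketched with a toy in-memory index. To be clear about the assumptions: this is not Lucene&#8217;s API or its on-disk format, the <code>SegmentedIndex</code> class and its <code>max_segments</code> parameter are invented for illustration, and the merge here runs synchronously rather than as a background job.</p>

```python
class SegmentedIndex:
    """Toy model: one merged index plus a list of small, fresh single-document segments."""

    def __init__(self, max_segments=4):
        self.max_segments = max_segments
        self.merged = {}     # term -> set of doc ids (stands in for the optimized index)
        self.segments = []   # (doc_id, set of terms), oldest first

    def add(self, doc_id, text):
        # Appending a small segment is cheap compared with updating the merged index.
        self.segments.append((doc_id, set(text.lower().split())))
        if len(self.segments) > self.max_segments:
            self.merge()     # batch the expensive work instead of paying it per change

    def merge(self):
        for doc_id, terms in self.segments:
            for postings in self.merged.values():
                postings.discard(doc_id)        # drop the superseded version
            for term in terms:
                self.merged.setdefault(term, set()).add(doc_id)
        self.segments = []

    def query(self, term):
        term = term.lower()
        hits = set(self.merged.get(term, ()))   # the fast part
        for doc_id, terms in self.segments:     # cost grows linearly with segment count
            hits.discard(doc_id)                # a newer version supersedes older results
            if term in terms:
                hits.add(doc_id)
        return hits


index = SegmentedIndex(max_segments=2)
index.add("a", "king salmon")
index.add("b", "yes my king")
before = index.query("king")
index.add("a", "chinook salmon")   # third segment exceeds the limit and triggers a merge
after = index.query("king")
```

<p>The limit on accumulated segments is the tuning knob: a low value keeps queries fast at the cost of frequent merges, while a high value does the opposite.</p>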
<h2>Preventing Information Leaks</h2>
<p>The Palantir data platform has a fairly sophisticated security model baked in (see <em><a href="http://www.palantirtech.com/government/videos/whitevideos">The White Videos</a></em> for a more in-depth look at the security model).  One of the features that we have implemented is the ability to show a narrower view of an object based on the user&#8217;s permissions: the user only sees the slice of the data that they have been granted access to.  Part of the complexity in implementing this was that we can&#8217;t even hint that the other, hidden data exists at all.</p>
<p>Search engines rank their results by relevance, showing first the matches that they believe to be most relevant to the query.  One common way to make these relevance calculations is by comparing the length of the search term or phrase to the length of the text that it matched.  Consider the search term &#8216;king&#8217;: it will match the following phrases:</p>
<ul>
<li>&#8220;I&#8217;m the king of the world!&#8221;</li>
<li>&#8220;King salmon are often found in the Pacific Northwest and are also known as Chinook salmon.&#8221;</li>
<li>&#8220;Yes, my king.&#8221;</li>
</ul>
<p>Using a length-computed relevance, the phrase, &#8220;Yes, my king.&#8221; is the most relevant.</p>
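<p>A deliberately simplified stand-in for length-based relevance makes the ranking easy to check. The <code>length_relevance</code> function is an assumption for illustration only (the ratio of term length to phrase length), not the scoring formula a real search engine uses.</p>

```python
def length_relevance(term, phrase):
    """Score a match as len(term) / len(phrase); 0.0 when the term does not occur."""
    if not term or term.lower() not in phrase.lower():
        return 0.0
    return len(term) / len(phrase)


phrases = [
    "I'm the king of the world!",
    "King salmon are often found in the Pacific Northwest "
    "and are also known as Chinook salmon.",
    "Yes, my king.",
]

# Shorter matching phrases score higher, so they sort first.
ranked = sorted(phrases, key=lambda p: length_relevance("king", p), reverse=True)
```

<p>Under this metric the shortest matching phrase ends up first in <code>ranked</code>, because the search term makes up the largest share of it.</p>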
<p>Getting back to the Palantir object model: for each distinct set of permissions that an object has, we compute a different object label based on the properties that are visible to that particular slice.  These multiple titles all go into the search engine.  If we were to compute relevance based on the length of the phrase that matched, and the shortest match on the object is shorter than the match that is actually visible to us, we could return the object with a higher-than-obvious relevance.  If we were to do that, we&#8217;d be leaking information, namely that there&#8217;s data on this object that the user making the query is not privy to. (Note that filtering of objects that aren&#8217;t at all visible to the user is done in a higher layer  after the results have been accumulated and ranked by the search engine.)</p>
<p>Given this problem, there are two approaches one can take:</p>
<ol>
<li>Store all the information needed to decide which labels are visible to the user running the query and then use only the visible labels when calculating the relevance of a match. Note that this is a pretty expensive operation.</li>
<li>Don&#8217;t use the length of match to compute relevance. Note that skipping a relevance calculation is, obviously, a very cheap thing to do.</li>
</ol>
<p>Which do we do?  Both.</p>
<p>When matching against object labels, the length metric actually lets us discern between better and worse matches. So in that case, we incur the cost of this calculation in order to return higher quality results.</p>
<p>However, when matching against things like document bodies, the ratio of the size of the match to the size of the search term starts to have less meaning but still has the possibility of leaking information in the query results.  For fields like this, we turn off the relevance calculations based on length of match. The upshot is that we don&#8217;t have to store the permissions information in the index nor incur the cost of the permissions/views calculation for these fields.</p>
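<p>Both approaches can be sketched side by side. This is an illustration under assumptions rather than our actual search code: the <code>Label</code> class, the permission strings, and the three scoring functions are hypothetical, and the length ratio is the same simplified metric used for the &#8216;king&#8217; example.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    text: str
    required_permission: str   # permission needed to see this label


def leaky_score(term, labels):
    """What NOT to do: score against every label, visible or not."""
    matches = [len(term) / len(l.text) for l in labels
               if term and term.lower() in l.text.lower()]
    return max(matches, default=0.0)


def visible_score(term, labels, permissions):
    """Approach 1: only labels this user may see contribute to relevance."""
    visible = [l for l in labels if l.required_permission in permissions]
    return leaky_score(term, visible)


def body_score(term, body):
    """Approach 2: for long fields, ignore length so every match scores the same."""
    return 1.0 if term and term.lower() in body.lower() else 0.0


labels = [
    Label("King", "secret"),
    Label("King County records office", "public"),
]
public_only = visible_score("king", labels, {"public"})
everything = visible_score("king", labels, {"public", "secret"})
```

<p>Here <code>leaky_score</code> would give a public-only user a perfect score for an object whose visible label is a much weaker match, hinting that a hidden label exists; <code>visible_score</code> avoids that at the cost of needing permission data at scoring time, and <code>body_score</code> avoids it by not looking at length at all.</p>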
<h2>A heartfelt thank you</h2>
<p>To be clear, this post highlights the ways in which our search code diverges from the main <a href="http://lucene.apache.org/java/docs/">Lucene</a> code base.  We&#8217;re huge fans of Lucene and have great respect for the developers that built and maintain what is probably the world&#8217;s greatest open-source search engine.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>VizWeek 2009: Awards and Workflow</title>
		<link>http://blog.palantirtech.com/2009/08/24/vast09award/</link>
		<comments>http://blog.palantirtech.com/2009/08/24/vast09award/#comments</comments>
		<pubDate>Mon, 24 Aug 2009 11:00:01 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[user interface]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1157</guid>
		<description><![CDATA[We put up a post last year on the 2008 VAST Grand Challenge. Well, the IEEE VAST Challenge 2009 is over and the awards are in. We had another strong year, scoring two awards: Grand Challenge: Analyst’s Tool Choice (Of 48 submissions, only 3 Grand Challenge awards were given) Intuitive Traffic Visualization and Video Description [...]]]></description>
			<content:encoded><![CDATA[<div style="float: right; margin-left: 15px; margin-bottom: 15px;"><img src="/wp-content/uploads/2009/08/vast2009.jpg" alt="" /></div>
<p>We put up a post last year on the <a href="http://blog.palantirtech.com/2008/07/21/we-bring-data-to-life/">2008 VAST Grand Challenge</a>.  Well, the <a href="http://hcil.cs.umd.edu/localphp/hcil/vast/index.php">IEEE VAST Challenge 2009</a> is over and the awards are in.  We had another strong year, scoring two awards:</p>
<ul>
<li><strong>Grand Challenge: Analyst’s Tool Choice </strong>(Of 48 submissions, only 3 Grand Challenge awards were given)</li>
<li><strong>Intuitive Traffic Visualization and Video Description of the Analysis Process</strong></li>
</ul>
<p>Some background on the event: three years ago, the <a href="http://ieee.org/portal/site">IEEE</a> began an annual conference called <strong>VAST</strong> (<strong>V</strong>isual <strong>A</strong>nalytics in <strong>S</strong>cience and <strong>T</strong>echnology).  The <a href="http://vis.computer.org/VisWeek2009/vast/index.html">VAST symposium</a> focuses on the fundamental research contributions and real-world application of <a href="http://www.infovis-wiki.net/index.php/Visual_Analytics">visual analytics</a>.  As a part of the conference, the <a href="http://hcil.cs.umd.edu/localphp/hcil/vast/index.php">VAST Challenge</a> allows teams to compete on delivering analytic solutions against a synthetic but realistic dataset.</p>
<p>A selection of choice quotes from the judges:</p>
<ul>
<li><em>An award for “highly usable integrated exploration environment”, “efficient analytic exploration platform” or something along these lines would be appropriate.</em></li>
<li><em>Survey Question: How much novelty do you see in this submission (data processing, visualization, interaction, hypothesis generation or evaluation, overall process, etc.)? Answer: More so than novelty was the <span style="text-decoration: underline;">extremely</span> efficient solution approach to this challenge, much more so than other solutions.</em></li>
<li><em>The submission shows two things very clearly: One, it shows the analytical process as being a multi-faceted, simultaneous processing of different information that is quite common among analysts. Two, it shows how multiple perspectives can be displayed on a single monitor, enabling the analyst to visualize what his mind is analyzing. Outstanding!</em></li>
</ul>
<h2>Our submission</h2>
<p>And finally, our submission to the Grand Challenge.  Here we have our overview video, with a link to the full video below:</p>
<div style="text-align: center;">
<p><object id="banner" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="640" height="480" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="quality" value="high" /><param name="bgcolor" value="#000000" /><param name="src" value="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/VAST2009/GC_Intro.flv" /><param name="allowfullscreen" value="true" /><embed id="banner" type="application/x-shockwave-flash" width="640" height="480" src="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/VAST2009/GC_Intro.flv" allowfullscreen="true" bgcolor="#000000" quality="high"></embed></object></p>
</div>
<p>For an in-depth look at the data and techniques used to make this a reality, check out our full submission in <a href="http://www.palantirtech.com/government/analysis-blog/cyber-counter-intelligence"><em>Finding a Mole: Cyber Counter Intelligence</em></a> on the <a href="http://www.palantirtech.com/government/analysis-blog">Palantir Analysis Blog</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/08/24/vast09award/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bandwidth isn’t cheap. Disk isn’t cheap. CPU isn’t cheap.</title>
		<link>http://blog.palantirtech.com/2009/05/22/bandwidth-isnt-cheap-disk-isnt-cheap-cpu-isnt-cheap/</link>
		<comments>http://blog.palantirtech.com/2009/05/22/bandwidth-isnt-cheap-disk-isnt-cheap-cpu-isnt-cheap/#comments</comments>
		<pubDate>Sat, 23 May 2009 01:00:26 +0000</pubDate>
		<dc:creator>Bob McGrew</dc:creator>
				<category><![CDATA[distributed systems]]></category>
		<category><![CDATA[enterprise software]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[problemspace-government]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=961</guid>
		<description><![CDATA[At Palantir, we work in Silicon Valley, read High Scalability, and think of web companies like Facebook and Google as our peers. Most of the time, this is exactly the right recipe for bringing disruptive innovation into the intelligence community. Sometimes, though, it’s misleading – when discussing a design decision, it’s received knowledge that &#8220;Disk [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 15px; margin-bottom: 15px'><img src='/wp-content/uploads/2009/05/ctu-clearance.jpg' alt='fake clearance screen'/></div>
<p>At Palantir, we work in Silicon Valley, read <a href="http://highscalability.com/">High Scalability</a>, and think of web companies like Facebook and Google as our peers. Most of the time, this is exactly the right recipe for bringing disruptive innovation into the intelligence community. Sometimes, though, it’s misleading – when discussing a design decision, it’s received knowledge that &#8220;Disk is cheap&#8221; or &#8220;CPU is cheap.&#8221; For a web company with a deployment in a commercial data center (or its own data center), this received knowledge is correct.  But for a company that ships distributed systems instead of hosting them, and for whom the deployment environment is the kind of locked-down server room in which classified data can reside, these assumptions couldn’t be more false.</p>
<p>At Palantir, we are almost never able to host our customers’ data – typically, as the data is very sensitive, we are not even allowed to see it!  Our customers&#8217; highly sensitive data has to reside in a <a href='http://en.wikipedia.org/wiki/Sensitive_Compartmented_Information_Facility'>Sensitive Compartmented Information Facility</a> or SCIF – a building which has been built to be resistant to attempts to access the information within, whether through active or passive measures.  The network inside a SCIF is physically separated – “airgapped” – from the public Internet to prevent information leakage.  As the entire rationale for such facilities is to prevent information leakage, moving information into or out of one is a tightly regulated process, almost always requiring a human to be in the loop.<br />
<span id="more-961"></span></p>
<h3>Bandwidth is narrow</h3>
<p>Bandwidth in and out of a data center is cheap. Bandwidth in and out of a SCIF is not &#8211; and this manifests in surprising ways. What does it take to get data into a SCIF? First, the data has to be downloaded from wherever it&#8217;s hosted and burned to a CD. Then, someone has to carry it into the SCIF and find a security officer to approve adding it to the network. Finding the security officer can take anywhere from 10 minutes to an entire day. Once you&#8217;ve found the security officer, he has to run a virus scan on the CD, which can run at a rate of roughly 20 minutes per 100MB.</p>
<p>If you look at the entire process, you can model our connection into the SCIF as having an average latency of about 8 hours and a bandwidth of about 640 Kbps. That&#8217;s about the bandwidth of a slow DSL line and the latency of a radio connection to Pluto. (Actually, it’s somewhat slower.) There&#8217;s also a big non-linearity at 700MB, which is the amount of data that fits on a single CD.  For instance, this non-linearity is the big reason why we prefer to send patches to our customers rather than full distributions, which are slightly less than a gigabyte including dependencies – and thus why it’s worth it to us to build a system for automating patch application rather than simply replacing jar files by hand.</p>
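<p>To make the model concrete, here is a minimal sketch of the arithmetic. It assumes the transfer time is simply the fixed latency plus size divided by bandwidth, uses decimal units (1 MB = 8,000 kilobits), and ignores any extra handling per disc; the function names are ours, purely for illustration.</p>

```python
import math

LATENCY_HOURS = 8.0      # average end-to-end delay quoted above
BANDWIDTH_KBPS = 640.0   # effective throughput quoted above
CD_CAPACITY_MB = 700     # amount of data that fits on a single CD


def discs_needed(size_mb):
    """Number of CDs required to carry size_mb megabytes."""
    return max(1, math.ceil(size_mb / CD_CAPACITY_MB))


def transfer_hours(size_mb):
    """Estimated hours to move size_mb into the SCIF (assumed model: latency + size / bandwidth)."""
    kilobits = size_mb * 8 * 1000
    streaming_seconds = kilobits / BANDWIDTH_KBPS
    return LATENCY_HOURS + streaming_seconds / 3600


# A hypothetical 50 MB patch versus a hypothetical 950 MB full distribution.
for size_mb in (50, 950):
    print(size_mb, "MB:", discs_needed(size_mb), "disc(s),",
          round(transfer_hours(size_mb), 2), "hours")
```

<p>Under these assumptions the fixed latency dominates small transfers, while anything over 700MB crosses into a second disc, which is the non-linearity that makes small patches so attractive.</p>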
<h3>Disks are expensive</h3>
<p>Similarly, if you are running a data warehouse, disk is cheap. You can buy a 1 TB, 7200 RPM disk for about $100, which is perfect for the kind of large, serial reads or writes that a data warehousing workload requires. However, Palantir uses disk for our database and our search engine, both of which have an <a href='http://en.wikipedia.org/wiki/OLTP'>OLTP</a>-style usage pattern.  As opposed to a data warehouse access pattern, which emphasizes full table scans, OLTP emphasizes random access and therefore requires fast disk. Getting 1 TB at 15k RPM costs about $1000 and requires a disk array rather than a single disk. In order to keep the disk fast, you also want to leave it only about 20% full, which overall makes fast disk about 50 times more expensive than slow disk per usable terabyte. Most importantly, however, installing a disk array requires trained personnel, a special approval process, and reconfiguring the system to use the new disks, which is a fairly complicated and error-prone process.</p>
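<p>The 50x figure follows from the numbers above, as this small sketch shows. It assumes the slow disk can be filled completely while the fast array is kept at 20% of capacity; the helper name is hypothetical.</p>

```python
def cost_per_usable_tb(price_per_raw_tb, fill_fraction):
    """Dollars per terabyte you can actually fill, given a target fill level."""
    return price_per_raw_tb / fill_fraction


# Assumption: the 7200 RPM disk is used at 100% of capacity.
slow = cost_per_usable_tb(100.0, 1.0)
# The 15k RPM array is kept only about 20% full to stay fast.
fast = cost_per_usable_tb(1000.0, 0.2)

ratio = fast / slow
print(slow, fast, ratio)
```

<p>In other words, a tenfold difference in raw price combines with a fivefold difference in usable capacity.</p>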
<h3>CPUs are hot</h3>
<p>Finally, in a commercial data center, CPU is the cheapest resource of all. In a secure server room, however, it can be quite expensive. Each CPU or additional box requires more power and cooling. If the room is nearly full, adding that extra box may require building out an entirely new server room, which can cost months and hundreds of thousands of dollars just for an office building. Building a server room in a SCIF is much more expensive and prohibitively time-consuming.</p>
<h3>RAM to the rescue</h3>
<p>On the other hand, some things in a SCIF are comparatively cheap. We never use boxes with less than 32GB of memory, and, in fact, lots of sites use 128GB of memory. RAM requires negligible power and cooling, and compared to disk, it&#8217;s relatively simple to install. It&#8217;s also easy to reconfigure the setup to use the additional memory.</p>
<h3>The upshot</h3>
<p>The design guidelines that follow from this are simple: <b>build a system that is as autonomous as possible and scales down as well as it scales out</b>.</p>
<p>All these statistics are compiled from our day-to-day experiences in the office environment of a SCIF. Deploying to soldiers in the field makes the issues involved in deploying to a SCIF seem minor. Of course, that’s what makes what we do fun.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/05/22/bandwidth-isnt-cheap-disk-isnt-cheap-cpu-isnt-cheap/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Palantir Config Server: lining up the ducks</title>
		<link>http://blog.palantirtech.com/2009/03/06/palantir-config-server-lining-up-the-ducks/</link>
		<comments>http://blog.palantirtech.com/2009/03/06/palantir-config-server-lining-up-the-ducks/#comments</comments>
		<pubDate>Fri, 06 Mar 2009 10:00:57 +0000</pubDate>
		<dc:creator>Khan Tasinga</dc:creator>
				<category><![CDATA[distributed systems]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[problemspace-government]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=193</guid>
		<description><![CDATA[At Palantir, we build distributed software. When deployed at a customer site, our platform consists of several servers running on, and distributed across, a cluster of machines. When I first joined the company, deploying and managing our platform was tedious and time consuming. Need to install servers? One by one, login to the machines where [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 15px'><img src="/wp-content/uploads/2009/03/installpalantirservers.png" alt="" width="261" height="394" /></div>
<p>At Palantir, we build distributed software. When deployed at a customer site, our platform consists of several servers running on, and distributed across, a cluster of machines. When I first joined the company, deploying and managing our platform was tedious and time-consuming. Need to install servers? One by one, log in to the machines where they need to go, lay down their requisite files and manually configure them such that they can work together. Have to bring down a deployment for scheduled maintenance? One by one, and in the correct order, log in to the machines where the servers reside and shut them down. Want to change the private keys and certificates used to secure communication between servers? Well, you get the point.</p>
<p>From a customer perspective, the complexity associated with the administration of distributed software represents a significant challenge. Not providing tools to help reduce that complexity impacted the overall usability of our platform. Furthermore, from a Palantir perspective, a non-trivial portion of our resources was being devoted to deploying and managing instances of our platform, both externally (by Forward Deployed Engineers working directly with our customers) and internally (by development, QA and support staff working to maintain and improve our product). Could we be more efficient? No doubt. Given our intense focus on customer satisfaction and the desire to grow / scale our business, action was necessary.</p>
<p>To see how we solved this problem, read on.<br />
<span id="more-193"></span></p>
<p>We stepped back a bit, taking time to reflect on our situation and understand the problem. Based upon our experience, what key areas would a solution need to address? We settled on the following:</p>
<ol>
<li><strong>Lifecycle management.</strong>
<ol>
<li>Ease initial deployment and upgrade.</li>
<li>Handle coordinated starting, stopping and restarting.</li>
</ol>
</li>
<li><strong>Configuration management.</strong>
<ol>
<li>Track which servers are installed on what machines.</li>
<li>Provide centralized management of server configuration information.</li>
</ol>
</li>
<li><strong>Automation.</strong>
<ol>
<li>Support encoding common management tasks based on best practices.</li>
</ol>
</li>
</ol>
<p>In addition to those three key areas, we also identified several important requirements. A couple that definitely warrant mention:</p>
<ol>
<li><strong>Security.</strong></li>
<li><strong>Extensibility.</strong></li>
</ol>
<p>After getting a good sense of what needed to be accomplished, we put effort into investigating if an existing solution would fit the bill. For a variety of reasons (e.g., available feature set and licensing constraints), we never found a good match. We did, however, come across several open source building blocks that could, when composed appropriately, combine to form the foundation of a homegrown solution. The Config Server was born.</p>
<h2>Architecture</h2>
<div class="postimg"><img src="/wp-content/uploads/2009/03/configserverarchitecture.png" alt="" width="650" /></div>
<p>The Config Server works with remote agents to enable centralized deployment management. The diagram presented above provides an overview of our management infrastructure. Below is a brief discussion of each key component of our architecture.</p>
<ul>
<li><strong>Agent</strong> &#8211; Agents are installed on every machine in a deployment. They are lightweight background processes that sit around waiting to execute commands submitted by the Config Server, interacting directly with the services installed on a given machine. Instead of implementing our own agent solution, we decided to leverage existing technology, the open source peer-to-peer <a rel="nofollow" href="http://staf.sourceforge.net/">Software Testing Automation Framework (STAF)</a>. From its homepage:<br />
<blockquote><p>The Software Testing Automation Framework (STAF) is an open source, multi-platform, multi-language framework designed around the idea of reusable components, called services (such as process invocation, resource management, logging, and monitoring). STAF removes the tedium of building an automation infrastructure, thus enabling you to focus on building your automation solution. The STAF framework provides the foundation upon which to build higher level solutions, and provides a pluggable approach supported across a large variety of platforms and languages.</p></blockquote>
<p>We added support for two-way SSL to STAF to enhance the security of our management infrastructure (specifically, to allow us to implement authorization based on self-signed certificates). But beyond that, no modification was necessary. STAF provides us with a robust solution for remote process invocation and file management, both absolutely essential for centralized deployment management.</li>
<li><strong>Agent Manager</strong> &#8211; The Agent Manager provides lifecycle and configuration management functionality for the agents in a deployment. It interacts with remote machines through SSH, using the open source <a rel="nofollow" href="http://www.trilead.com/Products/Trilead_SSH_for_Java/">Trilead SSH for Java</a> library.</li>
<li><strong>Config Registry</strong> &#8211; The Config Registry maintains and provides access to all of the information the Config Server has about a deployment. It consists of the following:
<ul>
<li><strong>Agent Registry</strong> &#8211; The Agent Registry contains information about all of the agents in a deployment.</li>
<li><strong>Service Registry</strong> &#8211; The Service Registry keeps track of all of the services in a deployment.</li>
<li><strong>Config Repository</strong> &#8211; The Config Repository is a central store for configurations of the agents and services in a deployment.</li>
<li><strong>Package Repository</strong> &#8211; The Package Repository holds all of the service packages that can be installed in a deployment.</li>
<li><strong>Plugin Repository</strong> &#8211; The Plugin Repository houses all of the plugins that are available for use in the Config Server. Plugins are used by the Security Manager, Service Manager and Task Manager.</li>
</ul>
</li>
<li><strong>Security Manager</strong> &#8211; We secure our servers and management infrastructure using public key cryptography. The Security Manager handles the generation and packaging of private keys and certificates. We perform private key and certificate generation using the <a rel="nofollow" href="http://www.bouncycastle.org/java.html">Bouncy Castle Crypto APIs for Java</a>. Packaging is taken care of by plugins in the Plugin Repository. For example, one plugin packages private keys and certificates into JKS files for use with Java, while another packages them into PEM files for use with OpenSSL.</li>
<li><strong>Service</strong> &#8211; Services represent the software installed on the machines in a deployment that drives our platform. They correspond to the servers we&#8217;ve built and the 3rd party offerings on which they depend (e.g., databases and entity extractors).</li>
<li><strong>Service Manager</strong> &#8211; The Service Manager interacts with agents to provide lifecycle and configuration management functionality for the services in a deployment. The actual mechanics of lifecycle and configuration management vary from service to service. For example, starting service A might require invoking one script, while starting service B might require invoking another. For each type of service in a deployment, the Plugin Repository contains a corresponding plugin that embeds the necessary management logic. The Service Manager works with those plugins to get its job done.</li>
<li><strong>Task Manager</strong> &#8211; Managing a deployment requires performing tasks that go beyond lifecycle and configuration management for its constituent agents and services (e.g., log aggregation and database user creation). Such tasks are implemented as plugins. They make things happen by communicating with agents and / or directly with machines via SSH. The Task Manager interacts with the Plugin Manager to load tasks and coordinate their execution.</li>
</ul>
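<p>As a rough illustration of how a Service Manager might delegate to per-service plugins, consider the sketch below. It is written in Python for brevity and is not the Config Server&#8217;s actual Java API: the class names, script paths and the <code>send_to_agent</code> callback are all hypothetical, and stopping services in the reverse of their start order is our assumption about what &#8220;the correct order&#8221; means.</p>

```python
class ServicePlugin:
    """Hypothetical plugin: embeds the management logic for one type of service."""

    def __init__(self, start_command, stop_command):
        self.start_command = start_command
        self.stop_command = stop_command


class ServiceManager:
    """Hypothetical manager: looks up the plugin for each service and talks to agents."""

    def __init__(self, plugins, send_to_agent):
        self.plugins = plugins              # maps service type to its ServicePlugin
        self.send_to_agent = send_to_agent  # callable(host, command); stands in for the agent

    def start_all(self, services):
        """Start services in the order given; each entry is (host, service_type)."""
        for host, service_type in services:
            self.send_to_agent(host, self.plugins[service_type].start_command)

    def stop_all(self, services):
        """Stop services in reverse start order (an assumption, see above)."""
        for host, service_type in reversed(services):
            self.send_to_agent(host, self.plugins[service_type].stop_command)


# Record commands instead of sending them, so the sketch runs anywhere.
sent = []
manager = ServiceManager(
    plugins={
        "database": ServicePlugin("db/bin/start.sh", "db/bin/stop.sh"),
        "search": ServicePlugin("search/bin/start.sh", "search/bin/stop.sh"),
    },
    send_to_agent=lambda host, command: sent.append((host, command)),
)
deployment = [("host-a", "database"), ("host-b", "search")]
manager.start_all(deployment)
manager.stop_all(deployment)
print(sent)
```

<p>The point of the design is that the manager never needs to know how a particular service is started or stopped; supporting a new service type only means adding another plugin.</p>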
<h2>Functionality</h2>
<p>How did we do with respect to our stated needs?</p>
<ul>
<li><strong>Lifecycle management</strong> &#8211; The Agent Manager and Service Manager provide centralized lifecycle management. Initial deployment and upgrades, as well as starting, stopping and restarting servers, can all be handled directly through the Config Server.</li>
<li><strong>Configuration management</strong> &#8211; The Config Repository of the Config Server maintains information about deployments and provides centralized configuration management. The Agent Manager and Service Manager support the remote retrieval and application of agent and service configuration.</li>
<li><strong>Automation</strong> &#8211; The Config Server&#8217;s functionality is exposed via a clean and consistent Java API. Common management tasks can be automated by writing code against that API.</li>
</ul>
<p>And what about some of our more important requirements?</p>
<ul>
<li><strong>Security</strong> &#8211; All communication in our management infrastructure is secured using two-way SSL. A simple authorization mechanism, implemented using self-signed certificates, ensures that only authorized entities (most notably, the Config Server) can execute commands through agents. Client access to the data maintained, and functionality exposed, by the Config Server requires password-based authentication.</li>
<li><strong>Extensibility</strong> &#8211; The Config Server can be extended to support new types of services and perform new tasks by implementing plugins and dropping them in the Plugin Repository.</li>
</ul>
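<p>For readers unfamiliar with two-way SSL, the sketch below shows the general idea using Python&#8217;s standard <code>ssl</code> module. It is only an illustration of the concept, not how STAF or the Config Server implements it; the function name and file parameters are hypothetical.</p>

```python
import ssl


def build_agent_tls_context(cert_file=None, key_file=None, trusted_certs_file=None):
    """Server-side TLS context for a hypothetical agent that demands a client certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Two-way SSL: the handshake fails unless the connecting peer also
    # presents a certificate that the agent has been told to trust.
    context.verify_mode = ssl.CERT_REQUIRED
    if cert_file is not None:
        # The agent's own certificate and private key.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if trusted_certs_file is not None:
        # Certificates of the peers allowed to connect, for example a
        # self-signed certificate belonging to the management server.
        context.load_verify_locations(cafile=trusted_certs_file)
    return context


# With no files supplied, this only demonstrates the verification setting.
context = build_agent_tls_context()
print(context.verify_mode)
```

<p>Trusting only a short, explicit list of certificates is what lets certificate verification double as a simple authorization check.</p>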
<h2>Future</h2>
<p>In the space of a few months, we built the Config Server to address several key needs and requirements related to the management of our platform. Our work has already begun to pay dividends. Looking ahead, there are several things we would like to do:</p>
<ul>
<li>Add support for low-level system management and configuration related to our platform (e.g., user and group management and firewall configuration).</li>
<li>Implement multi-deployment management with support for features like staging, mirroring and migration.</li>
<li>Pursue <a rel="nofollow" href="http://en.wikipedia.org/wiki/Autonomic_computing">autonomic computing</a> by integrating with our monitoring solution to implement platform self-management.</li>
</ul>
<p>While we&#8217;ve accomplished a fair amount, plenty of work remains. We look forward to enhancing our Config Server and its associated infrastructure as we strive to make our platform one that is not only powerful and a pleasure to use, but also easy to manage and maintain.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/03/06/palantir-config-server-lining-up-the-ducks/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

