<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Palantir Technologies &#187; softwarephilosophy</title>
	<atom:link href="http:///category/softwarephilosophy/feed/" rel="self" type="application/rss+xml" />
	<link></link>
	<description>Articles from the Engineering Group at Palantir Technologies</description>
	<lastBuildDate>Wed, 14 Dec 2011 17:48:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>How to Rock a Systems Design Interview</title>
		<link>http://blog.palantirtech.com/2011/10/28/how-to-rock-a-systems-design-interview/</link>
		<comments>http://blog.palantirtech.com/2011/10/28/how-to-rock-a-systems-design-interview/#comments</comments>
		<pubDate>Fri, 28 Oct 2011 15:00:41 +0000</pubDate>
		<dc:creator>John Carrino</dc:creator>
				<category><![CDATA[development process]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[interviewing]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[softwarephilosophy]]></category>
		<category><![CDATA[tips and tricks]]></category>

		<guid isPermaLink="false">https://wp-admin-techblog.yojoe.local/?p=1937</guid>
		<description><![CDATA[Comic courtesy of XKCD, via Creative Commons License Note: this is the third installment in our series on doing your best in interviews. Previously: &#8220;How to Rock an Algorithms Interview&#8221; and &#8220;The Coding Interview&#8221;. One interview that candidates often struggle with is the systems design interview. Even if you know your algorithms and write clean code, that [...]]]></description>
			<content:encoded><![CDATA[<div style='text-align: center'><a href='https://www.xkcd.com/754/'><img style='width: 100%' src='/wp-content/uploads/2011/10/dependencies.png' alt='Compiler design dependency comic, originally from http://www.xkcd.com/754/' title='Comic originally from http://www.xkcd.com/754/' /></a>
<div style='text-align: right; font-size: 0.6em; margin-bottom: 1em;'>Comic courtesy of <a href='http://www.xkcd.com/754/'>XKCD</a>, via Creative Commons License</div>
</div>
<p>
<span style='font-size: 0.7em'><em>Note: this is the third installment in our series on doing your best in interviews.  Previously: <a href="/2011/09/26/how-to-rock-an-algorithms-interview/" title="How to Rock an Algorithms Interview" target="_blank">&#8220;How to Rock an Algorithms Interview&#8221;</a> and <a href="/2011/10/03/the-coding-interview/" title="The Coding Interview" target="_blank">&#8220;The Coding Interview&#8221;</a>.</em></span>
</p>
<p>One interview that candidates often struggle with is the systems design interview. Even if you know your algorithms and write clean code, that code needs to run on a computer somewhere &mdash; and then things quickly get complicated. A truly unbelievable amount of complexity lies beneath something as simple as <a href="https://plus.google.com/112218872649456413744/posts/dfydM2Cnepe">visiting Google in your browser</a>. While most of that complexity is abstracted away from the end user, as a system designer you have to face it head on, and the more you can handle, the better.</p>
<p>At Palantir, many of our teams give a systems design interview along with an <a href="http://blog.palantir.com/2011/09/26/how-to-rock-an-algorithms-interview/">algorithms interview</a> and a couple of <a href="http://blog.palantir.com/2011/10/03/the-coding-interview/">coding interviews</a>. We don’t expect anyone to be an expert at all three disciplines (although some are). We’re looking for generalists with depth &mdash; people who are good at most things, and great at some. If systems design isn&#8217;t your strength, that’s okay, but you should at least be able to talk and reason competently about a complex system.</p>
<p>Read on to learn about what we&#8217;re looking for and how you can prepare.</p>
<p><span id="more-1937"></span></p>
<h2>We’re measuring three things</h2>
<p>Nominally, this interview appears to require knowledge of <strong>systems</strong> and a knack for <strong>design</strong> &mdash; and it does. What makes it interesting, though, and sets it apart from a coding or an algorithms interview, is that whatever solution you come up with during the interview is just a side effect. What we actually care about is the process. </p>
<p>In other words, the systems design interview is all about <strong>communication</strong>. </p>
<p>This reflects what actually working at Palantir is like. As engineers we have a tremendous amount of freedom. We aren’t asked to implement fully-specced features. Instead we take ownership of <em>open-ended problems</em>, and it’s our job to come up with the best solution to each. We need people we can trust to do the right thing without a lot of supervision &mdash; people who can own large projects and take them consistently in the right direction. Invariably, this means being able to communicate effectively with the people around you. Working on <a href="http://blog.palantir.com/2009/11/06/palantir-like-an-operating-system-for-data-analysis/">problems with huge scope</a> isn&#8217;t something you can do in a vacuum.</p>
<h2>It&#8217;s an open-ended conversation</h2>
<p>Usually we’ll start by asking you to design a system that performs a given task. The prompt will be simple, but don’t be fooled &mdash; these problems are wide and bottomless, and the point of the interview is to see how much volume you can cover in 45 minutes.</p>
<p>For the most part, you’ll be steering the conversation. It’s up to you to understand the problem. That might mean asking questions, sketching diagrams on the board, and bouncing ideas off your interviewer. Do you know the constraints? What kind of inputs does your system need to handle? You have to get a sense for the scope of the problem before you start exploring the space of possible solutions. And remember, there is no single right answer to a real-world problem. Everything is a tradeoff.</p>
<h2>Topics</h2>
<p>Systems are complex, and when you’re designing a system you’re grappling with its full complexity. Given this, there are many topics you should be familiar with, such as:</p>
<ul>
<li><b>Concurrency.</b> Do you understand threads, deadlock, and starvation? Do you know how to parallelize algorithms? Do you understand consistency and coherence?</li>
<li><b>Networking.</b> Do you roughly understand <a href='https://secure.wikimedia.org/wikipedia/en/wiki/Inter-process_communication'>IPC</a> and <a href='https://secure.wikimedia.org/wikipedia/en/wiki/Internet_Protocol_Suite'>TCP/IP</a>? Do you know the difference between throughput and latency, and when each is the relevant factor?</li>
<li><b>Abstraction.</b> You should understand the systems you’re building upon. Do you know roughly how an OS, file system, and database work? Do you know about the various levels of caching in a modern OS?</li>
<li><b>Real-World Performance.</b> You should be familiar with the <a href="http://everythingisdata.wordpress.com/2009/10/17/numbers-everyone-should-know/">speed of everything</a> your computer can do, including the relative performance of RAM, disk, SSD and your network.</li>
<li><b>Estimation.</b> Estimation, especially in the form of a back-of-the-envelope calculation, is important because it helps you narrow down the list of possible solutions to only the ones that are feasible. Then you have only a few prototypes or micro-benchmarks to write.</li>
<li><b>Availability and Reliability.</b> Are you thinking about how things can fail, especially in a <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Fallacies_of_Distributed_Computing">distributed environment</a>? Do you know how to design a system to cope with network failures? Do you understand durability?</li>
</ul>
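<p>As a rough illustration of the estimation topic above, here is a minimal back-of-the-envelope sketch in Python. The latency constants are assumed, order-of-magnitude figures rather than measurements, and the helper name <code>scan_seconds</code> and the 1 GB data set are hypothetical; substitute numbers for your own hardware.</p>

```python
# Assumed, order-of-magnitude latencies (illustrative only; measure your own).
NS = 1e-9
MEMORY_READ_1MB_S = 250_000 * NS    # ~250 microseconds per MB read from RAM
DISK_SEEK_S = 10_000_000 * NS       # ~10 ms per seek on a spinning disk
DISK_READ_1MB_S = 20_000_000 * NS   # ~20 ms per MB read sequentially from disk


def scan_seconds(size_mb: float, per_mb_s: float, seeks: int = 0) -> float:
    """Estimate the time to read size_mb megabytes, plus any disk seeks."""
    return size_mb * per_mb_s + seeks * DISK_SEEK_S


if __name__ == "__main__":
    size_mb = 1024  # hypothetical 1 GB data set
    ram = scan_seconds(size_mb, MEMORY_READ_1MB_S)
    disk_sequential = scan_seconds(size_mb, DISK_READ_1MB_S, seeks=1)
    # Pessimistic case: the data is scattered, so each megabyte costs a seek.
    disk_scattered = scan_seconds(size_mb, DISK_READ_1MB_S, seeks=size_mb)
    print(f"RAM scan:        {ram:.2f} s")
    print(f"Disk sequential: {disk_sequential:.2f} s")
    print(f"Disk scattered:  {disk_scattered:.2f} s")
```

<p>Even with crude inputs, the gap of roughly two orders of magnitude between RAM and disk is enough to rule designs in or out before any prototype is written.</p>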
<p>Remember, we&#8217;re not looking for mastery of all these topics. We&#8217;re looking for <em>familiarity</em>. We just want to make sure you have a good lay of the land, so you know which questions to ask and when to consult an expert.</p>
<h2>How to prepare</h2>
<p>How do you get better at something? If your answer isn’t along the lines of &#8220;practice&#8221; or &#8220;hard work,&#8221; then I have a bridge to sell you. Just like you have to write a lot of code to get better at coding and do a lot of drills to get really good at basketball, you’ll need practice to get better at design. Here are some activities that can help:</p>
<ul>
<li><strong>Do mock design sessions.</strong> Grab an empty room and a fellow engineer, and ask her to give you a design problem, preferably related to something she&#8217;s worked on. Don&#8217;t think of it as an interview &mdash; just try to come up with the best solution you can. Design interviews are similar to actual design sessions, so getting better at one will make you better at the other.</li>
<li><strong>Work on an actual system</strong>. Contribute to OSS or build something with a friend. Treat your class projects as more than just academic exercises &mdash; actually focus on the architecture and the tradeoffs behind each decision. As with most things, the best way to learn is by doing.</li>
<li><strong>Do back-of-the-envelope calculations for something you&#8217;re building and then write micro-benchmarks to verify them.</strong> If your micro-benchmarks don&#8217;t match your back-of-the-envelope numbers, some part of your mental model will have to give, and you&#8217;ll learn something in the process.</li>
<li><strong>Dig into the performance characteristics of an open source system.</strong>  For example, take a look at <a href="https://code.google.com/p/leveldb/">LevelDB</a>.  It&#8217;s new and clean and small and well-documented. Read about the <a href="http://leveldb.googlecode.com/svn/trunk/doc/impl.html">implementation</a> to understand how it stores its data on disk and how it compacts the data into levels. Ask yourself questions about tradeoffs: which kinds of data and sizes are optimal, and which degrade read/write performance? <em>(Hint: think about random vs. sequential writes.)</em></li>
<li><strong>Learn how databases and operating systems work</strong> under the hood. These technologies are not only tools in your belt, but also a great source of design inspiration. If you can  think like a DB or an OS and understand how each solves the problems it was designed to solve, you&#8217;ll be able to apply that mindset to other systems.</li>
</ul>
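<p>To make the micro-benchmark suggestion above concrete, here is a small Python sketch that times the same set of block writes in sequential and in shuffled order. It is an illustration under assumptions, not a rigorous benchmark: the function names, block size, and write count are hypothetical, and the OS page cache or an SSD may hide most of the difference you would see on a spinning disk.</p>

```python
import os
import random
import tempfile
import time


def time_writes(offsets, block):
    """Write block at each offset of a temporary file; return elapsed seconds."""
    with tempfile.NamedTemporaryFile() as f:
        start = time.perf_counter()
        for offset in offsets:
            f.seek(offset)
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # include the cost of pushing data to the device
        return time.perf_counter() - start


def compare(block_size=4096, count=2048, seed=0):
    """Time identical block writes in sequential order and in shuffled order."""
    block = b"x" * block_size
    sequential = [i * block_size for i in range(count)]
    shuffled = sequential[:]
    random.Random(seed).shuffle(shuffled)
    return time_writes(sequential, block), time_writes(shuffled, block)


if __name__ == "__main__":
    seq_s, rand_s = compare()
    print(f"sequential: {seq_s * 1e3:.1f} ms, random: {rand_s * 1e3:.1f} ms")
```

<p>If the two timings come out nearly identical, that is itself a finding: work out which layer (Python buffering, the page cache, the device) absorbed the difference, and adjust your mental model accordingly.</p>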
<h2>Final thought: relax and be creative</h2>
<p>The systems design interview can be difficult, but it&#8217;s also a place to be creative and to take joy in the imagining of systems unbuilt. If you listen carefully, make sure you fully understand the problem, and then take a clear, straightforward approach to communicating your ideas, you should do fine.</p>
<p>Good luck!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2011/10/28/how-to-rock-a-systems-design-interview/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A rigorous friction model for human-computer symbiosis</title>
		<link>http://blog.palantirtech.com/2010/06/02/a-rigorous-friction-model-for-human-computer-symbiosis/</link>
		<comments>http://blog.palantirtech.com/2010/06/02/a-rigorous-friction-model-for-human-computer-symbiosis/#comments</comments>
		<pubDate>Thu, 03 Jun 2010 03:18:52 +0000</pubDate>
		<dc:creator>Asher Sinensky</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[javatech]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[problemspace - finance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[softwarephilosophy]]></category>
		<category><![CDATA[user interface]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1344</guid>
		<description><![CDATA[This is a response to Ari&#8217;s awesome post on human-computer symbiosis. Ari and I were chatting about the equation he developed and I was wondering if there were some further refinements that are possible&#8230; let&#8217;s take a look: We are attempting to understand the total analytic capability for a given task a of a human-computer [...]]]></description>
			<content:encoded><![CDATA[<div style='text-align: center; float: right; margin-left: 15px; margin-right: 15px'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/graph.png" alt="" width="300"/>
</div>
<p>This is a response to <a href="http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/">Ari&#8217;s awesome post on human-computer symbiosis</a>. Ari and I were chatting about the equation he developed and I was wondering if there were some further refinements that are possible&#8230; let&#8217;s take a look:</p>
<p>We are attempting to understand the total analytic capability for a given task <strong><em>a</em></strong> of a human-computer team. Analytic capability in this case probably means:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq1.png" alt="eq1"/>(1)
</div>
<p>Where <strong><em>A</em></strong> is the answer to the analytic problem in question and <strong><em>t<sub>A</sub></em></strong> is the time needed to arrive at the answer based on the inputs available. In the case of chess, <strong><em>A</em></strong> could be the optimum next move given all previous information and <strong><em>t<sub>A</sub></em></strong> would be how long it takes to decide on this move.</p>
<p>Read on for a look at how this generalizes in human-computer symbiotic systems.<br />
<span id="more-1344"></span></p>
<p>In the case of the human-computer team, we know that <strong><em>a </em></strong>is going to be a function of both the human&#8217;s analytical capability <strong><em>h</em></strong> and the computer&#8217;s analytical capability <strong><em>c</em></strong> (where both <strong><em>h</em></strong> and <strong><em>c</em></strong> have units of answers/time). In the limit case we know that:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq2.png" alt="eq2"/>(2)
</div>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq3.png" alt="eq3"/>(3)
</div>
<p>Or in plain English, if there is no human present, the total analytic capability is simply the analytic capability of the computer, and vice versa. So the naïve solution would be that:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq4.png" alt="eq4"/>(4)
</div>
<p>(4) clearly meets the limiting cases described in (2) and (3). Kasparov noticed a mixing function where the ability of the human and computer to work together becomes the dominant term &mdash; we might call this the mixing capability for the given task or <strong><em>m</em></strong>. Including this phenomenon, the total analytic capability (4) would be re-defined as:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq5.png" alt="eq5"/>(5)
</div>
<p>where <strong><em>m</em></strong> has the property that:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq6.png" alt="eq6"/>(6)
</div>
<p>Thus maintaining the limits expressed in (2) and (3) and adhering to the observation that if there is no human or computer component then there will be no mixing advantage. A naïve solution to this constraint would be simple linear mixing:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq7.png" alt="eq7"/>  (7)
</div>
<p>where <strong><em>M</em></strong> (units of time per answer) is the mixing efficiency and will be primarily based on the type of task being solved &mdash; some analytical tasks lend themselves to a combined process more than others (for example, multiplying 20 digit numbers does not really benefit from the intuition of a human so the ability of a human and computer to perform this task is merely their additive ability). </p>
<p>What Kasparov noticed is that the mixing was primarily based on the quality of the process rather than the analytical power of either the human or computer separately. This seems to imply that we must somehow account for the fact that the quality of the human-computer interface is responsible for the quality of the mixing. This can be modeled as a unitless friction of interaction <strong><em>f<sub>i</sub></em></strong> that impedes the ability of the human and computer to work together. </p>
<p>Equation (7) can thus be re-written as:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq8.png" alt="eq8"/>(8)
</div>
<p>In this case, the maximum value for the mixing capability is realized when the friction of interaction goes to zero. This mixing capability is the same as the equation Ari developed (less the coefficient which is necessary to maintain consistent units throughout).</p>
<p>We can now re-write our analytic capability in (5) as:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq9.png" alt="eq9"/>(9)
</div>
<p>Below, see a plot of this function over a range of values for <strong><em>h</em></strong>, <strong><em>c</em></strong> and <strong><em>f<sub>i</sub></em></strong>:</p>
<div style='text-align: center; margin: auto; margin-bottom: 1em;'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/graph.png" alt=""/>
</div>
<p>As can clearly be seen from this functional plot (note the vertical scale), the effect of interface friction dominates over the other terms whenever both the human and computer can make important contributions to the task at hand. The conclusion can be drawn that the most effective way to solve analytical problems is to minimize the friction of the human-computer interface; or to put it another way: optimal analytical systems are those that are built specifically to maximize the ability of the human to leverage the ability of the computer.</p>
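<p>For readers who want to experiment with the model numerically, here is a minimal Python sketch. The equations in this post are published as images, so the exact functional form used below (the additive terms plus a linear mixing term <em>M</em>&middot;<em>h</em>&middot;<em>c</em> divided by the friction <em>f<sub>i</sub></em>) is an assumption inferred from the surrounding text, and the function names and sample values are hypothetical.</p>

```python
def mixing(h, c, M, f_i):
    """Assumed mixing term: linear mixing M*h*c, reduced by friction f_i."""
    return M * h * c / f_i


def analytic_capability(h, c, M, f_i):
    """Assumed total capability: human plus computer plus the mixing term."""
    return h + c + mixing(h, c, M, f_i)


if __name__ == "__main__":
    # Hypothetical values: h and c in answers per unit time, M in time per answer.
    h, c, M = 2.0, 5.0, 1.0
    for f_i in (1.0, 0.5, 0.1):
        print(f"f_i={f_i}: a={analytic_capability(h, c, M, f_i):.1f}")
```

<p>Under this assumed form, setting either <em>h</em> or <em>c</em> to zero removes the mixing term and leaves only the other party's capability, and lowering <em>f<sub>i</sub></em> raises the total faster than comparable changes to <em>h</em> or <em>c</em> once both are non-trivial.</p>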
<p>I am certain there is still the possibility for further refinement, for example:</p>
<div style='text-align: center;margin-bottom: 1em'>
<img style='vertical-align: middle' src="/wp-content/uploads/2010/06/eq10a.png" alt="eq10a"/>(10)
</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2010/06/02/a-rigorous-friction-model-for-human-computer-symbiosis/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Friction in Human-Computer Symbiosis: Kasparov on Chess</title>
		<link>http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/</link>
		<comments>http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/#comments</comments>
		<pubDate>Mon, 08 Mar 2010 19:32:06 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[problemspace - finance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[softwarephilosophy]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1302</guid>
		<description><![CDATA[As we build our platforms and applications following a human-computer symbiosis approach, we keep an ear to the ground for interesting examples that illuminate new techniques or validate our approach in some empirical way. One of the areas that we&#8217;re interested in is the overall friction of analysis systems. The systems that we build are [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 15px; margin-bottom: 15px;'>
<img src='/wp-content/uploads/2010/03/fools-mate.gif'/>
</div>
<p>As we build our <a href="http://blog.palantirtech.com/2009/11/06/palantir-like-an-operating-system-for-data-analysis/">platforms</a> and <a href="http://blog.palantirtech.com/2009/09/29/the-palantir-technologies-demo-reel-screenshots-round-3/">applications</a> following a <a href="http://en.wikipedia.org/wiki/Intelligence_amplification">human-computer symbiosis</a> approach, we keep an ear to the ground for interesting examples that illuminate new techniques or validate our approach in some empirical way.</p>
<p>One of the areas that we&#8217;re interested in is the overall friction of analysis systems.  The systems that we build are built on commodity hardware &mdash; we&#8217;re not building faster computers and yet we can deliver orders-of-magnitude better performance on analysis tasks than existing solutions.  How do we do this?  By building software in such a way that it reduces the friction experienced at the boundaries between the computing power, the analyst,  and the source data.</p>
<h2>Chess as analysis laboratory</h2>
<p>Chess is, at its heart, a predictive venture.  The player attempts to anticipate their opponent&#8217;s moves, planning their own moves accordingly, with the straightforward goal of finding a sequence of piece moves that force checkmate. </p>
<p>This game is, in its ideal form, analysis. (The moves made are the logical extension of the analysis.)  The data are clean, the problem is well-defined and everyone plays by the same rules.  There are even <a href="http://en.wikipedia.org/wiki/Elo_rating_system">well-defined metrics for ranking chess players by skill</a> &mdash; a better chess player is a better chess-game analyst.  </p>
<p>In the realm of evaluation of analysis systems, this is about as good as it gets in terms of designing controlled experiments to study the relative strengths of different analysis systems.</p>
<p><a href="http://en.wikipedia.org/wiki/Garry_Kasparov">Garry Kasparov</a>, widely considered to be the greatest chess player of all time,  recently wrote <a href="http://www.nybooks.com/articles/23592">a review of Diego Rasskin Gutman&#8217;s book</a>, <a href="http://www.amazon.com/Chess-Metaphors-Artificial-Intelligence-Human/dp/026218267X"><u>Chess Metaphors: Artificial Intelligence and the Human Mind</u>.</a></p>
<p>The review is excellent and covers a lot of ground.  However, one particular anecdote stood out as a very interesting example of human-computer symbiosis (emphasis added):</p>
<blockquote><p>In 2005, the online chess-playing site Playchess.com hosted what it called a &#8220;freestyle&#8221; chess tournament in which anyone could compete in teams with other players or computers. Normally, &#8220;anti-cheating&#8221; algorithms are employed by online sites to prevent, or at least discourage, players from cheating with computer assistance. (I wonder if these detection algorithms, which employ diagnostic analysis of moves and calculate probabilities, are any less &#8220;intelligent&#8221; than the playing programs they detect.)</p>
<p>Lured by the substantial prize money, several groups of strong grandmasters working with several computers at the same time entered the competition. At first, the results seemed predictable. The teams of human plus machine dominated even the strongest computers. The chess machine Hydra, which is a chess-specific supercomputer like Deep Blue, was no match for a strong human player using a relatively weak laptop. Human strategic guidance combined with the tactical acuity of a computer was overwhelming.</p>
<p>The surprise came at the conclusion of the event. <em>The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time.</em> Their skill at manipulating and &#8220;coaching&#8221; their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. <em>Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.</em></p></blockquote>
<p>After the jump, we look at this finding in a more generalized way and map it onto the Palantir approach.<br />
<span id="more-1302"></span></p>
<h2>The cyborg Grandmaster: a fearsome opponent</h2>
<p>The tournament Kasparov recalls was a showcase of chess talent, human-computer symbiosis, and raw computing power.  Among those entered  in the tournament were a purpose-made chess machine (similar to <a href="http://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)">Deep Blue</a>) named <a href="http://en.wikipedia.org/wiki/Hydra_(chess)">Hydra</a> and a team of <a href="http://en.wikipedia.org/wiki/Grandmaster_(chess)">Grandmasters</a> assisted by computer programs.</p>
<p>One losing participant had this to say about the computer-aided Grandmasters:</p>
<blockquote><p>
Secondly, I have learned that a <a href="http://en.wikipedia.org/wiki/Grandmaster_(chess)">Grandmaster</a> armed with a chess engine is a killer combination against a plain Engine. Engines see everything via brute force, Grandmasters use their intuition and are able to see &#8220;obvious&#8221; moves at once. So the two of them together are a mighty force.
</p></blockquote>
<p>This is just as Licklider predicted 50 years ago &#8212; quoting <a href="http://blog.palantirtech.com/man-computer-symbiosis/">Man-Computer Symbiosis</a> (if I could put it better, I would):</p>
<blockquote><p>
Men will set the goals and supply the motivations, of course, at least in the early years. They will formulate hypotheses. They will ask questions&#8230; In general, they will make approximate and fallible, but leading, contributions, and they will define criteria and serve as evaluators, judging the contributions of the equipment and guiding the general line of thought.</p>
<p>&#8230;</p>
<p>In addition, the computer will serve as a statistical-inference, decision-theory, or game-theory machine to make elementary evaluations of suggested courses of action whenever there is enough basis to support a formal statistical analysis. Finally, it will do as much diagnosis, pattern-matching, and relevance-recognizing as it profitably can, but it will accept a clearly secondary status in those areas.
</p></blockquote>
<p>So in classic intelligence amplification fashion, having computer programs that can quickly evaluate a move&#8217;s likelihood of success can <em>amplify the power of the Grandmaster</em>.</p>
<p>While empirically true, it does raise the question: how <em>much</em> does it amplify the power of the Grandmaster?</p>
<p>One approximation might be a simple linear amplification, expressed as a product.  Let&#8217;s imagine a function, <em>a(h,c)</em>, in which the analytic power (<em>a</em>) is the product of the power of the human (<em>h</em>) and the computing power of the chess engine being used (<em>c</em>).  This gives us the equation:</p>
<div style='text-align: center'>
<img src='/wp-content/uploads/2010/03/hcs-eq-simple.png'/>
</div>
<h2>One term to dominate them all: friction-of-interface</h2>
<p>Does this simple approximation hold up?  It does not. The team that won the <a href="http://www.chessbase.com/newsdetail.asp?newsid=2461">PAL/CSS Freestyle Tournament in 2005</a> was composed of two amateur chess players who were able to best a computer-assisted Grandmaster.</p>
<p>How did they accomplish this feat?  It was not through superior compute power.  Instead, they did so by more effectively feeding insights to their three chess engines. They played so well that a large number of people assumed that it was actually Kasparov himself playing:</p>
<blockquote><p>
Many speculated that it might be Garry Kasparov, who was the initiator of this kind of computer assisted chess matches. When we asked him Kasparov confirmed that was not the case. But he reminded us that it doesn&#8217;t really matter. The guiding principle of Freestyle Chess: anything is allowed. &#8220;Even if they were assisted by the devil, that would probably be covered by the rules,&#8221; he joked. &#8220;Only the moves they played count.&#8221;
</p></blockquote>
<p>What does this mean for our simple equation? Well, it looks like it&#8217;s missing a term, one we&#8217;ll call <em>f</em>, that describes the efficiency or <strong>friction</strong> of the interface between human and computer.</p>
<p>Quoting Kasparov again:</p>
<blockquote><p>
<em>Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.</em>
</p></blockquote>
<p>The implication being that the equation actually looks like this:</p>
<div style='text-align: center'>
<img src='/wp-content/uploads/2010/03/hcs-eq-variable-h.png'>
</div>
<p>So as the friction of the interface goes to zero, the full amplification of the chess engine is brought to bear.  A quick gut-check in the opposite direction agrees: one can imagine the world&#8217;s most powerful chess engine with the world&#8217;s worst interface; spending the time it would take to express commands to this theoretically awful program would actually be worse than playing without it.</p>
<h2>Palantir: a low-friction interface to data</h2>
<p>As analysis problems go, chess resembles <a href="http://en.wikipedia.org/wiki/Spherical_cow">a spherical cow in a vacuum</a>.  Analysis problems in the real world are orders of magnitude messier.</p>
<p>Let&#8217;s reframe the terms of our equation above into a more general approach to analysis:</p>
<ul>
<li><em>H</em> &#8211; this is the power of the analyst.  In chess, the value of this term varies widely between players; in designing real-world data analysis systems, this is more or less a constant (which is why <em>h</em> above becomes <em>H</em> below).  Of course there are differing levels of expertise, training, and raw ability amongst the user population, but when we design systems, it&#8217;s with the average case in mind.</li>
<li><em>c</em> &#8211; computing power. How fast are the machines?  How well do they scale?  How efficiently do they perform the data tasks at hand? Palantir spends significant engineering effort on optimizing the <em>c</em> term, but most of the growth in this term comes from the layers we depend on, built by companies like Intel, Sun, Oracle, etc.</li>
<li><em>f</em> &#8211; friction.  How easy is it to bring <em>c</em> to bear on the problem? Note that when we talk about <em>friction of interface</em>, this is not exclusively referring to user interface.  More generally, friction can be present at any interface between two systems: data-software, software-software, human-software, etc. The <em>f</em> that we consider in this simple model is the sum total of system friction.</li>
</ul>
<p>So our final formulation is just in terms of <em>c</em> and <em>f</em> (holding <em>H</em> as a constant): </p>
<div style='text-align: center'>
<img src='/wp-content/uploads/2010/03/hcs-eq-final.png'>
</div>
<p>When we discuss friction in real-world analysis systems, the friction actually exists at multiple levels:</p>
<ol>
<li>Creating an analysis model that will enable answering the questions that need to be explored</li>
<li>Integrating the data into a single coherent view of the problem</li>
<li>Enabling analysis tools to efficiently query and load the data</li>
<li>Exposing APIs that allow developers to develop custom solutions quickly and efficiently for modeling and analysis tasks not covered by general tools</li>
<li>Providing a user interface that makes the tools easy, enjoyable, and quick to use</li>
</ol>
<h3>Minimizing <em>f</em>: Haiti Flooding Predictions</h3>
<p>If this is starting to sound very similar to Palantir&#8217;s marketing information, this is no accident. While some of our backend engineers are concerned with things like scaling and speed-of-querying, the overall innovation that we&#8217;re bringing to the field is not simply about faster data processing systems (even if they are) but reducing the friction at every interface inside a complex human-computer symbiotic system.</p>
<p>You want an example that ties it all together?  It starts with a simple question: which of the many displaced-person camps in Haiti are most at risk for flooding as the rainy season approaches?  Easy to ask, but not so simple to answer. </p>
<p>The original introduction to this video: </p>
<blockquote><p>As we enter the beginning of the rainy season in Haiti, one of the biggest problems facing relief organizations today is the spectre of flooding and mudslides destroying Internally Displaced Persons (IDP) Camps. In this video, we integrate data from many sources to determine high risk aid locations.
</p></blockquote>
<p>The data integration for this video took about six hours, using sources of data that had never before been fused.  The analysis itself takes a few minutes and quickly comes to an actionable answer to the original question.</p>
<div style='text-align: center;'>
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="640" height="480" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="id" value="banner" /><param name="quality" value="high" /><param name="bgcolor" value="#000000" /><param name="src" value="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/haiti/haiti_flooding.flv" /><embed src="http://www.palantirtech.com/_ptwp_live_ect0/wp-content/themes/ptcom/swf/fvp.swf?movieurl=http://media.palantirtech.com/government/videos/haiti/haiti_flooding.flv" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="480" movieurl="http://media.palantirtech.com/government/videos/haiti/haiti_flooding.flv"/></object>
</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2010/03/08/friction-in-human-computer-symbiosis-kasparov-on-chess/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Palantir: like an operating system for data analysis</title>
		<link>http://blog.palantirtech.com/2009/11/06/palantir-like-an-operating-system-for-data-analysis/</link>
		<comments>http://blog.palantirtech.com/2009/11/06/palantir-like-an-operating-system-for-data-analysis/#comments</comments>
		<pubDate>Sat, 07 Nov 2009 03:21:44 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[problemspace - finance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[softwarephilosophy]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1198</guid>
		<description><![CDATA[If you&#8217;ve taken the time to peruse the Palantir Government analysis blog, you&#8217;ve seen numerous examples of Palantir Government as applied to interesting problems; they are recorded screen captures of our analysis desktop client. It&#8217;s a showcase of useful, meaningful, and compelling visual and semantic tools being used to do analysis on a wide range [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='http://en.wikipedia.org/wiki/VisiCalc'><img src='/wp-content/uploads/2009/11/visicalc.png' width='250'/></a>
</div>
<p>If you&#8217;ve taken the time to peruse the Palantir Government <a href='http://www.palantirtech.com/government/analysis-blog'>analysis blog</a>, you&#8217;ve seen numerous examples of Palantir Government as applied to interesting problems; they are recorded screen captures of our analysis desktop client.  It&#8217;s a showcase of useful, meaningful, and compelling visual and semantic tools being used to do analysis on a wide range of datasets.</p>
<p>What enabled this analysis? Aside from the <a href="http://blog.palantirtech.com/2009/09/29/the-palantir-technologies-demo-reel-screenshots-round-3/">obvious hard work of our UI and analysis tools teams</a>, it&#8217;s the flexibility and power of the Palantir data platform.  More than just a scalable datastore, the Palantir data platforms act as robust and clean abstractions on top of data.</p>
<p>One of the early architecture decisions that we made when building both <a href="http://www.palantirtech.com/government">Palantir Government</a> and <a href="http://www.palantirfinance.com/">Palantir Finance</a> was to separate the respective data platforms from the end-user applications used to actually perform analysis.  More than just following the client-server model, this separation made the data servers in both products into generic intelligence infrastructure for analytic problems, with our clients acting as analysis applications on top of those platforms.</p>
<p>And so, one way to look at our data platform is as an operating system for analytic applications.  In this post we&#8217;ll explore the history of operating systems, understand why they&#8217;re so important and see how the Palantir data servers deliver the same potential to revolutionize the writing of analysis software that operating systems did to the writing of general programs for computers.</p>
<p><span id="more-1198"></span></p>
<h2>The OS: abstraction that begat a paradigm</h2>
<p>In the early days of computing, when a programmer wanted to write a program, they had to understand the inner workings of the machine. Writing a program required understanding things like the bus interface of a specific model of hard drive when all that was needed by the program was the clean abstraction of a filesystem. The upshot of this is that much of the time and effort put into a given task was spent writing code to interface with the &#8220;physical&#8221; minutiae of the machine rather than implementing the solution to the problem that the programmer was trying to solve with their software.</p>
<p>This pattern was observed by <a href="http://en.wikipedia.org/wiki/J._C._R._Licklider">J.C.R. Licklider</a> and noted in his influential paper, <i><a href="http://blog.palantirtech.com/man-computer-symbiosis/">Man-Computer Symbiosis</a></i> (emphasis added):</p>
<blockquote><p>
<b>About 85 per cent of my “thinking” time was spent getting into a position to think, to make a decision, to learn something I needed to know. Much more time went into finding or obtaining information than into digesting it.</b> Hours went into the plotting of graphs, and other hours into instructing an assistant how to plot. When the graphs were finished, the relations were obvious at once, but the plotting had to be done in order to make them so.<br />
…<br />
<b>Throughout the period I examined, in short, my “thinking” time was devoted mainly to activities that were essentially clerical or mechanical</b>: searching, calculating, plotting, transforming, determining the logical or dynamic consequences of a set of assumptions or hypotheses, preparing the way for a decision or an insight. <b>Moreover, my choices of what to attempt and what not to attempt were determined to an embarrassingly great extent by considerations of clerical feasibility, not intellectual capability.</b>
</p></blockquote>
<p>This description of his time as a researcher was echoed in the work of the early programmers: they spent much of their programming time re-inventing the wheel and writing routines that were doing essentially clerical or mechanistic work related to the functioning of the hardware rather than the core functions of their programs.</p>
<p>The operating system changed all that: suddenly (and by that I mean: with years of hard work, research, and incremental change) that noisy, inconsistent pile of hardware was transformed into a set of clean abstractions. The programmer was finally freed to spend time and energy on the problem they were really trying to solve.</p>
<p>And so we come to the modern era: dealing with the messy details of hardware has been replaced by the clean and robust abstraction of the operating system.</p>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='http://en.wikipedia.org/wiki/Operating_system'><img src='/wp-content/uploads/2009/11/250px-operating_system_placementsvg.png' width='250'/></a>
</div>
<p>Three important properties of modern operating systems:</p>
<ul>
<li><b>Hard boundaries between OS functions and process functions</b> &#8211; in modern operating systems, this is usually accomplished with system calls.  The process places the inputs to the system call in a known location and then asks the OS to perform some operation, like writing to a file or making a network connection.  The OS may or may not perform the function, based on things like permissions, availability of resources, etc.
<p>The most important feature here is that the process never has direct access to the true resources of the machine; instead, all access to the machine&#8217;s resources is brokered by the OS.</p>
</li>
<li><b>Extensions of the abstraction in every direction</b> &#8211; An OS like Linux is really, at its core, a kernel that does process scheduling and lifecycle, manages memory, and services system calls. Everything else is handled by some sort of driver.  A driver might also be called, more generically, a plugin or extension.  Drivers exist for everything from block devices (like hard drives), network cards, and filesystems to input devices and displays.</li>
<li><b>Designed as a general purpose framework</b> &#8211; the operating system <i>doesn&#8217;t actually do any computing</i>; rather, it&#8217;s a set of services to facilitate processes using the resources of the computer.  To that end, operating systems are not designed with a specific process in mind, but rather to serve a large class of programs, each designed and written to accomplish a different task using a similar set of resources.</li>
</ul>
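<p>To make the system-call boundary concrete, here is a minimal sketch in Python, whose <code>os</code> module exposes thin wrappers over the Unix calls. The helper names (<code>write_then_read</code>, <code>try_open</code>, <code>demo</code>) and the file names are made up for illustration. The process only ever asks; the OS performs the operation or reports a refusal.</p>

```python
import os
import tempfile


def write_then_read(path, payload):
    """Write bytes to a file and read them back, one system call at a time."""
    # os.open wraps open(2): the process asks, and the kernel decides.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)  # wraps write(2)
    finally:
        os.close(fd)  # wraps close(2)

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))  # wraps read(2)
    finally:
        os.close(fd)


def try_open(path):
    """Return "opened", or the name of the error the OS reported."""
    try:
        os.close(os.open(path, os.O_RDONLY))
        return "opened"
    except OSError as err:
        return type(err).__name__


def demo():
    """Run both helpers against a throwaway directory (illustrative only)."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "demo.txt")
        echoed = write_then_read(path, b"hello")
        refusal = try_open(os.path.join(workdir, "missing.txt"))
    return echoed, refusal
```

<p>Calling <code>demo()</code> exercises both outcomes: a request the OS services, and one it declines because the file does not exist.</p>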
<h2>Analysis: the modern computing task</h2>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='http://en.wikipedia.org/wiki/ENIAC'><img src='http://upload.wikimedia.org/wikipedia/commons/archive/4/4e/20050923152626!Eniac.jpg' width='250'/></a></div>
<p><a href="http://en.wikipedia.org/wiki/ENIAC">ENIAC</a>, one of the first general-purpose electronic computers, was conceived to calculate ballistics tables for artillery pieces: it was a glorified calculator. Lacking anything even resembling an operating system, it would just run its program. Its compiler? A group of six women who would configure the machine by hand with the program logic.  The input for its first test run, a calculation related to the hydrogen bomb project, was approximately <i>one million punch cards</i>.</p>
<p>Times have changed: 40 or so years of the unrelenting march of Moore&#8217;s Law in computing power has given us something like an <b><a href="http://upload.wikimedia.org/wikipedia/commons/thumb/c/c5/PPTMooresLawai.jpg/596px-PPTMooresLawai.jpg">eight-order-of-magnitude increase</a></b> in the amount of computing power available per unit cost.  Coupled with similar, <a href="http://www.kk.org/thetechnium/archives/2009/07/was_moores_law.php">more recent gains in storage capacity and network bandwidth</a>, this has produced a world awash in data, <a href='http://blog.palantirtech.com/2008/03/18/why-hal-varian-thinks-palantir-is-a-great-idea/'>crying out for analysis.</a></p>
<p>So the situation today is that we now expect to bring these considerable computing resources to bear on larger, more complex problems in the world.  I&#8217;m talking about things like the <a href="http://www.palantirtech.com/government/analysis-blog/traceback">spread of food-borne illnesses</a>, understanding the connection between genes and protein expression, <a href="http://www.palantirtech.com/government/analysis-blog/sinjar">understanding terrorist networks</a>, <a href="http://www.palantirtech.com/government/analysis-blog/uncovering-a-bot-net-exploring-router-data-using-palantir">finding botnets in network traffic logs</a>, and <a href="http://www.palantirtech.com/government/analysis-blog/transparency">exploring influence networks in government</a>.</p>
<p>These problems, while spanning widely disparate areas of analysis, share some common traits:</p>
<h3>The data is spread out</h3>
<p>They are described by multiple data sources. Just to make things more interesting: the data sources don&#8217;t agree on their native representations of the real-world data. And finally, the real-world objects that the data are describing are actually described in multiple data sources, with no single source giving a complete and accurate representation.</p>
<h3>The data schemas are not human-conceptual</h3>
<p>Rather than representing the data in some schema that maps easily into how the experts on a given problem think about said problem, the data stores in question tend to model data in whatever way was convenient for the creators of that particular data store. Put another way: people don&#8217;t think in tables, rows, columns, and XML snippets.  These first-class data storage elements don&#8217;t usually map to real-world objects.</p>
<h3>The data is sensitive</h3>
<p>Whether it&#8217;s patient information, <a href="http://www.palantirtech.com/government/analysis-blog/horizon">mortgage data</a>, a law enforcement investigation, or sensitive foreign intelligence, there is often the need for <a href="http://www.palantirtech.com/government/analysis-blog/mls">foolproof access controls on the data</a>.</p>
<h2>Palantir: an operating system-class abstraction for analysis</h2>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'><img src='http://blog.palantirtech.com/wp-content/uploads/2009/01/shot0016.png' width='250'/></div>
<p>A Palantir data server provides a class of services similar to those of an operating system, but focused on the specific needs of analytic tasks.  Here I&#8217;ll focus on the model used by Palantir Government; Palantir Finance delivers these services with a related but significantly different approach.</p>
<p>As you might imagine, however, they both start at a somewhat higher level than punch cards.</p>
<h3>It starts with an ontology</h3>
<p>The Palantir approach to analysis begins with a task-specific ontology: essentially, a human-conceptual description of the real-world problem that&#8217;s being analyzed.</p>
<p>It&#8217;s roughly composed of three pieces:</p>
<ul>
<li>A hierarchical type system of the real-world objects that human experts use to think about this problem. We call these <i>PTObjects</i>, short for &#8220;Palantir Objects&#8221;.</li>
<li>A type system of properties that will contain the data describing these PTObjects.  PTObjects are essentially typed containers for properties. This is where most of the detail of the ontology lies.</li>
<li>A type system of possible relationships between different types of PTObjects.</li>
</ul>
<p>Within the ontology, there are numerous extension points that allow the customization of how data is imported, retrieved, and displayed (following the principle of <i>extending the abstraction in all directions</i>).</p>
<p>The data server takes the ontology as input and is agnostic to its content. This is where the principle of <i>building a general purpose framework</i> comes into play.</p>
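<p>As a rough illustration of those three pieces, here is a minimal Python sketch. Every class, field, and type name in it (<code>ObjectType</code>, <code>PropertyType</code>, <code>RelationshipType</code>, <code>Ontology</code>, the &#8220;Person&#8221; and &#8220;Organization&#8221; types) is hypothetical and chosen for this example; it is not Palantir&#8217;s actual API or schema.</p>

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ObjectType:
    """A node in the hierarchical type system of real-world objects."""
    name: str
    parent: Optional["ObjectType"] = None

    def is_a(self, other):
        """True if this type is `other` or descends from it."""
        node = self
        while node is not None:
            if node == other:
                return True
            node = node.parent
        return False


@dataclass(frozen=True)
class PropertyType:
    """A typed slot for data describing an object."""
    name: str
    value_type: type


@dataclass(frozen=True)
class RelationshipType:
    """A permitted link between two object types."""
    name: str
    source: ObjectType
    target: ObjectType


@dataclass
class Ontology:
    """The three type systems, keyed by name."""
    object_types: dict = field(default_factory=dict)
    property_types: dict = field(default_factory=dict)
    relationship_types: dict = field(default_factory=dict)

    def allows(self, relationship, source_type, target_type):
        """Check whether a relationship may link the two given types."""
        rel = self.relationship_types[relationship]
        return source_type.is_a(rel.source) and target_type.is_a(rel.target)


def build_demo_ontology():
    """Assemble a tiny, made-up ontology for illustration."""
    entity = ObjectType("Entity")
    person = ObjectType("Person", parent=entity)
    analyst = ObjectType("Analyst", parent=person)
    organization = ObjectType("Organization", parent=entity)
    ontology = Ontology()
    for object_type in (entity, person, analyst, organization):
        ontology.object_types[object_type.name] = object_type
    ontology.property_types["name"] = PropertyType("name", str)
    ontology.relationship_types["employed_by"] = RelationshipType(
        "employed_by", person, organization
    )
    return ontology
```

<p>Because the ontology is plain data handed to the server, nothing in the sketch is specific to one problem domain; swapping in a different set of types changes the analysis without changing the framework.</p>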
<h3>The data sources are mapped into the ontology</h3>
<p>This part of the Palantir data server is a pattern that is very similar to an operating system&#8217;s notion of block device drivers. The difference? Instead of low-level storage systems like hard drives, we&#8217;re dealing with complex databases describing the problem at hand.</p>
<p>In an operating system, every block device can read and write blocks of data.  In the Palantir data server, everything becomes a source of PTObjects.</p>
<p>Our data importer plugins, by analogy, fulfill the same role as a block device driver: we build glue code to map the data source&#8217;s schema into the ontology, plus the connectors that surface the data itself wrapped up in PTObjects.</p>
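<p>The driver analogy can be sketched in Python as a small plugin interface. This is an assumption-laden illustration: the <code>Importer</code> base class, the <code>CsvPersonImporter</code> plugin, the column names, and the dictionary standing in for a PTObject are all invented here and do not reflect the real importer API.</p>

```python
import csv
import io
from abc import ABC, abstractmethod


class Importer(ABC):
    """Hypothetical driver interface: every source yields PTObject-like dicts."""

    @abstractmethod
    def objects(self):
        """Yield one dict per real-world object found in the source."""


class CsvPersonImporter(Importer):
    """Glue code mapping one made-up CSV schema into the ontology."""

    # Source column name mapped to ontology property name.
    COLUMN_MAP = {"full_nm": "name", "tel": "phone"}

    def __init__(self, source_id, text):
        self.source_id = source_id
        self.text = text

    def objects(self):
        reader = csv.DictReader(io.StringIO(self.text))
        for record_number, row in enumerate(reader, start=1):
            properties = {
                self.COLUMN_MAP[column]: value
                for column, value in row.items()
                if column in self.COLUMN_MAP and value
            }
            yield {
                "type": "Person",
                "properties": properties,
                # Remember where the data came from, down to the record.
                "source": (self.source_id, record_number),
            }
```

<p>A second source with a different schema would get its own subclass with its own column map, while everything downstream keeps consuming the same object shape.</p>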
<h3>The data are composed into real-world objects.</h3>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='/wp-content/uploads/2009/11/pg-object-model.jpg'><img src='/wp-content/uploads/2009/11/pg-object-model.jpg' width='250'/></a>
</div>
<p>Part of this mapping is resolving PTObjects from different sources together, so that each real-world object is represented by a single composite PTObject.</p>
<p>The operation of resolving is pretty straightforward: we basically union the properties of the two PTObjects into a new PTObject. The end result is a single PTObject that completely represents all the data about something in the real world from all the available data sources.</p>
<p>As we do this composition, we keep track of where each property came from, down to the record level, in each of its original sources.  (Note that most composed PTObjects will have at least one property that comes from two sources.)  Preserving the original identity of every atom of data allows us to later decompose these PTObjects into their constituent parts or, more importantly, censor a client&#8217;s view based on what permissions they have for each of the original data sources.</p>
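<p>Here is a minimal Python sketch of resolution with provenance, under the assumption that an object can be modeled as a list of properties, each tagged with its source and record. The names <code>Property</code>, <code>resolve</code>, <code>censor</code>, and <code>decompose</code> are illustrative, not the actual implementation.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    """One atom of data, tagged with where it came from."""
    name: str
    value: str
    source: str  # which data source supplied it
    record: str  # which record within that source


def resolve(first, second):
    """Union the properties of two objects, keeping provenance intact."""
    merged = []
    for prop in list(first) + list(second):
        if prop not in merged:  # drop only exact duplicates
            merged.append(prop)
    return merged


def censor(obj, allowed_sources):
    """Project an object down to the sources a client may see."""
    return [prop for prop in obj if prop.source in allowed_sources]


def decompose(obj):
    """Split a composite object back into its per-record parts."""
    parts = {}
    for prop in obj:
        parts.setdefault((prop.source, prop.record), []).append(prop)
    return parts
```

<p>Note that the same name and value arriving from two sources is kept twice, once per source. That is what lets <code>censor</code> hide one source without losing the fact that another source also asserts it.</p>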
<p>This is a fundamental operation in our system that doesn&#8217;t have an exact analog in operating systems; it&#8217;s sort of similar to taking multiple filesystems and mounting them inside a virtual filesystem tree, like Unix does.  However, if each data source is like a filesystem, what we&#8217;re doing is essentially composing individual files from their fragments stored on multiple block devices.</p>
<p>Another analogy: at a level below the block device in the OS, this is also sort of similar to what a <a href="http://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0">RAID0</a> device does, the difference being that our composition is based on the contents of the data itself rather than some previously applied, content-agnostic, decomposition function.  The other difference being motivation: a RAID0 does it for performance, while Palantir is composing data to make it correspond to the real-world objects it represents.</p>
<h3>The server exposes Palantir &#8220;system calls&#8221;</h3>
<p>The interface that the Palantir data server exposes can be boiled down to two essential operations:</p>
<ul>
<li>The client can download copies of PTObjects from the server.  It may request them by id or perform some sort of search/query to specify a set of PTObjects.  This is roughly analogous to the <b><a href="http://en.wikipedia.org/wiki/Open_%28system_call%29">open()</a></b> and <b><a href="http://comsci.liu.edu/~murali/unix/read.htm">read()</a></b> system calls on Unix.
<p>Note that each client only sees the subset of properties for a given PTObject that it is authorized to see.  This censorship of full PTObjects into projected slices is something done by the server on every load of PTObjects.</p></li>
<li>The client can send new or updated PTObjects to the data server for storage. This is roughly analogous to the <b><a href="http://www.freebsd.org/cgi/man.cgi?query=write&#038;sektion=2&#038;manpath=FreeBSD+7.2-RELEASE">write()</a></b> system call in Unix. It, of course, entails a check as to whether the given client has permission to write to the given PTObject.</li>
</ul>
<p>The server&#8217;s responsibility is the same as the operating system&#8217;s: only let the client do what it has been granted permission to do.  In an operating system, the OS uses hardware features like <a href="http://en.wikipedia.org/wiki/Protected_mode">protected mode</a> to keep lower-privileged processes from accessing machine resources. Palantir uses network calls to achieve the same separation, by placing the client and server on different logical machines.  The effect is the same: the client basically requests (rather than commands) that certain operations be performed by the server.  The server uses its own rules to decide if the access or change is allowed and responds accordingly. And so the principle of <i>hard boundaries</i> is implemented.</p>
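<p>The request-and-decide pattern can be sketched as a toy in-process server in Python. This is only a sketch: the <code>DataServer</code> class, its <code>load</code> and <code>store</code> methods, and the grant model are assumptions made for the example, and a real deployment would put a network boundary between client and server.</p>

```python
class PermissionDenied(Exception):
    """Raised when the server declines a client's request."""


class DataServer:
    """Toy server: clients request operations, the server decides."""

    def __init__(self):
        self._objects = {}  # object id mapped to a list of (name, value, source)
        self._readable = {}  # client mapped to the sources it may read
        self._writable = {}  # client mapped to the object ids it may write

    def grant_read(self, client, source):
        self._readable.setdefault(client, set()).add(source)

    def grant_write(self, client, object_id):
        self._writable.setdefault(client, set()).add(object_id)

    def load(self, client, object_id):
        """Return only the slice of the object this client may see."""
        readable = self._readable.get(client, set())
        stored = self._objects.get(object_id, [])
        return [(name, value) for name, value, source in stored if source in readable]

    def store(self, client, object_id, name, value, source):
        """Accept a write only if the client holds write permission."""
        if object_id not in self._writable.get(client, set()):
            raise PermissionDenied(client + " may not write " + object_id)
        self._objects.setdefault(object_id, []).append((name, value, source))
```

<p>Two clients loading the same object id can receive different slices, and neither can tell from the response what was withheld, which is the point of brokering every access through the server.</p>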
<h3>The clients do the analysis</h3>
<p>When an operating system yields to a process, that&#8217;s the time when the true processing begins.  By the same token, in Palantir, it&#8217;s not until a client connects and starts searching, visualizing, and manipulating PTObjects that analysis actually starts taking place (even if the server is doing a lot of the heavy lifting).</p>
<h2>The wide open future</h2>
<p>So why is this exciting?  I&#8217;m glad you asked!</p>
<h3>It&#8217;s about taking analysis to the next level.</h3>
<p>Let&#8217;s say you&#8217;re someone who wants to write an analytic task. Let me ask you a series of rhetorical questions:</p>
<ul>
<li>Do you want to start with three disparate sources of data or with the data already mapped into a Palantir data server?</li>
<li>Which one is a better use of your time as a programmer?</li>
<li>Which one allows you to not repeat mistakes that other programmers have already made and fixed?</li>
<li>Which one is more like writing a program than an operating system?</li>
</ul>
<p>Operating systems took us to a new level of expressiveness when it came to writing computing processes to run on computing hardware. They inverted that 85/15 ratio that Licklider talked about so that programmers spent more time writing the code that did the thing they were trying to create and less time mucking around with hardware.</p>
<p>More programmer time == better analytic tasks.</p>
<h3>It&#8217;s about making machine learning easier.</h3>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='http://en.wikipedia.org/wiki/Skynet_%28Terminator%29'><img src='http://images1.wikia.nocookie.net/terminator/images/8/8a/Cyberdyne_logo.jpg' width='250'/></a>
</div>
<p>Now consider machine learning as a field.  Pretty much every machine learning task could benefit from starting with its data in something that looks like a Palantir data server.  I&#8217;ve taken an informal survey of machine learning researchers and they agree: the 85/15 ratio still holds for machine learning.</p>
<p>Simply put: <b>most of the time and effort in machine learning is spent getting the data into a form that you can actually apply an algorithm to!</b> Now imagine if the starting point for that was a Palantir data server &mdash; now the machine learning implementer has a world of expressiveness open to them and time and energy are spent on the task at hand instead of the overhead of messing with the data.</p>
<p>Now, we don&#8217;t think that we&#8217;re building Skynet.  Quite the contrary: we believe that platforms like the one we&#8217;ve built will allow machine learning techniques to be put in the hands of experts to augment their ability to look at the world and come to conclusions about complex real-world problems by asking questions of the data we&#8217;ve collected. It&#8217;s about <a href="http://en.wikipedia.org/wiki/Intelligence_amplification">Intelligence Augmentation</a>, which can use machine learning techniques and algorithms to build better tools, not creating <a href="http://en.wikipedia.org/wiki/Strong_AI">Strong AI</a>.</p>
<h3>It&#8217;s about creating new markets</h3>
<p>Let&#8217;s go back to the well of operating systems and look back at the history of MS-DOS: an early &#8220;killer&#8221; application available on MS-DOS was <a href="http://en.wikipedia.org/wiki/VisiCalc">VisiCalc</a> (that screenshot at the top of this post), a text-based spreadsheet.  As you know, VisiCalc was not the end of the story but just the introduction. MS-DOS, which later evolved into Windows, gave application writers an (arguably) clean abstraction on top of commodity hardware in order to build the applications that users actually wanted. Today, we have things like web browsers, multimedia authoring software, virtual machines, and IDEs built on top of what is, essentially, the same set of abstractions that VisiCalc was built on.</p>
<p>However, the most important thing to note is that VisiCalc is credited with creating the market for commercial operating systems &#8212; businesses needed VisiCalc so they paid Microsoft for MS-DOS (and IBM for a PC).  Without VisiCalc, there was no market for MS-DOS (most people, unsurprisingly, didn&#8217;t want to buy a <a href="http://en.wikipedia.org/wiki/Microsoft_BASIC">BASIC interpreter</a>).</p>
<p>We&#8217;re in the business of selling software and we agree with our customers: the Palantir approach has tremendous value.  We&#8217;ve just started tapping the potential of this market.  Think about what Oracle looked like in 1979, think about what Microsoft looked like in 1980: that&#8217;s Palantir in 2009.</p>
<h3>It&#8217;s about the start of the analysis age</h3>
<div style='float: right; text-align: right; margin-right: 15px; margin-left: 15px'>
<a href='http://en.wikipedia.org/wiki/Information_Age'><img src='http://upload.wikimedia.org/wikipedia/commons/thumb/d/d2/Internet_map_1024.jpg/600px-Internet_map_1024.jpg' width='250'/></a>
</div>
<p>It can be argued that the operating system is the innovation that ushered in the &#8220;<a href="http://en.wikipedia.org/wiki/Information_Age">information age</a>&#8221;.  Without the operating system, there is no software explosion, which allows computing technology to actually be used on data in the world.</p>
<p>We think that we&#8217;re on the cusp of the analysis age, as imagined by <a href="http://en.wikipedia.org/wiki/Vernor_Vinge">Vernor Vinge</a> in <u><a href="http://books.google.com/books?id=SrLwPdBJodMC&#038;dq=rainbow%27s+end&#038;printsec=frontcover&#038;source=bn&#038;hl=en&#038;ei=TdX0Sui9HsTh8AbGlc3zCQ&#038;sa=X&#038;oi=book_result&#038;ct=result&#038;resnum=5&#038;ved=0CBsQ6AEwBA#v=onepage&#038;q=&#038;f=false">Rainbows End</a></u>.  It was something foreseen by Licklider in 1960, albeit with a timeline that was off by at least a few decades:</p>
<blockquote><p>
“…it seems worthwhile to avoid argument with (other) enthusiasts for artificial intelligence by conceding dominance in the distant future of cerebration to machines alone. There will nevertheless be a fairly long interim during which the main intellectual advances will be made by men and computers working together in intimate association. A multidisciplinary study group, examining future research and development problems of the Air Force, estimated that it would be 1980 before developments in artificial intelligence make it possible for machines alone to do much thinking or problem solving of military significance. That would leave, say, five years to develop man-computer symbiosis and 15 years to use it. The 15 may be 10 or 500, but those years should be intellectually the most creative and exciting in the history of mankind.”
</p></blockquote>
<p>It&#8217;s a golden age of analysis and we&#8217;re just getting started: we&#8217;ve got a lot of work to do, so if this sort of thing excites you, please <a href='http://www.palantirtech.com/careers/culture'>come and join us.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/11/06/palantir-like-an-operating-system-for-data-analysis/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Deploying a distributed system</title>
		<link>http://blog.palantirtech.com/2008/10/07/deploying-a-distributed-system/</link>
		<comments>http://blog.palantirtech.com/2008/10/07/deploying-a-distributed-system/#comments</comments>
		<pubDate>Wed, 08 Oct 2008 03:52:32 +0000</pubDate>
		<dc:creator>Bob McGrew</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[development process]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[softwarephilosophy]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=118</guid>
		<description><![CDATA[At Palantir, we write software that gets deployed at each client, integrated across their sensitive data sets, and maintained and administered by that client&#8217;s in-house admins. Most deployed enterprise software is run on a single beefy box: consider wikis, blogging systems, bug tracking systems, or practically any client/server or web client software used today. [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; text-align: right; width: 230px'><img src="http://blog.palantirtech.com/wp-content/uploads/2008/10/pg-distrib-logo.png" alt="Distributed systems diagram" title="Distributed systems diagram" width="221" height="300" class="size-medium wp-image-466" /></div>
<p>At Palantir, we write software that gets deployed at each client, integrated across their sensitive data sets, and maintained and administered by that client&#8217;s in-house admins.  Most deployed enterprise software is run on a single beefy box: consider wikis, blogging systems, bug tracking systems, or practically any client/server or web client software used today.  On the other hand, most enterprise software that runs as a <a href="http://en.wikipedia.org/wiki/Distributed_system">distributed system</a> is hosted: Salesforce.com, Google Apps, or any approach that sells software as a <a href="http://en.wikipedia.org/wiki/Software_as_a_Service">service</a>.  What’s fairly unusual about our software is that it’s deployed as a distributed system at each client.</p>
<p>Distributed systems are hard to build and hard to maintain.  As long as that distributed system is built and maintained in-house, however, you have a number of advantages:</p>
<ul>
<li>The administrators are full-time product experts who are focused on the mission of keeping your system available and responsive.</li>
<li>The development organization can build internal tools for the administrators that only have to be “good enough” and can step in if necessary.</li>
<li>It’s easy to get feedback on how the system performs, because there are no sensitivity, privacy, or legal constraints.</li>
<li>A single, large deployment allows you to optimize your hardware purchasing and amortize installation headaches across a large number of machines.</li>
</ul>
<p>This is all great, of course, and if you can host and maintain your distributed system yourself, I’d highly recommend it.  Sometimes, however, it’s just not possible.  At Palantir, the client data we work with is so sensitive that even we cannot see it, except under very strictly controlled circumstances.  It’s also so large that the bandwidth limitations of pushing it into a system hosted by us would be prohibitive.</p>
<p>So suppose that you have to deploy your distributed system in a customer datacenter with external parties maintaining the system.  What do you need to consider?  In this post, I&#8217;ll go into a number of key points that we have faced and addressed at Palantir.</p>
<p><span id="more-118"></span></p>
<h3>Understand Your Administrators</h3>
<p>Assume that your administrators are part-time, not product experts, and constantly distracted by their other responsibilities.  They aren’t even experts in the technologies your system is based on: for example, they don’t really know much about databases and they are more comfortable with Windows than with Linux.  Even if these assumptions aren’t all true in any particular case, there will be administrators who meet each of these assumptions.</p>
<h3>Design For Manageability</h3>
<p> This means building powerful management tools for your system that are web-based and also scriptable.  Remember that your administrators are part-time, so usability is important: by the time your administrator touches the Foobar Configuration Widget the second time, he’s forgotten everything he learned a month ago when he did it the first time.  You also want to build management tools that go all the way down the stack: using low-level tools for occasional jobs leads to mistakes, because those low-level tools tend to be far more powerful than necessary for your system.</p>
<h3>Design In Monitoring And Notification</h3>
<p>Visibility is one of the biggest reasons people want control – but unnecessary control leads to mistakes.  Your administrator shouldn’t have to go to the command line to run <code><a href="http://en.wikipedia.org/wiki/Top_(Unix)">top</a></code> just to figure out whether the system is overloaded.  Each metric that is being monitored needs to have historical data so that your system can distinguish baseline behavior from anomalies.  Each metric displayed to an administrator needs to be displayed with context, whether that’s the mean and standard deviation of the metric, or whether it’s similar metrics on other servers.   Anomalous  behavior should trigger human action through a notification.</p>
<p>Notifications also need to be carefully designed (as well as extensible by the administrator).  Carefully distinguish actionable items from non-actionable items, and try to reduce ambiguity as to what action is required.  It’s similar to error logging: if you let standard system events pollute your error logs, the administrator will soon stop paying attention to them.  Although you may not be able to send monitoring information directly back to the development team, you may also want to prepare reports of what’s gone wrong to collect every so often; just make sure that these reports are human-readable so that they can be vetted to make sure they don’t leak any sensitive information.</p>
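<p>As a minimal sketch of the two ideas above (showing each metric against its own history, and notifying only when something is both anomalous and actionable), the following Python uses the mean and standard deviation of past samples as the baseline.  The names <code>evaluate</code> and <code>notification</code>, the sample data, and the three-standard-deviation threshold are all illustrative assumptions, not a description of any real monitoring product.</p>

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Reading:
    """A metric sample shown together with its historical context."""
    value: float
    baseline_mean: float
    baseline_stdev: float
    z_score: float
    anomalous: bool


def evaluate(history, value, threshold=3.0):
    """Compare a new sample against the historical baseline.

    `threshold` (in standard deviations) is an illustrative default.
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least 2 samples
    if sigma == 0:
        # A perfectly flat history: any deviation at all is unusual.
        z = 0.0 if value == mu else float("inf")
    else:
        z = (value - mu) / sigma
    return Reading(value, mu, sigma, z, abs(z) > threshold)


def notification(metric, reading, action):
    """Return an actionable message for an anomaly, or None for normal behavior."""
    if not reading.anomalous:
        return None  # routine events stay out of the administrator's inbox
    return (
        f"{metric} is {reading.value:.1f} "
        f"(baseline {reading.baseline_mean:.1f} +/- {reading.baseline_stdev:.1f}). "
        f"Action: {action}"
    )


# Hypothetical CPU utilization samples, in percent.
cpu_history = [41.0, 39.0, 40.0, 42.0, 38.0, 40.0]
normal = evaluate(cpu_history, 41.5)
spike = evaluate(cpu_history, 95.0)
print(notification("cpu_percent", normal, "none"))
print(notification("cpu_percent", spike, "check for runaway jobs on this server"))
```

<p>A real system would probably use a rolling window and per-metric thresholds; the point of the sketch is only that the message carries its own context and says what to do next.</p>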
<h3>Design For Autonomy</h3>
<p>Where possible, design the system to handle error conditions that can be systemically fixed. The best kind of failure is one that requires no admin intervention.  If you can automatically extend your <a href="http://en.wikipedia.org/wiki/Tablespace">tablespace</a> when it runs out of allocated space, do it.  But be sure to give sufficient warning if a long lead-time action is going to be required (like ordering and installing an additional disk).  You won’t be able to figure out everything that can go wrong ahead of time, but you can iterate to drive down the number of events for which human intervention is required.</p>
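<p>One way to picture the split between &#8220;fix it yourself&#8221; and &#8220;warn early&#8221; is the sketch below.  It assumes a simple linear growth trend, and the function names, the 90% fill trigger, the 10 GB extension step, and the 30-day lead time are all hypothetical values chosen for illustration.</p>

```python
def days_until_full(daily_usage_gb, capacity_gb):
    """Estimate days until the disk fills, from a linear growth trend.

    `daily_usage_gb` holds one usage sample per day, oldest first.
    Returns None when usage is flat or shrinking.
    """
    if len(daily_usage_gb) < 2:
        raise ValueError("need at least two daily samples")
    growth_per_day = (daily_usage_gb[-1] - daily_usage_gb[0]) / (len(daily_usage_gb) - 1)
    if growth_per_day <= 0:
        return None
    return (capacity_gb - daily_usage_gb[-1]) / growth_per_day


def plan(allocated_gb, disk_capacity_gb, daily_usage_gb,
         extend_step_gb=10.0, lead_time_days=30.0):
    """Decide what the system does itself and what it must escalate."""
    actions = []
    used_gb = daily_usage_gb[-1]
    unallocated_gb = disk_capacity_gb - allocated_gb

    # Fix it automatically when the fix is safe and purely mechanical.
    if used_gb >= 0.9 * allocated_gb and unallocated_gb >= extend_step_gb:
        actions.append(f"auto-extend allocation by {extend_step_gb:.0f} GB")

    # Escalate early when the fix needs a human and a long lead time.
    remaining = days_until_full(daily_usage_gb, disk_capacity_gb)
    if remaining is not None and remaining <= lead_time_days:
        actions.append(
            f"notify admin: disk full in about {remaining:.0f} days, order storage"
        )
    return actions


# Hypothetical deployment: 100 GB allocated on a 200 GB disk, growing 4 GB/day.
busy = plan(allocated_gb=100.0, disk_capacity_gb=200.0,
            daily_usage_gb=[80.0, 84.0, 88.0, 92.0])
quiet = plan(allocated_gb=100.0, disk_capacity_gb=1000.0,
             daily_usage_gb=[50.0, 50.5, 51.0])
print(busy)
print(quiet)
```

<p>The two checks are deliberately independent: the automatic extension buys time without involving anyone, while the forecast gives the administrator enough notice for the slow, physical fix.</p>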
<p>In future posts, we plan to drill down on each of these challenges and look at what approaches and technologies worked for us.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2008/10/07/deploying-a-distributed-system/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Palantir: so what is it you guys do?</title>
		<link>http://blog.palantirtech.com/2007/12/04/what-do-we-do/</link>
		<comments>http://blog.palantirtech.com/2007/12/04/what-do-we-do/#comments</comments>
		<pubDate>Tue, 04 Dec 2007 08:01:18 +0000</pubDate>
		<dc:creator>Kevin Simler</dc:creator>
				<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[palantirtech]]></category>
		<category><![CDATA[problemspace - finance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[softwarephilosophy]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/2007/12/04/what-do-we-do/</guid>
		<description><![CDATA[I often ask candidates if they&#8217;re familiar with what we do at Palantir. Most people think they are. &#8220;Oh, you&#8217;re that data viz. company,&#8221; or, worse, &#8220;You guys do data mining, right?&#8221; At least they&#8217;ve heard of us and at least they&#8217;re on the right track, but I cringe anyway. We aren&#8217;t just a &#8220;data [...]]]></description>
			<content:encoded><![CDATA[<p>I often ask candidates if they&#8217;re familiar with what we do at Palantir.  Most people think they are.  &#8220;Oh, you&#8217;re that data viz. company,&#8221; or, worse, &#8220;You guys do data mining, right?&#8221;  At least they&#8217;ve heard of us and at least they&#8217;re on the right track, but I cringe anyway.  We aren&#8217;t just a &#8220;data visualization&#8221; company and we don&#8217;t do &#8220;data mining.&#8221;  It&#8217;s almost impossible to convey the scope and complexity of what we do in a few short minutes&#8212;or to do so without taking the conversation to an eye-glazing level of abstraction.</p>
<p>The following is my attempt at describing what we do at a high level without oversimplifying.  I hope that after reading this a candidate will &#8216;get&#8217; what we&#8217;re about, or at least understand enough not to apply tiny labels to our expansive vision.</p>
<p><span id="more-82"></span></p>
<h2>The problem: implementing analysis</h2>
<p>At Palantir we specialize in <strong>analysis</strong>.</p>
<p>Yes, that&#8217;s painfully abstract, and I&#8217;ll get to it in a second.</p>
<p>In real-world terms, we are building a <strong>software platform</strong> that enables people to take whatever data is relevant to them and understand it more easily and thoroughly than ever before, using concepts that they already understand.  And we are applying this vision, at first, to solving problems in the finance sector and the government intelligence community.</p>
<p>The first important thing to note is that we don&#8217;t actually do the analysis ourselves.  We don&#8217;t devise winning trading strategies and we don&#8217;t catch terrorists.  We write software that enables other people to pull off these feats.  These people, experts in their respective fields, are called <em>analysts.</em></p>
<p>So what exactly do analysts do?  What is analysis?</p>
<blockquote><p>Analysis is everything necessary to extract <strong>insight</strong> from <strong>information</strong>.</p></blockquote>
<p>Let&#8217;s break that down a bit.</p>
<p>Information is easy:  It&#8217;s data.  It lives in a relational database or as files indexed on a hard drive, and you can easily run queries against it.  It comes in two forms, structured and unstructured.  And there is <em>a lot</em> of it in the modern world &#8211; too much, actually, for current tools to make sense of.</p>
<p>Insight is trickier.  Insight is something only a person can generate, and understanding this is critical for any organization that wants to do analysis right.  Thus the challenge of data analysis is how to bring vast amounts of information into productive contact with human intelligence.  In other words, the challenge is how to <em>enable the analyst</em>.</p>
<p>From the analyst&#8217;s perspective there are five essential features of an analysis platform:</p>
<ol>
<li>First, and most important, <em><strong>the analyst should be in control</strong></em>.  In other words, the primary way of interacting with an analysis tool should be <em>human-driven queries</em>.  While automated approaches can complement a human-driven approach, there simply is no substitute for human intelligence.  Unless you put a person behind the wheel, the system can never be flexible or creative enough to uncover truly original insight.  Artificial Intelligence just isn&#8217;t there yet.</li>
<li>Ability to <em><strong>summarize large data sets</strong></em>.  Some of this is what has traditionally been called data mining:  the largely automated approach&#8212;using machine learning or other statistical techniques&#8212;of processing lots of data at once and extracting nuggets that capture something interesting about the data.  Unlike Palantir, traditional approaches have focused almost exclusively on this aspect of analysis.</li>
<li>Ability to <em><strong>visualize large data sets</strong></em>.  Here the analyst wants interesting and informative ways of viewing data graphically, to make it easier for him to digest.  The analyst wants more than just a summary of the data; he wants a nuanced view of what&#8217;s going on <em>inside</em> these data sets:  What&#8217;s the overall shape of the distribution?  What are the outliers?  What are important structures within the data?</li>
<li>Ability to <em><strong>iterate rapidly</strong></em>.  This means enabling the analyst to ask a question, get the answer, and then quickly ask either a variant on the initial question or a follow-up question that depends on the answer to the initial question.  This rapid, iterative process allows the analyst to quickly test out hypotheses and develop theories about what&#8217;s going on in the data, and by extension to discover what&#8217;s going on in the world.</li>
<li>Ability to <em><strong>collaborate with other analysts</strong></em>.  Getting a handle on a terabyte of data, especially when it comprises multiple data types, is definitely more than a one-person job.  Any organization that&#8217;s serious about understanding the world needs a team of analysts that can work together as more than the sum of its parts.  This requires the ability for one analyst to effortlessly share the results of his analysis with his colleagues.</li>
</ol>
<h2>The Palantir approach</h2>
<p>That&#8217;s what analysis looks like to the analyst, or rather what it should look like in an ideal world.  (Current tools fall far short of this vision.)  So what do <em>we</em> do at Palantir in order to make analysis this smooth and easy?</p>
<p>You could say that we help summarize large data sets, in the sense that we have to provide the analyst with a rich library of techniques and algorithms.  You could also say that we do visualization, in the sense that we have to provide the analyst with a set of interesting and informative ways of visualizing their data.  We do both of these things, and we have to be creative and solve hard problems in order to add value in these areas.  But we do a lot more than that.</p>
<p>Probably the most central hard problem that we address in trying to enable the analyst is <strong>data modeling</strong>, the process of figuring out what data types are relevant to a domain, defining what they represent in the world, and deciding how to represent them in the system.  At Palantir we make sure our data model (ontology) is both flexible and dynamic, and that it mirrors the concepts people naturally use when reasoning about the domain.  This is no small challenge, but we&#8217;re already making it a reality.  In finance our basic data types include financial instruments, dates, portfolios, indices, and strategies&#8212;the same things that financial researchers think about, talk about, and reason with.  In the intelligence product our basic data types include people, places, and events (all with associated properties), which is exactly the way we all represent the world in our minds.</p>
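<p>To make &#8220;flexible and dynamic&#8221; a little more concrete, here is a toy sketch of a data model whose types can be extended at runtime.  It is purely illustrative: the <code>Entity</code> and <code>Ontology</code> classes and their methods are invented for this example and say nothing about how our actual data model is implemented.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A domain object such as a person, place, or event."""
    kind: str
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Ontology:
    """Tracks the allowed kinds so the model can grow while in use."""
    kinds: set = field(default_factory=set)
    entities: list = field(default_factory=list)

    def define(self, kind):
        """Register a new kind of object without changing any code."""
        self.kinds.add(kind)

    def add(self, kind, name, **properties):
        """Create an entity of a known kind with arbitrary properties."""
        if kind not in self.kinds:
            raise ValueError(f"unknown kind: {kind}")
        entity = Entity(kind, name, dict(properties))
        self.entities.append(entity)
        return entity

    def find(self, kind):
        """Return every entity of the given kind."""
        return [e for e in self.entities if e.kind == kind]


model = Ontology()
for kind in ("person", "place", "event"):
    model.define(kind)

model.add("person", "Alice", role="analyst")
model.add("place", "Palo Alto", country="US")
model.add("event", "Quarterly review", date="2007-12-04")

# The domain changes: a new concept is added on the fly.
model.define("portfolio")
model.add("portfolio", "Tech fund", currency="USD")

print([e.name for e in model.find("person")])
```

<p>The design choice worth noticing is that kinds and properties are data rather than code, so the vocabulary can follow the way people in the domain actually talk.</p>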
<p>Data modeling, data summarization, and data visualization are the core disciplines for approaching large data sets.  Human-driven queries, rapid iteration, and collaboration are multipliers, taking the power unlocked by the core disciplines to the next level.  When these pieces are brought together in a coherent system, the result is an analysis platform that is both very generic and very powerful.</p>
<p>This is what we mean when we say that we&#8217;re changing the way people approach data.  Welcome to the future of analysis.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2007/12/04/what-do-we-do/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Palantir: embodying a 50-year-old vision of the future?</title>
		<link>http://blog.palantirtech.com/2007/03/16/human-computer-symbiosis/</link>
		<comments>http://blog.palantirtech.com/2007/03/16/human-computer-symbiosis/#comments</comments>
		<pubDate>Fri, 16 Mar 2007 22:53:48 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[softwarephilosophy]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/2007/03/16/human-computer-symbiosis/</guid>
		<description><![CDATA[Here at Palantir, Charles Cooper&#8217;s recent piece on CNET about J. C. R. Licklider has struck us as a very timely piece of journalism. Licklider was a very influential man, with Cooper even crediting him for the existence of Computer Science as a modern-day field: Until Licklider began his work at ARPA, there were no [...]]]></description>
			<content:encoded><![CDATA[<p>Here at Palantir, <a href="http://news.com.com/Lickliders+vision+of+the+Digital+Age/2010-1012_3-6167919.html">Charles Cooper&#8217;s recent piece on CNET</a> about <a href="http://www.ibiblio.org/pioneers/licklider.html">J. C. R. Licklider</a> has struck us as a very timely piece of journalism.</p>
<p>Licklider was a very influential man, with Cooper even crediting him for the existence of Computer Science as a modern-day field:</p>
<blockquote><p>Until Licklider began his work at ARPA, there were no Ph.D. programs in computer science at American universities. That changed after ARPA began handing out grants to promising students, a practice that convinced MIT, Stanford, UC Berkeley and Carnegie Mellon to start their own graduate programs in computer science in 1965. Maybe that should go down as Licklider&#8217;s most lasting legacy.
</p></blockquote>
<p>In the piece, Cooper references this influential and well-known work: <a href="http://blog.palantirtech.com/man-computer-symbiosis/"><em>Man-Computer Symbiosis</em></a>, by <a href="http://www.ibiblio.org/pioneers/licklider.html">J. C. R. Licklider</a>, published in IRE Transactions on Human Factors in Electronics, volume HFE-1, pages 4-11, March 1960.</p>
<p>That&#8217;s right, it was written almost 50 years ago.  That said, it&#8217;s incredibly relevant today, perhaps more than ever. </p>
<p>Here&#8217;s the abstract:</p>
<blockquote><p>Man-computer symbiosis is an expected development in cooperative interaction between men and electronic computers. It will involve very close coupling between the human and the electronic members of the partnership. The main aims are 1) to let computers facilitate formulative thinking as they now facilitate the solution of formulated problems, and 2) to enable men and computers to cooperate in making decisions and controlling complex situations without inflexible dependence on predetermined programs. In the anticipated symbiotic partnership, men will set the goals, formulate the hypotheses, determine the criteria, and perform the evaluations. Computing machines will do the routinizable work that must be done to prepare the way for insights and decisions in technical and scientific thinking. Preliminary analyses indicate that the symbiotic partnership will perform intellectual operations much more effectively than man alone can perform them. Prerequisites for the achievement of the effective, cooperative association include developments in computer time sharing, in memory components, in memory organization, in programming languages, and in input and output equipment.</p></blockquote>
<p>Licklider&#8217;s account of how he spent his own working hours is still a pretty accurate description of how most analysts (in any industry or field) go about their business:</p>
<blockquote><p>
Despite the fact that there is a voluminous literature on thinking and problem solving, including intensive case-history studies of the process of invention, I could find nothing comparable to a time-and-motion-study analysis of the mental work of a person engaged in a scientific or technical enterprise. In the spring and summer of 1957, therefore, I tried to keep track of what one moderately technical person actually did during the hours he regarded as devoted to work. Although I was aware of the inadequacy of the sampling, I served as my own subject.<br />
&#8230;<br />
About 85 per cent of my &#8220;thinking&#8221; time was spent getting into a position to think, to make a decision, to learn something I needed to know. Much more time went into finding or obtaining information than into digesting it. Hours went into the plotting of graphs, and other hours into instructing an assistant how to plot. When the graphs were finished, the relations were obvious at once, but the plotting had to be done in order to make them so.<br />
&#8230;<br />
Throughout the period I examined, in short, my &#8220;thinking&#8221; time was devoted mainly to activities that were essentially clerical or mechanical: searching, calculating, plotting, transforming, determining the logical or dynamic consequences of a set of assumptions or hypotheses, preparing the way for a decision or an insight. Moreover, my choices of what to attempt and what not to attempt were determined to an embarrassingly great extent by considerations of clerical feasibility, not intellectual capability.
</p></blockquote>
<p>This quote is an eerily accurate description of how trading strategies are formulated, back-tested, and implemented these days.  As an analogy, it&#8217;s also an accurate reflection of the modern use of information processing in the intelligence space.</p>
<p>To wit:</p>
<blockquote><p>It is to bring computing machines effectively into processes of thinking that must go on in &#8220;real time,&#8221; time that moves too fast to permit using computers in conventional ways. Imagine trying, for example, to direct a battle with the aid of a computer on such a schedule as this. You formulate your problem today. Tomorrow you spend with a programmer. Next week the computer devotes 5 minutes to assembling your program and 47 seconds to calculating the answer to your problem. You get a sheet of paper 20 feet long, full of numbers that, instead of providing a final solution, only suggest a tactic that should be explored by simulation. Obviously, the battle would be over before the second step in its planning was begun. To think in interaction with a computer in the same way that you think with a colleague whose competence supplements your own will require much tighter coupling between man and machine than is suggested by the example and than is possible today.
</p></blockquote>
<p>So how does this relate to what we do? In the finance world, much of what fund managers and analysts do in building strategies has to do with formulating trading models and then building spreadsheets that can back-test or simulate the performance of those models.</p>
<p>Our finance tool obviates the need for this &#8220;clerical, mechanical&#8221; work, allowing strategists to spend more time making sense of the interconnections in the market and formulating nuanced trading strategies and less time doing model-building in Excel.  We take the state of the art a quantum leap forward in terms of financial analysis: rather than just allowing analysts to quickly build models and back-test trading strategies, we&#8217;ve built a tool that allows for a smooth flow from hypothesis to theory, with the software doing all the heavy lifting, data wrangling, and eye-candy-class presentation.  New variables or market conditions can be incorporated on the fly without the need for a pause from high-level thinking to gather data or marshal it into the right format.  Knowledge can be divined by asking questions about high-level concepts such as dynamic market conditions and meta-conditions like the volatility-of-volatility.</p>
<p>The question has traditionally been, &#8220;How do I effectively model this financial space?&#8221;  With Palantir, we&#8217;re transforming that question into the core question asked in the finance industry, namely, &#8220;How can I better understand the interactions at work in today&#8217;s markets?&#8221;  So the focus moves to the human-level questions while the software takes care of the data level machinations.</p>
<p>In the intelligence space, the composite views of data that the government team creates save the analysts from having to painstakingly research and record correlations across multiple informational domains.  Instead, the analyst can spend time divining the meaning behind the connections and correlations. Our take on <a href="http://jeffjonas.typepad.com/jeff_jonas/2006/02/what_do_you_kno.html">perpetual analytics</a> takes things a step further, alerting the analyst as relevant new information enters the system. And finally, we&#8217;re building workflows that allow analysts to quickly attach &#8216;handles&#8217; to data, giving what has traditionally been unstructured data a seat at this table of computer-enhanced human analysis.</p>
<p>We&#8217;re speeding up the process of analysis by creating an analyst-computer symbiosis.  No longer will people need to spend time doing menial data processing: the computers will do it for them, while the humans provide the spark of insight, semantics, and cognition that computers lack.</p>
<p><strong>It&#8217;s conceptual analysis at the speed of thought</strong>.</p>
<p>This is why I&#8217;m excited to come to work every day: we&#8217;re building the software that embodies a broad vision of the future. This vision of human-computer symbiosis dates from five decades ago but is also apparent in every interaction we see with computers on the big and small screens (no, not our monitors).  From Star Trek to 24, people want the computers to do the repetitive and time-consuming simple work but leave people the final say on any complex decisions. As one of our customers told us when shown our application: <strong>this is the future.</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2007/03/16/human-computer-symbiosis/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Thoughts on security</title>
		<link>http://blog.palantirtech.com/2007/02/14/thoughts-on-security/</link>
		<comments>http://blog.palantirtech.com/2007/02/14/thoughts-on-security/#comments</comments>
		<pubDate>Thu, 15 Feb 2007 01:35:21 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[softwarephilosophy]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/2007/02/14/thoughts-on-security/</guid>
		<description><![CDATA[A quick read on the pitfalls of designing computer security, The Six Dumbest Ideas In Security: Let me introduce you to the six dumbest ideas in computer security. What are they? They&#8217;re the anti-good ideas. They&#8217;re the braindamage that makes your $100,000 ASIC-based turbo-stateful packet-mulching firewall transparent to hackers. Where do anti-good ideas come from? [...]]]></description>
			<content:encoded><![CDATA[<p>A quick read on the pitfalls of designing computer security, <a href="http://www.ranum.com/security/computer_security/editorials/dumb/">The Six Dumbest Ideas In Security</a>:</p>
<blockquote><p>Let me introduce you to the six dumbest ideas in computer security. What are they? They&#8217;re the anti-good ideas. They&#8217;re the braindamage that makes your $100,000 ASIC-based turbo-stateful packet-mulching firewall transparent to hackers. Where do anti-good ideas come from? They come from misguided attempts to do the impossible &#8211; which is another way of saying &#8220;trying to ignore reality.&#8221; Frequently those misguided attempts are sincere efforts by well-meaning people or companies who just don&#8217;t fully understand the situation, but other times it&#8217;s just a bunch of savvy entrepreneurs with a well-marketed piece of junk they&#8217;re selling to make a fast buck. In either case, these dumb ideas are the fundamental reason(s) why all that money you spend on information security is going to be wasted, unless you somehow manage to avoid them.</p></blockquote>
<p>A well-written piece that&#8217;s worth reading for anyone who&#8217;s implementing computer security, either at an operational level or as a software engineer.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2007/02/14/thoughts-on-security/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

