<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Palantir Technologies &#187; enterprise engineering</title>
	<atom:link href="http:///category/enterprise-engineering/feed/" rel="self" type="application/rss+xml" />
	<link></link>
	<description>Articles from the Engineering Group at Palantir Technologies</description>
	<lastBuildDate>Wed, 14 Dec 2011 17:48:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>How to Rock a Systems Design Interview</title>
		<link>http://blog.palantirtech.com/2011/10/28/how-to-rock-a-systems-design-interview/</link>
		<comments>http://blog.palantirtech.com/2011/10/28/how-to-rock-a-systems-design-interview/#comments</comments>
		<pubDate>Fri, 28 Oct 2011 15:00:41 +0000</pubDate>
		<dc:creator>John Carrino</dc:creator>
				<category><![CDATA[development process]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[interviewing]]></category>
		<category><![CDATA[palantir]]></category>
		<category><![CDATA[softwarephilosophy]]></category>
		<category><![CDATA[tips and tricks]]></category>

		<guid isPermaLink="false">https://wp-admin-techblog.yojoe.local/?p=1937</guid>
		<description><![CDATA[Comic courtesy of XKCD, via Creative Commons License Note: this is the third installment in our series on doing your best in interviews. Previously: &#8220;How to Rock an Algorithms Interview&#8221; and &#8220;The Coding Interview&#8221;. One interview that candidates often struggle with is the systems design interview. Even if you know your algorithms and write clean code, that [...]]]></description>
			<content:encoded><![CDATA[<div style='text-align: center'><a href='https://www.xkcd.com/754/'><img style='width: 100%' src='/wp-content/uploads/2011/10/dependencies.png' alt='Compiler design dependency comic, originally from http://www.xkcd.com/754/' title='Comic originally from http://www.xkcd.com/754/' /></a>
<div style='text-align: right; font-size: 0.6em; margin-bottom: 1em;'>Comic courtesy of <a href='http://www.xkcd.com/754/'>XKCD</a>, via Creative Commons License</div>
</div>
<p>
<span style='font-size: 0.7em'><em>Note: this is the third installment in our series on doing your best in interviews.  Previously: <a href="/2011/09/26/how-to-rock-an-algorithms-interview/" title="How to Rock an Algorithms Interview" target="_blank">&#8220;How to Rock an Algorithms Interview&#8221;</a> and <a href="/2011/10/03/the-coding-interview/" title="The Coding Interview" target="_blank">&#8220;The Coding Interview&#8221;</a>.</em></span>
</p>
<p>One interview that candidates often struggle with is the systems design interview. Even if you know your algorithms and write clean code, that code needs to run on a computer somewhere &mdash; and then things quickly get complicated. A truly unbelievable amount of complexity lies beneath something as simple as <a href="https://plus.google.com/112218872649456413744/posts/dfydM2Cnepe">visiting Google in your browser</a>. While most of that complexity is abstracted away from the end user, as a system designer you have to face it head on, and the more you can handle, the better.</p>
<p>At Palantir, many of our teams give a systems design interview along with an <a href="http://blog.palantir.com/2011/09/26/how-to-rock-an-algorithms-interview/">algorithms interview</a> and a couple of <a href="http://blog.palantir.com/2011/10/03/the-coding-interview/">coding interviews</a>. We don’t expect anyone to be an expert at all three disciplines (although some are). We’re looking for generalists with depth &mdash; people who are good at most things, and great at some. If systems design isn&#8217;t your strength, that’s okay, but you should at least be able to talk and reason competently about a complex system.</p>
<p>Read on to learn about what we&#8217;re looking for and how you can prepare.</p>
<p><span id="more-1937"></span></p>
<h2>We’re measuring three things</h2>
<p>Nominally, this interview appears to require knowledge of <strong>systems</strong> and a knack for <strong>design</strong> &mdash; and it does. What makes it interesting, though, and sets it apart from a coding or an algorithms interview, is that whatever solution you come up with during the interview is just a side effect. What we actually care about is the process. </p>
<p>In other words, the systems design interview is all about <strong>communication</strong>. </p>
<p>This reflects what actually working at Palantir is like. As engineers we have a tremendous amount of freedom. We aren’t asked to implement fully-specced features. Instead we take ownership of <em>open-ended problems</em>, and it’s our job to come up with the best solution to each. We need people we can trust to do the right thing without a lot of supervision &mdash; people who can own large projects and take them consistently in the right direction. Invariably, this means being able to communicate effectively with the people around you. Working on <a href="http://blog.palantir.com/2009/11/06/palantir-like-an-operating-system-for-data-analysis/">problems with huge scope</a> isn&#8217;t something you can do in a vacuum.</p>
<h2>It&#8217;s an open-ended conversation</h2>
<p>Usually we’ll start by asking you to design a system that performs a given task. The prompt will be simple, but don’t be fooled &mdash; these problems are wide and bottomless, and the point of the interview is to see how much volume you can cover in 45 minutes.</p>
<p>For the most part, you’ll be steering the conversation. It’s up to you to understand the problem. That might mean asking questions, sketching diagrams on the board, and bouncing ideas off your interviewer. Do you know the constraints? What kind of inputs does your system need to handle? You have to get a sense for the scope of the problem before you start exploring the space of possible solutions. And remember, there is no single right answer to a real-world problem. Everything is a tradeoff.</p>
<h2>Topics</h2>
<p>Systems are complex, and when you’re designing a system you’re grappling with its full complexity. Given this, there are many topics you should be familiar with, such as:</p>
<ul>
<li><b>Concurrency.</b> Do you understand threads, deadlock, and starvation? Do you know how to parallelize algorithms? Do you understand consistency and coherence?</li>
<li><b>Networking.</b> Do you roughly understand <a href='https://secure.wikimedia.org/wikipedia/en/wiki/Inter-process_communication'>IPC</a> and <a href='https://secure.wikimedia.org/wikipedia/en/wiki/Internet_Protocol_Suite'>TCP/IP</a>? Do you know the difference between throughput and latency, and when each is the relevant factor?</li>
<li><b>Abstraction.</b> You should understand the systems you’re building upon. Do you know roughly how an OS, file system, and database work? Do you know about the various levels of caching in a modern OS?</li>
<li><b>Real-World Performance.</b> You should be familiar with the <a href="http://everythingisdata.wordpress.com/2009/10/17/numbers-everyone-should-know/">speed of everything</a> your computer can do, including the relative performance of RAM, disk, SSD and your network.</li>
<li><b>Estimation.</b> Estimation, especially in the form of a back-of-the-envelope calculation, is important because it helps you narrow down the list of possible solutions to only the ones that are feasible. Then you have only a few prototypes or micro-benchmarks to write.</li>
<li><b>Availability and Reliability.</b> Are you thinking about how things can fail, especially in a <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Fallacies_of_Distributed_Computing">distributed environment</a>? Do you know how to design a system to cope with network failures? Do you understand durability?</li>
</ul>
<p>Remember, we&#8217;re not looking for mastery of all these topics. We&#8217;re looking for <em>familiarity</em>. We just want to make sure you have a good lay of the land, so you know which questions to ask and when to consult an expert.</p>
<h2>How to prepare</h2>
<p>How do you get better at something? If your answer isn’t along the lines of &#8220;practice&#8221; or &#8220;hard work,&#8221; then I have a bridge to sell you. Just like you have to write a lot of code to get better at coding and do a lot of drills to get really good at basketball, you’ll need practice to get better at design. Here are some activities that can help:</p>
<ul>
<li><strong>Do mock design sessions.</strong> Grab an empty room and a fellow engineer, and ask her to give you a design problem, preferably related to something she&#8217;s worked on. Don&#8217;t think of it as an interview &mdash; just try to come up with the best solution you can. Design interviews are similar to actual design sessions, so getting better at one will make you better at the other.</li>
<li><strong>Work on an actual system</strong>. Contribute to OSS or build something with a friend. Treat your class projects as more than just academic exercises &mdash; actually focus on the architecture and the tradeoffs behind each decision. As with most things, the best way to learn is by doing.</li>
<li><strong>Do back-of-the-envelope calculations for something you&#8217;re building and then write micro-benchmarks to verify them.</strong> If your micro-benchmarks don&#8217;t match your back-of-the-envelope numbers, some part of your mental model will have to give, and you&#8217;ll learn something in the process.</li>
<li><strong>Dig into the performance characteristics of an open source system.</strong>  For example, take a look at <a href="https://code.google.com/p/leveldb/">LevelDB</a>.  It&#8217;s new and clean and small and well-documented. Read about the <a href="http://leveldb.googlecode.com/svn/trunk/doc/impl.html">implementation</a> to understand how it stores its data on disk and how it compacts the data into levels. Ask yourself questions about tradeoffs: which kinds of data and sizes are optimal, and which degrade read/write performance? <em>(Hint: think about random vs. sequential writes.)</em></li>
<li><strong>Learn how databases and operating systems work</strong> under the hood. These technologies are not only tools in your belt, but also a great source of design inspiration. If you can  think like a DB or an OS and understand how each solves the problems it was designed to solve, you&#8217;ll be able to apply that mindset to other systems.</li>
</ul>
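<p>As a rough illustration of the back-of-the-envelope habit described above, here is a minimal Java sketch that estimates how long it takes to read 1 GB from RAM, from disk sequentially, and from disk with one seek per record. The throughput and seek figures are assumed round numbers in the spirit of the &#8220;speed of everything&#8221; list linked earlier, not measurements, and the class and method names are hypothetical.</p>

```java
// Back-of-the-envelope estimate: reading 1 GB sequentially vs. with random seeks.
// All constants are assumed order-of-magnitude figures, not measurements.
final class EnvelopeEstimate {
    static final double RAM_MB_PER_SEC = 4000.0;   // assumed sequential RAM read rate
    static final double DISK_MB_PER_SEC = 50.0;    // assumed sequential disk read rate
    static final double DISK_SEEK_MS = 10.0;       // assumed cost of one disk seek

    /** Seconds to read sizeMb megabytes sequentially at the given rate. */
    static double sequentialSeconds(double sizeMb, double mbPerSec) {
        return sizeMb / mbPerSec;
    }

    /** Seconds to read recordCount records when each one costs a disk seek. */
    static double randomReadSeconds(long recordCount) {
        return recordCount * DISK_SEEK_MS / 1000.0;
    }

    public static void main(String[] args) {
        double sizeMb = 1024.0;      // 1 GB of data
        long records = 1_000_000L;   // assumed: roughly 1 KB per record
        System.out.printf("RAM scan:         %.2f s%n", sequentialSeconds(sizeMb, RAM_MB_PER_SEC));
        System.out.printf("Disk scan:        %.2f s%n", sequentialSeconds(sizeMb, DISK_MB_PER_SEC));
        System.out.printf("Disk, 1 seek/rec: %.0f s%n", randomReadSeconds(records));
    }
}
```

<p>Under these assumed numbers the seek-per-record case takes about 10,000 seconds against roughly 20 seconds for a sequential disk scan. That gap is the kind of result a micro-benchmark should then confirm or contradict on your own hardware.</p>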
<h2>Final thought: relax and be creative</h2>
<p>The systems design interview can be difficult, but it&#8217;s also a place to be creative and to take joy in the imagining of systems unbuilt. If you listen carefully, make sure you fully understand the problem, and then take a clear, straightforward approach to communicating your ideas, you should do fine.</p>
<p>Good luck!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2011/10/28/how-to-rock-a-systems-design-interview/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Inside Horizon: interactive analysis at cloud scale</title>
		<link>http://blog.palantirtech.com/2011/04/15/inside-horizon-interactive-analysis-at-cloud-scale/</link>
		<comments>http://blog.palantirtech.com/2011/04/15/inside-horizon-interactive-analysis-at-cloud-scale/#comments</comments>
		<pubDate>Fri, 15 Apr 2011 19:04:46 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[distributed systems]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[enterprise software]]></category>
		<category><![CDATA[Human-Computer Symbiosis]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">https://wp-admin-techblog.yojoe.local/?p=1837</guid>
		<description><![CDATA[Late last year, we were honored to be invited to talk at Reflections&#124;Projections, ACM@UIUC&#8217;s annual student-run computing conference. We decided to bring a talk about Horizon, our system for doing aggregate analysis and filtering across very large amounts of data. The video of the talk was posted a few weeks back on the conference website. [...]]]></description>
			<content:encoded><![CDATA[<div style='width: 250; margin-left: 10px; margin-bottom: 10px; float: right;'><a href="http://www.acm.uiuc.edu/conference/2010/"><img src="http://blog.palantir.com/wp-content/uploads/2011/03/reflectionsprojections.png" alt="" title="reflectionsprojections" width="250" height="215"/></a></div>
<p>Late last year, we were honored to be invited to talk at Reflections|Projections, ACM@UIUC&#8217;s annual student-run computing conference.  We decided to bring a talk about Horizon, our system for doing aggregate analysis and filtering across very large amounts of data.  The video of the talk was posted a few weeks back on <a href="http://www.acm.uiuc.edu/Conferenceware/Schedule/Videos">the conference website</a>.</p>
<p>Horizon started as a research project / technology demonstrator built as part of Palantir&#8217;s Hack Week &#8211; a periodic innovation sprint that our engineering team uses to build brand new ideas from whole cloth.  It was then used by the Center For Public Integrity in their <a href="http://www.publicintegrity.org/investigations/economic_meltdown/">Who&#8217;s Behind The Subprime Meltdown</a> report.  We produced a short video on the subject, <a href="http://www.palantirtech.com/government/analysis-blog/horizon">Beyond the Cloud: Project Horizon</a>, released on our analysis blog.  Subsequently, it was folded into our product offering, under the name <a href="http://www.palantirtech.com/labs/object-explorer">Object Explorer</a>.</p>
<p>In this hour-long talk, two of the engineers that built this technology tell the story of how Horizon came to be, how it works, and show a live demo of doing analysis on hundreds of millions of records in interactive time.</p>
<p><iframe title="YouTube video player" width="640" height="510" src="http://www.youtube.com/embed/9dOpDeRMTMc" frameborder="0" allowfullscreen></iframe></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2011/04/15/inside-horizon-interactive-analysis-at-cloud-scale/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Decorator Pattern: Implementing decorators using forwarding classes</title>
		<link>http://blog.palantirtech.com/2011/03/01/decorator-pattern-implementing-decorators-using-forwarding-classes/</link>
		<comments>http://blog.palantirtech.com/2011/03/01/decorator-pattern-implementing-decorators-using-forwarding-classes/#comments</comments>
		<pubDate>Tue, 01 Mar 2011 21:29:18 +0000</pubDate>
		<dc:creator>Allen Chang</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[tips and tricks]]></category>

		<guid isPermaLink="false">https://wp-admin-techblog.yojoe.local/?p=1821</guid>
		<description><![CDATA[A forwarding class is an abstract base class which makes it easier to implement decorators for a particular interface. A forwarding class simply forwards all calls it receives to some delegate; a decorator can then be implemented by extending the forwarding class and overriding the relevant methods. Here&#8217;s an example implementation of a forwarding class [...]]]></description>
			<content:encoded><![CDATA[<p>A forwarding class is an abstract base class which makes it easier to implement decorators for a particular interface. A forwarding class simply forwards all calls it receives to some delegate; a <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Decorator_pattern">decorator</a> can then be implemented by extending the forwarding class and overriding the relevant methods. Here&#8217;s an example implementation of a forwarding class for the <code>Collection&lt;E&gt;</code> interface (from <a href="https://code.google.com/p/google-collections/">Google Collections</a>):</p>
<pre class="brush: java; title: ; notranslate">
public abstract class ForwardingCollection&lt;E&gt; extends ForwardingObject implements Collection&lt;E&gt; {
  @Override protected abstract Collection&lt;E&gt; delegate();

  public boolean add(E element) {
    return delegate().add(element);
  }

  // ... (more overridden methods) ...
}
</pre>
<p>Things worth noting:</p>
<ul>
<li>This class overrides the <code>delegate()</code> method to return a <i>more specific</i> type.</li>
<li>The class contains no instance variables and <i>does not</i> have to be marked <a href="http://download.oracle.com/javase/1.5.0/docs/api/java/io/Serializable.html">Serializable</a>.</li>
</ul>
<p>Now, here&#8217;s an example of how to implement a decorator using a forwarding class (also from Google Collections):</p>
<pre class="brush: java; title: ; notranslate">
  public class ConstrainedCollection&lt;E&gt; extends ForwardingCollection&lt;E&gt; {
    private final Collection&lt;E&gt; delegate;
    private final Constraint&lt;? super E&gt; constraint;

    public ConstrainedCollection(Collection&lt;E&gt; delegate, Constraint&lt;? super E&gt; constraint) {
      this.delegate = checkNotNull(delegate);
      this.constraint = checkNotNull(constraint);
    }
    @Override protected Collection&lt;E&gt; delegate() {
      return delegate;
    }
    @Override public boolean add(E element) {
      constraint.checkElement(element);
      return super.add(element);
    }
    @Override public boolean addAll(Collection&lt;? extends E&gt; elements) {
      return super.addAll(checkElements(elements, constraint));
    }
  }
</pre>
<p>This class implements a collection that checks a constraint on all elements added to the collection. Note that writing the decorator is easy given the <a href="https://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/ForwardingObject.html">forwarding class</a> &#8212; only the methods relevant to the decorator need to be overridden.</p>
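<p>For readers without Google Collections on the classpath, the same pattern can be sketched with the JDK alone. The <code>Bag</code> interface and every class name below are hypothetical, chosen only to keep the example short; they stand in for <code>Collection&lt;E&gt;</code> and its forwarding class and are not part of any library.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical two-method interface, standing in for Collection<E> to keep the sketch short.
interface Bag<E> {
    boolean add(E element);
    int size();
}

// Plain implementation backed by an ArrayList.
final class ListBag<E> implements Bag<E> {
    private final List<E> items = new ArrayList<>();
    @Override public boolean add(E element) { return items.add(element); }
    @Override public int size() { return items.size(); }
}

// Forwarding class: no state of its own, every call goes to delegate().
abstract class ForwardingBag<E> implements Bag<E> {
    protected abstract Bag<E> delegate();
    @Override public boolean add(E element) { return delegate().add(element); }
    @Override public int size() { return delegate().size(); }
}

// Decorator: overrides only add() to reject nulls; size() is the inherited forwarding method.
final class NonNullBag<E> extends ForwardingBag<E> {
    private final Bag<E> delegate;
    NonNullBag(Bag<E> delegate) { this.delegate = delegate; }
    @Override protected Bag<E> delegate() { return delegate; }
    @Override public boolean add(E element) {
        if (element == null) {
            throw new NullPointerException("null elements are not allowed");
        }
        return super.add(element);
    }
}

final class ForwardingDemo {
    public static void main(String[] args) {
        Bag<String> bag = new NonNullBag<>(new ListBag<>());
        bag.add("a");
        System.out.println(bag.size());
        try {
            bag.add(null);
        } catch (NullPointerException expected) {
            System.out.println("rejected null");
        }
    }
}
```

<p>The decorator touches one method. If <code>Bag</code> later gained more methods, only <code>ForwardingBag</code> would need new forwarding stubs, and existing decorators would keep compiling.</p>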
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2011/03/01/decorator-pattern-implementing-decorators-using-forwarding-classes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Palantir: search with a twist (part two: realtime indexing and security)</title>
		<link>http://blog.palantirtech.com/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/</link>
		<comments>http://blog.palantirtech.com/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 07:01:01 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1260</guid>
		<description><![CDATA[[A number of weeks ago, we published a post on the search technology used by Palantir. That post covered raising the memory efficiency of a couple of operations. This is part two of that series.] The most familiar use of search engines is to index documents made available on the Internet via the hypertext transfer [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 15px; margin-bottom: 15px'><img src='/wp-content/uploads/2009/08/200px-magnifying_glass_icon.png' alt='magnifying glass'/></div>
<p><em>[A number of weeks ago, we published <a href="http://blog.palantirtech.com/2009/08/13/palantir-search-with-a-twist-part-one-memory-efficiency/">a post on the search technology</a> used by Palantir.  That post covered raising the memory efficiency of a couple of operations.  This is part two of that series.]</em></p>
<p>The most familiar use of search engines is to index documents made available on the Internet via the <a href="http://www.ietf.org/rfc/rfc2616.txt">hypertext transfer protocol</a>. Forgotten names like <a href="http://en.wikipedia.org/wiki/AltaVista">AltaVista</a>, names not-yet-really-learned like <a href="http://web.archive.org/web/20040828134017/http://www.bing.com/">Bing</a>, and, of course, <a href="http://infolab.stanford.edu/~backrub/google.html">Google</a> come to mind.</p>
<p>This one, massive use case has a couple of properties that I&#8217;d like to highlight:</p>
<ul>
<li>Asynchronous indexing and querying &#8211; web search engines tend to use crawlers and indexers to build up an index of the web.  After each crawl is finished, the new index is brought online for use by the query engine.</li>
<li>Lack of access controls &#8211; all the data in the index is available to any query.  In fact, most queries are (from the standpoint of the index) completely anonymous.</li>
</ul>
<h3>Palantir: not a web search engine</h3>
<p>Search technology is just one part of what makes up a Palantir system.  For us, it&#8217;s a way to quickly retrieve Palantir objects in a Palantir system; it&#8217;s not the whole of the application.</p>
<p>I&#8217;d like to highlight a couple of differences from the <a href="http://en.wikipedia.org/wiki/Web_search_engine">web search engine</a> case.  A Palantir system needs the following properties:</p>
<ul>
<li>Realtime indexing and querying &#8211; we need information to be available immediately as it changes in the system.</li>
<li>Leak-proof access controls &#8211; we need the search engine to help us make sure that we don&#8217;t have information leaking across access control boundaries.</li>
</ul>
<p>Hit the link to read more about these topics.<br />
<span id="more-1260"></span></p>
<h2>Realtime indexing</h2>
<p>The Palantir platforms implement realtime indexing: as soon as an analyst changes an object in the system, it needs to be available to query. This could be a change to data in the object or a change to the security tags on the object.</p>
<p>From a programming perspective, this is pretty straightforward: a Palantir transaction will not commit until the search engine is finished indexing the new data.</p>
<p>From a search engine operational perspective, this introduces some challenges.  Asynchronous indexing allows the search engines to bring online a highly optimized static form of the index.  Contrast that with realtime indexing, where every cycle spent optimizing the index is a cycle taken away from serving other queries, and there is likely a human waiting for the optimizing process to finish.</p>
<p>When using the static index, a query only accesses one, optimized index file which then points to the documents containing the results.  However, as changes and additions are indexed into the system, there is a lot of overhead to merging them into the master index.</p>
<p>Instead of merging and optimizing on every change, Lucene can keep around a number of smaller indexes that hold all the fresh entries.  These are fixed-size append-only segments that are much cheaper to write to than the optimized and merged form of the index. So basically, these &#8216;dynamic&#8217; indexes are linear lists of single-document indexes.  When the search engine goes to run a query it has to follow this simple (yet expensive) algorithm:</p>
<ul>
<li>Query the static, merged index, accumulating results. <i>(this part is reasonably fast)</i></li>
<li>For each of the dynamic indexes:
<ul>
<li>Open the file, incurring IO overhead.</li>
<li>Query each single-document index and look for additional records or newer records that supersede one of the existing found results.</li>
</ul>
</li>
</ul>
<p>You can see how the overhead of this can quickly get pretty large as the number of dynamic indexes grows: it grows linearly with the number of newly indexed records.  Compare that with the optimized index, which should be close to constant time for any given query.</p>
<p>To get around this, the indexer will only allow a certain number of these dynamic indexes to accumulate before it kicks off a background merge job.  During the merge job, we take a noticeable performance hit, but by batching up the merge run we amortize the overhead away for an overall performance win.  This hybrid mode didn&#8217;t require us to write any new code, but just to tune Lucene to give us the performance profile we wanted.</p>
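<p>The hybrid scheme described above can be modeled in a few lines of Java. This is a toy sketch, not Lucene&#8217;s API: the <code>HybridIndex</code> class, its <code>maxSegments</code> knob, and the whitespace-style tokenizing are all assumptions made for illustration. It also merges synchronously rather than in a background job, and it does not handle newer records superseding older ones.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of a merged index plus small append-only segments. Not Lucene's API.
final class HybridIndex {
    // One freshly indexed document: its id and the terms it contains.
    private static final class Segment {
        final int docId;
        final Set<String> terms;
        Segment(int docId, Set<String> terms) { this.docId = docId; this.terms = terms; }
    }

    private final Map<String, Set<Integer>> merged = new HashMap<>();
    private final List<Segment> segments = new ArrayList<>();
    private final int maxSegments;   // assumed tuning knob: merge when exceeded

    HybridIndex(int maxSegments) { this.maxSegments = maxSegments; }

    /** Index a document cheaply by appending a segment; merge if too many pile up. */
    void add(int docId, String text) {
        Set<String> terms = new TreeSet<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        segments.add(new Segment(docId, terms));
        if (segments.size() > maxSegments) merge();
    }

    /** Fold every pending segment into the merged index (the expensive batch step). */
    void merge() {
        for (Segment s : segments) {
            for (String t : s.terms) {
                merged.computeIfAbsent(t, k -> new TreeSet<>()).add(s.docId);
            }
        }
        segments.clear();
    }

    /** One lookup in the merged index, then a linear pass over pending segments. */
    Set<Integer> query(String term) {
        String key = term.toLowerCase();
        Set<Integer> hits = new TreeSet<>(merged.getOrDefault(key, new TreeSet<>()));
        for (Segment s : segments) {
            if (s.terms.contains(key)) hits.add(s.docId);
        }
        return hits;
    }

    int pendingSegments() { return segments.size(); }
}
```

<p>The query cost has the same shape as in the text: a single map lookup for the merged part, plus work proportional to the number of pending segments. Raising <code>maxSegments</code> makes writes cheaper and queries slower, which is the tradeoff being tuned.</p>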
<h2>Preventing Information Leaks</h2>
<p>The Palantir data platform has a fairly sophisticated security model baked in (see <em><a href="http://www.palantirtech.com/government/videos/whitevideos">The White Videos</a></em> for a more in-depth look at the security model).  One of the features that we have implemented is the ability to show a narrower view of an object based on the user&#8217;s permissions: the user only sees the slice of the data that they have been granted access to.  Part of the complexity in implementing this was that we can&#8217;t even hint that the other, hidden data exists at all.</p>
<p>Search engines rank their results by relevance, showing the matches to the query that they believe to be most relevant first.  One common way to make these relevance calculations is by comparing the length of the search term or phrase to the length of the term that it matched.  Consider the search term &#8216;king&#8217;: it will match the following phrases:</p>
<ul>
<li>&#8220;I&#8217;m the king of the world!&#8221;</li>
<li>&#8220;King salmon are often found in the Pacific Northwest and are also known as Chinook salmon.&#8221;</li>
<li>&#8220;Yes, my king.&#8221;</li>
</ul>
<p>Using a length-computed relevance, the phrase, &#8220;Yes, my king.&#8221; is the most relevant.</p>
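<p>One simple way to express a length-computed relevance is the fraction of the matched phrase that the search term covers. The Java sketch below uses that formula as an assumption for illustration; it is not Lucene&#8217;s actual scoring, and the class and method names are hypothetical.</p>

```java
import java.util.Arrays;
import java.util.List;

// Length-based relevance: the share of the matched phrase that the search term covers.
// This is an illustrative formula, not Lucene's actual scoring.
final class LengthRelevance {
    /** Returns term length / phrase length, or 0 if the phrase does not contain the term. */
    static double score(String term, String phrase) {
        if (!phrase.toLowerCase().contains(term.toLowerCase())) {
            return 0.0;
        }
        return (double) term.length() / phrase.length();
    }

    /** Returns the highest-scoring phrase for the term, or null if nothing matches. */
    static String best(String term, List<String> phrases) {
        String best = null;
        double bestScore = 0.0;
        for (String p : phrases) {
            double s = score(term, p);
            if (s > bestScore) { bestScore = s; best = p; }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> phrases = Arrays.asList(
            "I'm the king of the world!",
            "King salmon are often found in the Pacific Northwest and are also known as Chinook salmon.",
            "Yes, my king.");
        System.out.println(best("king", phrases));
    }
}
```

<p>With this formula the four-letter term covers 4 of the 13 characters in the shortest phrase, so that phrase scores highest, matching the ranking described above.</p>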
<p>Getting back to the Palantir object model: for each distinct set of permissions that an object has, we compute a different object label based on the properties that are visible to that particular slice.  These multiple titles all go into the search engine.  If we were to compute relevance based on the length of the phrase that matched, and the shortest match on the object is shorter than the match that is actually visible to us, we could return the object with a higher-than-obvious relevance.  If we were to do that, we&#8217;d be leaking information, namely that there&#8217;s data on this object that the user making the query is not privy to. (Note that filtering of objects that aren&#8217;t at all visible to the user is done in a higher layer  after the results have been accumulated and ranked by the search engine.)</p>
<p>Given this problem, there are two approaches one can take:</p>
<ol>
<li>Store all the information needed to decide which labels are visible to the user running the query and then use only the visible labels when calculating the relevance of a match. Note that this is a pretty expensive operation.</li>
<li>Don&#8217;t use the length of match to compute relevance. Note that skipping a relevance calculation is, obviously, a very cheap thing to do.</li>
</ol>
<p>Which do we do?  Both.</p>
<p>When matching against object labels, the length metric actually lets us discern between better and worse matches. So in that case, we incur the cost of this calculation in order to return higher quality results.</p>
<p>However, when matching against things like document bodies, the ratio of the size of the match to the size of the search term starts to have less meaning but still has the possibility of leaking information in the query results.  For fields like this, we turn off the relevance calculations based on length of match. The upshot is that we don&#8217;t have to store the permissions information in the index nor incur the cost of the permissions/views calculation for these fields.</p>
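<p>The two approaches can be contrasted in a short Java sketch. The data model here is an assumption made for illustration: each label carries a single required permission tag, and a user holds a set of tags. The real security model is richer than this, and the class and method names are hypothetical.</p>

```java
import java.util.List;
import java.util.Set;

// Hypothetical model: each label of an object is guarded by one permission tag.
final class VisibleLabelScoring {
    static final class Label {
        final String text;
        final String requiredPermission;
        Label(String text, String requiredPermission) {
            this.text = text;
            this.requiredPermission = requiredPermission;
        }
    }

    /** Length-based score of one label: term length / label length, 0 if no match. */
    static double lengthScore(String term, String label) {
        if (!label.toLowerCase().contains(term.toLowerCase())) return 0.0;
        return (double) term.length() / label.length();
    }

    /** Approach 1: score with only the labels this user is allowed to see. */
    static double visibleScore(String term, List<Label> labels, Set<String> userPermissions) {
        double best = 0.0;
        for (Label l : labels) {
            if (userPermissions.contains(l.requiredPermission)) {
                best = Math.max(best, lengthScore(term, l.text));
            }
        }
        return best;
    }

    /** Approach 2: ignore length; a match in a field such as a document body scores a flat 1. */
    static double flatScore(String term, String fieldText) {
        return fieldText.toLowerCase().contains(term.toLowerCase()) ? 1.0 : 0.0;
    }
}
```

<p>In the first approach a short hidden label never influences the score a user sees, at the cost of a permission check per label. In the second, no permission data is needed at all, because the score no longer depends on anything the user might not be allowed to see.</p>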
<h2>A heartfelt thank you</h2>
<p>To be clear, this post highlights the ways in which our search code diverges from the main <a href="http://lucene.apache.org/java/docs/">Lucene</a> code base.  We&#8217;re huge fans of Lucene and have great respect for the developers that built and maintain what is probably the world&#8217;s greatest open-source search engine.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Palantir Finance Applied to Log4J Data</title>
		<link>http://blog.palantirtech.com/2009/08/26/palantir-finance-applied-to-log4j-data/</link>
		<comments>http://blog.palantirtech.com/2009/08/26/palantir-finance-applied-to-log4j-data/#comments</comments>
		<pubDate>Thu, 27 Aug 2009 07:22:42 +0000</pubDate>
		<dc:creator>Andrew C.</dc:creator>
				<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[user interface]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=948</guid>
		<description><![CDATA[In a previous post, Eric W. covered how we analyze polled system health information. Now we’ll look at pushed information, in the form of logging events.

We had two problems to solve: how to store structured data with a logging message, and how to analyze the collected data.

Analysis is the easy part: just use Palantir! More details below the fold.]]></description>
			<content:encoded><![CDATA[<p>In a <a href="http://blog.palantirtech.com/2009/02/23/palantir-monitoring-server-where-build-beats-buy/" mce_href="http://blog.palantirtech.com/2009/02/23/palantir-monitoring-server-where-build-beats-buy/">previous post</a>, Eric W. covered how we analyze polled system health information.  Now we&#8217;ll look at <em>pushed</em> information, in the form of logging events.</p>
<p><strong>Use Cases &amp; Constraints</strong></p>
<p>We decided on three kinds of questions we wanted to answer:</p>
<ul>
<li>What is the health of the deployment?
<ul>
<li>Example: What errors have occurred in the last 24 hours?</li>
</ul>
</li>
<li>Which parts of the platform are our users engaged with?
<ul>
<li>Example: How much time do users spend in each application?</li>
</ul>
</li>
<li>How is our server performing over time?
<ul>
<li>Example: What is the average wait on a search query?</li>
</ul>
</li>
</ul>
<p>The chief constraint was that we build our platform on <a href="http://logging.apache.org/log4j/" mce_href="http://logging.apache.org/log4j/">Log4J</a>. We already use Log4J all over the project, so converting the logging was out of the question.  Besides, Log4J provides a guideline for the kind of metadata our events should support, and Log4J makes it easy to record events to a database.</p>
<p>That left us with two problems to solve: how to store structured data with a Log4J message, and how to analyze the collected data.</p>
<p>Analysis is the easy part: just use Palantir!  After all, a sequence of logging events has a lot in common with a time series.  The rest is explained below.</p>
<p><span id="more-948"></span></p>
<p><strong>Recording Structured Data</strong></p>
<p>Consider the problem of plotting usage by a user on a given day.  The simplest approximation is to log an event every time an application is closed, and provide the time spent in the application with that event.  Posting the information as an unstructured String (e.g. &#8220;Andrew spent 46 seconds in Chart&#8221;) makes it difficult to later extract the data for analysis.  To solve this problem we developed the class <code>RichLogEntry</code>.</p>
<p><code>RichLogEntry</code>s contain a human-readable message and tagged data in the form of a set of key/value pairs, such as {duration, 46}, {user, Andrew}.  This adds to the up-front cost since log messages become more complex, but the benefit is that the analysis engine can easily and generically access data in events.</p>
<p>Furthermore, <code>RichLogEntry</code> plays nicely with existing Log4J infrastructure.  <code>Logger</code>s in Log4J already accept an arbitrary <code>Object</code> to pass to <code>Appender</code>s, and Log4J’s default <code>Appender</code>s call <code>toString()</code> on the <code>Object</code>s provided.  For <code>RichLogEntry</code> the <code>toString()</code> is simply the human-readable message. So a call to the logging framework with a <code>RichLogEntry</code> would look like (pseudo-Java):<br />
<code><br />
logger.info( ("Andrew just spent 46 seconds in Chart",<br />
{"duration" : 46.0, "application" : "Chart", "user" : "Andrew"}) );<br />
</code></p>
<p>For most <code>Appender</code>s this would produce the human-readable <code>String</code>, but our custom <code>Appender</code> knows how to store the tagged data for later analysis.</p>
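<p>To make the idea concrete, here is a minimal sketch of what such a class could look like. It is not our actual implementation: the constructor signature and the <code>getData()</code> accessor are assumptions made for illustration. The only behavior taken from the description above is that <code>toString()</code> returns the human-readable message.</p>

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only: a log entry with a message plus tagged data. */
final class RichLogEntry {
    private final String message;
    private final Map<String, Object> data;

    // Hypothetical constructor; the real class may be built differently.
    RichLogEntry(String message, Map<String, Object> data) {
        this.message = message;
        // Copy the map so later changes by the caller cannot alter the entry.
        this.data = Collections.unmodifiableMap(new LinkedHashMap<>(data));
    }

    /** Tagged key/value pairs, for an appender that knows how to store them. */
    Map<String, Object> getData() {
        return data;
    }

    /** Ordinary appenders call toString() and see only the readable message. */
    @Override
    public String toString() {
        return message;
    }
}
```

<p>Because Log4J loggers accept an arbitrary <code>Object</code>, an instance of a class like this can be passed straight to <code>logger.info(...)</code>; a custom <code>Appender</code> would then check the type of the message object and read the tagged data from it.</p>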
<p><strong>Example: Application Usage</strong></p>
<p>We implemented the above (i.e. log a &#8220;duration&#8221; message each time a Palantir application loses focus), and hooked in the data with a Palantir Data Provider plugin.</p>
<p>In the image below, we&#8217;re using our <a href="http://www.palantirfinance.com/apps/instrument.html">Explorer</a> application to analyze the logging data.  Our filter framework combines filtering and visualization into a single application.  The image contains three filters from top to bottom, each containing a blue title bar.  The results of each filter are fed into the filter below it:</p>
<ul>
<li>The top filter separates messages by application and displays statistics for each.  We&#8217;ve selected the Explorer application, so its 144 messages will be fed into the next filter.</li>
<li>The middle filter shows a histogram of the number of seconds each Explorer instance was active (on a log scale). Each &#8220;bucket&#8221; represents a range of durations, and its height is the number of messages whose duration falls in that range.  It looks like I usually spend around 10 seconds in Explorer before switching windows. We&#8217;re selecting the gray part of the histogram to avoid skewing our results for the times I&#8217;ve gone <a href="http://en.wiktionary.org/wiki/AFK">AFK</a> with Palantir running.</li>
<li>The bottom filter counts the number of log events over time.</li>
</ul>
<p><a href="/wp-content/uploads/2009/08/andrew-usage.png"><img class="size-full wp-image-1093" src="/wp-content/uploads/2009/08/andrew-usage.png" alt="Andrew's InstrumentGroup usage." width="664" /></a></p>
<p>In <a href="http://www.palantirtech.com/finance">Palantir Finance</a>, filters such as these can be saved and used anywhere in the platform.  Let&#8217;s do that, and compare my usage to my coworker Eric L&#8217;s.  Creating a new set of filters for Eric is easy&#8211;I just modify a single filter above to specify him instead of me (the filter isn&#8217;t shown for simplicity&#8217;s sake), and then save a new copy.  Our <a href="http://www.palantirfinance.com/apps/chart.html">Chart</a> application is a good place to view the two series side by side:</p>
<p><a href="/wp-content/uploads/2009/08/usage-comp.png"><img class="size-full wp-image-1098" src="/wp-content/uploads/2009/08/usage-comp.png" alt="Andrew and Eric's usage." width="664" /></a></p>
<p>Of course, I&#8217;m the harder worker!</p>
<p><strong>Conclusion</strong></p>
<p>Our logging framework is complete, and we&#8217;ve found many new use cases. We use the framework to:</p>
<ul>
<li>analyze performance metrics across builds of both Palantir products</li>
<li>automatically compile usage reports on deployed installations</li>
<li>import and explore exotic event data sets by running the events through Log4J</li>
</ul>
<p>Building the Log4J analysis framework was valuable, fun, and easy, and it demonstrates the flexibility of Palantir Finance for working with arbitrary data sets.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/08/26/palantir-finance-applied-to-log4j-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Palantir: search with a twist (part one: memory efficiency)</title>
		<link>http://blog.palantirtech.com/2009/08/13/palantir-search-with-a-twist-part-one-memory-efficiency/</link>
		<comments>http://blog.palantirtech.com/2009/08/13/palantir-search-with-a-twist-part-one-memory-efficiency/#comments</comments>
		<pubDate>Fri, 14 Aug 2009 07:53:59 +0000</pubDate>
		<dc:creator>Ari Gesher</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[software engineering]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=1088</guid>
		<description><![CDATA[A Palantir cluster seamlessly integrates many pieces of proven technology. One of them is our customized version of the venerable Java search engine, Lucene. Search engine technology tends to be optimized for the common use case of indexing web documents (or similar information architectures) where you have a few search terms in each query and [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 15px; margin-bottom: 15px'><img src='/wp-content/uploads/2009/08/200px-magnifying_glass_icon.png' alt='magnifying glass'/></div>
<p>A Palantir cluster seamlessly integrates many pieces of proven technology.  One of them is our customized version of the venerable Java search engine, <a href="http://lucene.apache.org/java/docs/">Lucene</a>. Search engine technology tends to be optimized for the common use case of indexing web documents (or similar information architectures) where you have a few search terms in each query and many, many documents as results. We want to leverage the <a href="http://en.wikipedia.org/wiki/Inverted_index">inverted index</a> capabilities of Lucene, but our data access patterns are a bit different from the typical use case:  we need things like pervasive range-querying, different types of relevance, and dynamic views of the data based on security constraints. So in building our data platform, we&#8217;ve run into some interesting challenges that are fairly unusual in the information retrieval realm, specifically:</p>
<ol>
<li>Raising memory efficiency</li>
<li>Real-time indexing</li>
<li>Preventing information leaks across access boundaries in an efficient manner</li>
</ol>
<p>I&#8217;ll cover (1) in this post and (2) and (3) in a <a href="https://wp-admin-techblog.yojoe.local/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/">later post</a>, due out in about two weeks. <i>(Note: part 2 is available <a href="https://wp-admin-techblog.yojoe.local/2009/10/27/palantir-search-with-a-twist-part-two-realtime-indexing-and-security/">here</a>)</i></p>
<p>Hit the link and we&#8217;ll delve into this topic.<br />
<span id="more-1088"></span></p>
<h2>Raising memory efficiency</h2>
<p>We&#8217;ve addressed the issue of resource constraints, generally, in our earlier post: <a href="http://blog.palantirtech.com/2009/05/22/bandwidth-isnt-cheap-disk-isnt-cheap-cpu-isnt-cheap/"><em>Bandwidth isn’t cheap. Disk isn’t cheap. CPU isn’t cheap.</em></a> In that post, we posited &#8220;RAM to the rescue&#8221;:</p>
<blockquote><p>
On the other hand, some things in a SCIF are comparatively cheap. We never use boxes with less than 32GB of memory, and, in fact, lots of sites use 128GB of memory. RAM requires negligible power and cooling, and compared to disk, it’s relatively simple to install. It’s also easy to reconfigure the setup to use the additional memory.</p></blockquote>
<p>While this is true, no matter how much RAM you buy, your users will find a way to use it all &#8212; search is no exception.  In many of our environments, the search processes share hardware with other processes in the Palantir cluster, so while the OS may have 128 GB of RAM available, the search process&#8217;s VM has substantially less available to it. Compare this to a cluster of dedicated search nodes, where each node will have indexes sized to fit specifically into the memory available.</p>
<p>The upshot is that we needed to modify parts of <a href="http://lucene.apache.org/java/docs/index.html">Lucene</a> to deal with tighter memory constraints than it was designed for.</p>
<h3>Priority queue results accumulation</h3>
<p>Most systems that implement search include some notion of paging through the results.  We use a multi-level paging system, with the search server maintaining a server-side page for each query and serving smaller client-facing pages from it.</p>
<p>Vanilla Lucene uses the following algorithm for accumulating search results:</p>
<ol>
<li>Load all matching results.</li>
<li>Sort by some relevance metric(s).</li>
<li>Return the top <i>n</i> results.</li>
</ol>
<p>The results are cached as a server-side page in case the client wants to load more than the first <em>n</em> results. You can see where this could run into trouble: if the total number of matching documents is high, that&#8217;s a lot of wasted RAM while we winnow it down to the size of the server page. So we use the following algorithm:</p>
<ol>
<li>Construct a <a href="http://en.wikipedia.org/wiki/Priority_queue">priority queue</a> of constrained size with priority computed using the chosen relevance metric</li>
<li>Stream through the results, inserting into the queue</li>
<li>Return the set of results in the priority queue</li>
</ol>
<p>Now we never need more RAM than the size of a server-side page to serve results.  The downside is that if the client wants more than one server-side page, we have to run the search, in its entirety, twice (ouch). To skip the first page of results on the second pass, we adjust the priority queue to reject every result that ranked within the first page according to the relevance metric.</p>
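<p>The streaming approach above can be sketched with the standard <code>java.util.PriorityQueue</code>. This is an illustration rather than our modified Lucene code: the <code>Hit</code> class, the <code>topN</code> method, and the use of a single numeric score as the relevance metric are all assumptions made for the example.</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class TopNCollector {
    /** Hypothetical search hit: a document id paired with a relevance score. */
    static final class Hit {
        final int docId;
        final double score;

        Hit(int docId, double score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /**
     * Streams over the hits and returns the best pageSize of them, best first,
     * while holding at most pageSize hits in memory at any time.
     */
    static List<Hit> topN(Iterable<Hit> hits, int pageSize) {
        Comparator<Hit> byScore = Comparator.comparingDouble((Hit h) -> h.score);
        // Min-heap on score: the head is always the weakest hit kept so far.
        PriorityQueue<Hit> queue = new PriorityQueue<>(Math.max(1, pageSize), byScore);
        for (Hit hit : hits) {
            if (queue.size() < pageSize) {
                queue.add(hit);
            } else if (pageSize > 0 && hit.score > queue.peek().score) {
                queue.poll();   // evict the weakest hit
                queue.add(hit); // keep the stronger one
            }
        }
        List<Hit> page = new ArrayList<>(queue);
        page.sort(byScore.reversed());
        return page;
    }
}
```

<p>The queue never grows beyond <code>pageSize</code> entries, so memory use depends on the page size rather than on the number of matching documents.</p>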
<h3>Using bitsets to optimize range queries</h3>
<p>A range query can return a result set of very high cardinality &ndash; a range is a very compact way of describing a large set of matching terms (even if they are discrete values, like dates).  One way to think about a range query of, say, <em>10 &lt;= age &lt;= 15</em>, is that it expands to <em>age = 10 OR age = 11 OR age = 12 OR age = 13 OR age = 14 OR age = 15</em>.  Rather than treat range queries in any special way, Lucene just does this expansion of the range and runs the query like a normal query.</p>
<div style='float: right; text-align: right; width: 315px; margin-top: 10px; margin-bottom: 10px;'><img src='/wp-content/uploads/2009/08/searchindexes1.png'/></div>
<p>Internally, Lucene stores, for each term, a list of metadata nodes ordered by document id, with one node for each document that matches the term.  The query algorithm goes something like this:</p>
<ol>
<li>Open the document id lists for all matching terms</li>
<li>Walk the list pointers for each potential match such that you accumulate all the metadata for a given document.</li>
<li>Pass all this metadata up to the query processor which decides:
<ol>
<li>Does this document match the overall query? (remember that terms can be inverted)</li>
<li>Use term frequency taken from the metadata to calculate the relevance.</li>
</ol>
</li>
</ol>
<p>This structure and attendant algorithm has some nice properties:</p>
<ul>
<li>All documents are processed in a set order.</li>
<li>Everything is known about a document all at once.</li>
<li>It terminates in a single linear scan.</li>
</ul>
<p>&#8230; and has one very nasty property:</p>
<ul>
<li>All of the term value buckets that match the range must be open simultaneously.</li>
</ul>
<p>This is not a big deal for most English language queries.  However, for large ranges and the like, there can be thousands or even millions of terms.</p>
<p>The semantics of range queries have an interesting feature: a document that matches the range twice is not more relevant than one that matches once. (Contrast this with a simple term query: multiple matches <b>do</b> indicate higher relevance). Being able to discard the accounting of how many times we match the range leads to a huge win:</p>
<ol>
<li>We only need a single bit to represent a match</li>
<li>We can process a single term value bucket at a time instead of holding all buckets open in memory.</li>
</ol>
<p>Our search engine accumulates range queries into bitset objects, allowing for a very compact representation of results. We need much less memory than we did before since we only load one term value bucket at a time.  And the algorithm is simpler: no more walking pointers, and no more <em>O(n)</em> check to figure out which pointer moves next.</p>
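<p>Here is a rough sketch of the bitset accumulation using the standard <code>java.util.BitSet</code>. The index is modeled as a plain sorted map from term value to document ids, which is an assumption for illustration and not how Lucene stores its postings; the names <code>BitSetRangeQuery</code> and <code>matchRange</code> are likewise made up for the example.</p>

```java
import java.util.BitSet;
import java.util.NavigableMap;

final class BitSetRangeQuery {
    /**
     * Collects every document whose term value lies in [low, high].
     * The index argument is a hypothetical stand-in for the real postings:
     * term value -> ids of the documents that contain it.
     * Requires low <= high.
     */
    static BitSet matchRange(NavigableMap<Integer, int[]> index,
                             int low, int high, int maxDoc) {
        BitSet matches = new BitSet(maxDoc);
        // Visit one term bucket at a time; only the bitset outlives a bucket.
        for (int[] docIds : index.subMap(low, true, high, true).values()) {
            for (int docId : docIds) {
                // A document that matches twice still occupies a single bit.
                matches.set(docId);
            }
        }
        return matches;
    }
}
```

<p>The cost in memory is one bit per document in the index, no matter how many term values the range expands to.</p>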
<h2>The next episode</h2>
<p>Tune in for <em>Palantir: search with a twist (part two)</em> in a few weeks.  I&#8217;ll cover the following topics:</p>
<ul>
<li>Real-time indexing</li>
<li>Preventing information leaks across access boundaries in an efficient manner. (See Jason&#8217;s <a href='http://www.palantirtech.com/government/analysis-blog/mls'>Multi-Level Security</a> post over on the <a href="http://www.palantirtech.com/government/analysis-blog/">Palantir Government Analysis Blog</a> for a high-level look at why these features are important, and check out <a href="http://www.palantirtech.com/government/videos/whitevideos">Bob McGrew&#8217;s &#8220;Access Control Model&#8221; White Video</a> for an in-depth look at how we apply security to our object model.)
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/08/13/palantir-search-with-a-twist-part-one-memory-efficiency/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Palantir Config Server: lining up the ducks</title>
		<link>http://blog.palantirtech.com/2009/03/06/palantir-config-server-lining-up-the-ducks/</link>
		<comments>http://blog.palantirtech.com/2009/03/06/palantir-config-server-lining-up-the-ducks/#comments</comments>
		<pubDate>Fri, 06 Mar 2009 10:00:57 +0000</pubDate>
		<dc:creator>Khan Tasinga</dc:creator>
				<category><![CDATA[distributed systems]]></category>
		<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[problemspace-government]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=193</guid>
		<description><![CDATA[At Palantir, we build distributed software. When deployed at a customer site, our platform consists of several servers running on, and distributed across, a cluster of machines. When I first joined the company, deploying and managing our platform was tedious and time consuming. Need to install servers? One by one, login to the machines where [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; margin-left: 15px'><img src="/wp-content/uploads/2009/03/installpalantirservers.png" alt="" width="261" height="394" /></div>
<p>At Palantir, we build distributed software. When deployed at a customer site, our platform consists of several servers running on, and distributed across, a cluster of machines. When I first joined the company, deploying and managing our platform was tedious and time-consuming. Need to install servers? One by one, log in to the machines where they need to go, lay down their requisite files and manually configure them such that they can work together. Have to bring down a deployment for scheduled maintenance? One by one, and in the correct order, log in to the machines where the servers reside and shut them down. Want to change the private keys and certificates used to secure communication between servers? Well, you get the point.</p>
<p>From a customer perspective, the complexity associated with the administration of distributed software represents a significant challenge. Not providing tools to help reduce that complexity hurt the overall usability of our platform. Furthermore, from a Palantir perspective, a non-trivial portion of our resources was being devoted to deploying and managing instances of our platform, both externally (by Forward Deployed Engineers working directly with our customers) and internally (by development, QA and support staff working to maintain and improve our product). Could we be more efficient? No doubt. Given our intense focus on customer satisfaction and the desire to grow and scale our business, action was necessary.</p>
<p>To see how we solved this problem, read on.<br />
<span id="more-193"></span></p>
<p>We stepped back a bit, taking time to reflect on our situation and understand the problem. Based upon our experience, what key areas would a solution need to address? We settled on the following:</p>
<ol>
<li><strong>Lifecycle management.</strong>
<ol>
<li>Ease initial deployment and upgrade.</li>
<li>Handle coordinated starting, stopping and restarting.</li>
</ol>
</li>
<li><strong>Configuration management.</strong>
<ol>
<li>Track which servers are installed on what machines.</li>
<li>Provide centralized management of server configuration information.</li>
</ol>
</li>
<li><strong>Automation.</strong>
<ol>
<li>Support encoding common management tasks based on best practices.</li>
</ol>
</li>
</ol>
<p>In addition to those three key areas, we also identified several important requirements. A couple that definitely warrant mention:</p>
<ol>
<li><strong>Security.</strong></li>
<li><strong>Extensibility.</strong></li>
</ol>
<p>After getting a good sense of what needed to be accomplished, we put effort into investigating whether an existing solution would fit the bill. For a variety of reasons (e.g., available feature set, licensing constraints, etc.), we never found a good match. We did, however, come across several open source building blocks that could, when composed appropriately, form the foundation of a homegrown solution. The Config Server was born.</p>
<h2>Architecture</h2>
<div class="postimg"><img src="/wp-content/uploads/2009/03/configserverarchitecture.png" alt="" width="650" /></div>
<p>The Config Server works with remote agents to enable centralized deployment management. The diagram presented above provides an overview of our management infrastructure. Below is a brief discussion of each key component of our architecture.</p>
<ul>
<li><strong>Agent</strong> &#8211; Agents are installed on every machine in a deployment. They are lightweight background processes that sit around waiting to execute commands submitted by the Config Server, interacting directly with the services installed on a given machine. Instead of implementing our own agent solution, we decided to leverage existing technology, the open source peer-to-peer <a rel="nofollow" href="http://staf.sourceforge.net/">Software Testing Automation Framework (STAF)</a>. From its homepage:<br />
<blockquote><p>The Software Testing Automation Framework (STAF) is an open source, multi-platform, multi-language framework designed around the idea of reusable components, called services (such as process invocation, resource management, logging, and monitoring). STAF removes the tedium of building an automation infrastructure, thus enabling you to focus on building your automation solution. The STAF framework provides the foundation upon which to build higher level solutions, and provides a pluggable approach supported across a large variety of platforms and languages.</p></blockquote>
<p>We added support for two-way SSL to STAF to enhance the security of our management infrastructure (specifically, to allow us to implement authorization based on self-signed certificates). But beyond that, no modification was necessary. STAF provides us with a robust solution for remote process invocation and file management, both absolutely essential for centralized deployment management.</li>
<li><strong>Agent Manager</strong> &#8211; The Agent Manager provides lifecycle and configuration management functionality for the agents in a deployment. It interacts with remote machines through SSH, using the open source <a rel="nofollow" href="http://www.trilead.com/Products/Trilead_SSH_for_Java/">Trilead SSH for Java</a> library.</li>
<li><strong>Config Registry</strong> &#8211; The Config Registry maintains and provides access to all of the information the Config Server has about a deployment. It consists of the following:
<ul>
<li><strong>Agent Registry</strong> &#8211; The Agent Registry contains information about all of the agents in a deployment.</li>
<li><strong>Service Registry</strong> &#8211; The Service Registry keeps track of all of the services in a deployment.</li>
<li><strong>Config Repository</strong> &#8211; The Config Repository is a central store for configurations of the agents and services in a deployment.</li>
<li><strong>Package Repository</strong> &#8211; The Package Repository holds all of the service packages that can be installed in a deployment.</li>
<li><strong>Plugin Repository</strong> &#8211; The Plugin Repository houses all of the plugins that are available for use in the Config Server. Plugins are used by the Security Manager, Service Manager and Task Manager.</li>
</ul>
</li>
<li><strong>Security Manager</strong> &#8211; We secure our servers and management infrastructure using public key cryptography. The Security Manager handles the generation and packaging of private keys and certificates. We perform private key and certificate generation using the <a rel="nofollow" href="http://www.bouncycastle.org/java.html">Bouncy Castle Crypto APIs for Java</a>. Packaging is taken care of by plugins in the Plugin Repository. For example, one plugin packages private keys and certificates into JKS files for use with Java, while another packages them into PEM files for use with OpenSSL.</li>
<li><strong>Service</strong> &#8211; Services represent the software installed on the machines in a deployment that drives our platform. They correspond to the servers we&#8217;ve built and the third-party offerings on which they depend (e.g., databases, entity extractors, etc.).</li>
<li><strong>Service Manager</strong> &#8211; The Service Manager interacts with agents to provide lifecycle and configuration management functionality for the services in a deployment. The actual mechanics of lifecycle and configuration management vary from service to service. For example, starting service A might require invoking one script, while starting service B might require invoking another. For each type of service in a deployment, the Plugin Repository contains a corresponding plugin that embeds the necessary management logic. The Service Manager works with those plugins to get its job done.</li>
<li><strong>Task Manager</strong> &#8211; Managing a deployment requires performing tasks that go beyond lifecycle and configuration management for its constituent agents and services (e.g., log aggregation, database user creation, etc.). Such tasks are implemented as plugins. They make things happen by communicating with agents and / or directly with machines via SSH. The Task Manager interacts with the Plugin Manager to load tasks and coordinate their execution.</li>
</ul>
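<p>To illustrate how per-service plugins keep the Service Manager generic, here is a small sketch. It is not the Config Server API: the <code>ServicePlugin</code> interface, its methods, and the <code>ServiceManagerSketch</code> class are hypothetical names invented for this example, and the real plugins do considerably more than return a command string.</p>

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical plugin contract: one implementation per type of service. */
interface ServicePlugin {
    String startCommand(String serviceName);

    String stopCommand(String serviceName);
}

/** Hypothetical stand-in for the plugin lookup done by the Service Manager. */
final class ServiceManagerSketch {
    private final Map<String, ServicePlugin> pluginsByType = new HashMap<>();

    /** Makes a plugin available for every service of the given type. */
    void register(String serviceType, ServicePlugin plugin) {
        pluginsByType.put(serviceType, plugin);
    }

    /** Returns the command an agent would be asked to run to start a service. */
    String startCommandFor(String serviceType, String serviceName) {
        ServicePlugin plugin = pluginsByType.get(serviceType);
        if (plugin == null) {
            throw new IllegalArgumentException("No plugin for service type: " + serviceType);
        }
        return plugin.startCommand(serviceName);
    }
}
```

<p>The manager itself knows nothing about any particular service; supporting a new type of service means registering one more plugin.</p>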
<h2>Functionality</h2>
<p>How did we do with respect to our stated needs?</p>
<ul>
<li><strong>Lifecycle management</strong> &#8211; The Agent Manager and Service Manager provide centralized lifecycle management. Initial deployment and upgrades, as well as starting, stopping and restarting servers, can all be handled directly through the Config Server.</li>
<li><strong>Configuration management</strong> &#8211; The Config Repository of the Config Server maintains information about deployments and provides centralized configuration management. The Agent Manager and Service Manager support the remote retrieval and application of agent and service configuration.</li>
<li><strong>Automation</strong> &#8211; The Config Server&#8217;s functionality is exposed via a clean and consistent Java API. Common management tasks can be automated by writing code against that API.</li>
</ul>
<p>And what about some of our more important requirements?</p>
<ul>
<li><strong>Security</strong> &#8211; All communication in our management infrastructure is secured using two-way SSL. A simple authorization mechanism, implemented using self-signed certificates, ensures that only authorized entities (most notably, the Config Server) can execute commands through agents. Client access to the data maintained, and functionality exposed, by the Config Server requires password-based authorization.</li>
<li><strong>Extensibility</strong> &#8211; The Config Server can be extended to support new types of services and perform new tasks by implementing plugins and dropping them in the Plugin Repository.</li>
</ul>
<h2>Future</h2>
<p>In the space of a few months, we built the Config Server to address several key needs and requirements related to the management of our platform. Our work has already begun to pay dividends. Looking ahead, there are several things we would like to do:</p>
<ul>
<li>Add support for low-level system management and configuration related to our platform (e.g., user and group management, firewall configuration, etc.).</li>
<li>Implement multi-deployment management with support for features like staging, mirroring and migration.</li>
<li>Pursue <a rel="nofollow" href="http://en.wikipedia.org/wiki/Autonomic_computing">autonomic computing</a> by integrating with our monitoring solution to implement platform self-management.</li>
</ul>
<p>While we&#8217;ve accomplished a fair amount, plenty of work remains. We look forward to enhancing our Config Server and its associated infrastructure as we strive to make our platform one that is not only powerful and a pleasure to use, but also easy to manage and maintain.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/03/06/palantir-config-server-lining-up-the-ducks/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Palantir Monitoring Server: where build beats buy</title>
		<link>http://blog.palantirtech.com/2009/02/23/palantir-monitoring-server-where-build-beats-buy/</link>
		<comments>http://blog.palantirtech.com/2009/02/23/palantir-monitoring-server-where-build-beats-buy/#comments</comments>
		<pubDate>Mon, 23 Feb 2009 20:00:56 +0000</pubDate>
		<dc:creator>Eric Wong</dc:creator>
				<category><![CDATA[enterprise engineering]]></category>
		<category><![CDATA[problemspace-government]]></category>
		<category><![CDATA[software engineering]]></category>

		<guid isPermaLink="false">http://blog.palantirtech.com/?p=186</guid>
		<description><![CDATA[Distributed systems are complex. Getting them right is hard, and when things don&#8217;t go right, it can be difficult to understand what went wrong. In an environment like ours, a good monitoring system isn&#8217;t just nice to have; it&#8217;s a critical component necessary for understanding behavior and diagnosing problems. We had three primary goals for [...]]]></description>
			<content:encoded><![CDATA[<div style='float: right; text-align:right; margin-right:20px; width: 253px'><img src="http://blog.palantirtech.com/wp-content/uploads/2009/02/monitoringserverscreenshot-badge.png" alt="Graph of CPU usage over time" title="Graph of CPU usage over time" width="233" height="188"/></div>
<p>Distributed systems are complex. Getting them right is hard, and when things don&#8217;t go right, it can be difficult to understand what went wrong. In an environment like ours, a good monitoring system isn&#8217;t just nice to have; it&#8217;s a critical component necessary for understanding behavior and diagnosing problems.</p>
<p>We had three primary goals for the initial monitoring system: <b>graphing</b> of time-series data, <b>alerting</b> on event triggers, and <b>notifications</b> to users.  Furthermore, as a product company, we had a design goal of a simple, intuitive (yet powerful and flexible) solution.</p>
<p>Before starting, we did a quick survey of existing open-source packages. Unfortunately, nothing we found quite fit our needs, given our specific requirements of security, protocol, licensing, and integrability into our product. Given that, we made the decision to forge ahead and build our own; we try not to re-invent the wheel but it seemed to make sense here.</p>
<p>For an in-depth look at the architecture of the Monitoring Server and components we used to build it, read on&#8230;</p>
<p><span id="more-186"></span></p>
<h2>Architecture</h2>
<p>At the highest level, a two-tiered architecture made the most sense. The back-end, standalone server component would be responsible for collecting, processing, and exposing data through an API. The front-end component would be web-based <a href="http://en.wikipedia.org/wiki/Portlet">portlets</a> integrated into our existing management interface.</p>
<p>The server architecture was designed to allow generic components to work together, with everything connected up via Spring.  While we started with JMX as our collection method for monitoring data, the architecture sees this as just one pluggable component, with multiple data backends supported.  A Spring webservices API allows the front-end portlets to query and manipulate the components at each level.</p>
<p>For our first shipping release, we&#8217;ve only shipped the JMX backend, so this is what the production architecture looks like for now:</p>
<div class='postimg'>
<img src="http://blog.palantirtech.com/wp-content/uploads/2009/02/monitoring-server-architecture.png" alt="Monitoring Server architecture diagram" title="Monitoring Server architecture diagram" width="650" /></div>
<h2>Components</h2>
<p>Any time you choose build instead of buy, there&#8217;s a lot of work to be done to get the full set of functionality you need. Fortunately, the Java platform has an extremely rich set of freely available projects and libraries, and we leveraged many of them for the back-end:</p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Java_Management_Extensions">JMX</a>: the core of our system, the Java Management Extensions is a standard for managing and monitoring applications. We use JMX to instrument and monitor our own servers, and because it&#8217;s an adopted standard, we gain access to MBeans exposed by third-party components as well.</li>
<li><a href="https://rrd4j.dev.java.net/">rrd4j</a>: round-robin databases (RRDs) are an excellent storage format for time-series data, and RRD4J is a pure Java implementation of the legendary RRDTool. The round-robin format allows for a fixed size file, since older data is overwritten as newer data arrives. The multi-resolution aspect of the files provides long historical views without a space premium. For example, an RRD can contain a high resolution series for recent information and a low resolution series for long-term data.</li>
<li><a href="http://hsqldb.org/">HSQLDB</a>: a lightweight, native Java, SQL database that can be run in-process. We use HSQLDB to store all non&#8211;time-series information, such as metadata about metrics we&#8217;re monitoring.</li>
<li><a href="http://www.opensymphony.com/quartz/">Quartz</a>: an open source job scheduling system, we use Quartz primarily for scheduling Alerts. Alerts run periodically to check for a condition, and notify if triggered. Each Alert&#8217;s wait period is specified by the user, and fortunately, with Quartz it&#8217;s easy to schedule many Alerts at different frequencies.</li>
<li><a href="http://groovy.codehaus.org/">Groovy</a>: self-described as &#8220;an agile dynamic language for the Java Platform,&#8221; Groovy is integrated into our alerting system. Alerts can contain Groovy scriptlets, which give us the expressiveness to create Alerts such as &#8220;alert if a metric&#8217;s average value over the past 5 minutes is greater than X,&#8221; or &#8220;alert if the variation of a set of metrics&#8217; values across all servers of type Y is greater than Z.&#8221;</li>
<li><a href="http://java.sun.com/products/javamail/">JavaMail</a>: a full-featured email framework. Supports SSL/TLS secure connection protocols, which our clients require.</li>
<li><a href="http://java.sun.com/developer/technicalArticles/WebServices/jaxb/">JAXB</a>: a simple-to-use Java to XML API, JAXB allows us to convert XML into Java objects (and vice-versa). We use JAXB for parsing configuration files and persisting objects into HSQLDB.</li>
<li><a href="http://www.theserverside.com/tt/articles/article.tss?l=IntrotoSpring25">Spring</a>: a framework for developing enterprise Java applications, Spring is the foundation for our monitoring server.
<p>We had never used a component framework before, and building an application with Spring&#8217;s Inversion of Control and Dependency Injection paradigms turned out to be a pleasant and educational experience. While it enforced discipline in using interfaces, it rewarded us with the ability to easily swap implementations of a component. For example, switching to an HSQLDB-based data store required only a single-line edit, and everything just worked. Seriously.</p>
<p>
We also leveraged Spring early in our development process: we pair-coded interfaces, created stub objects, and then wired everything up in Spring. Once our skeleton was in place, we independently worked on component implementations and swapped them in as they were completed. Later in the cycle, we used Spring in our unit tests to compose our application differently for specific tests, isolating important functionality and using dummy components for non-relevant areas.</p>
</li>
</ul>
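<p>The round-robin, multi-resolution storage described in the rrd4j item above can be sketched in plain Java. This is a minimal illustration of the idea only, not rrd4j&#8217;s API or file format: the class names <code>RoundRobinArchive</code> and <code>MultiResolutionSeries</code>, the slot counts, and the choice of averaging to build the low-resolution series are all assumptions made for this example.</p>

```java
import java.util.Arrays;

/** Demo entry point; all names in this file are hypothetical, not part of rrd4j. */
final class RoundRobinDemo {
    public static void main(String[] args) {
        // 4 raw slots, 3 averaged slots, one average per 2 raw samples.
        MultiResolutionSeries series = new MultiResolutionSeries(4, 3, 2);
        for (int i = 1; i <= 10; i++) {
            series.record(i);
        }
        System.out.println(Arrays.toString(series.recentValues()));   // [7.0, 8.0, 9.0, 10.0]
        System.out.println(Arrays.toString(series.longTermValues())); // [5.5, 7.5, 9.5]
    }
}

/** Fixed-size ring of samples: once full, each new sample overwrites the oldest. */
final class RoundRobinArchive {
    private final double[] slots;
    private long written; // total number of samples ever stored

    RoundRobinArchive(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.slots = new double[capacity];
    }

    void add(double value) {
        slots[(int) (written % slots.length)] = value;
        written++;
    }

    /** Returns the retained samples, oldest first. */
    double[] values() {
        int size = (int) Math.min(written, slots.length);
        double[] out = new double[size];
        long first = written - size;
        for (int i = 0; i < size; i++) {
            out[i] = slots[(int) ((first + i) % slots.length)];
        }
        return out;
    }
}

/** One input stream feeding a high-resolution and a low-resolution archive. */
final class MultiResolutionSeries {
    private final RoundRobinArchive recent;
    private final RoundRobinArchive longTerm;
    private final int samplesPerAverage;
    private double pendingSum;
    private int pendingCount;

    MultiResolutionSeries(int recentSlots, int longTermSlots, int samplesPerAverage) {
        if (samplesPerAverage <= 0) {
            throw new IllegalArgumentException("samplesPerAverage must be positive");
        }
        this.recent = new RoundRobinArchive(recentSlots);
        this.longTerm = new RoundRobinArchive(longTermSlots);
        this.samplesPerAverage = samplesPerAverage;
    }

    void record(double value) {
        recent.add(value);
        pendingSum += value;
        pendingCount++;
        if (pendingCount == samplesPerAverage) {
            // Consolidate a group of raw samples into one low-resolution point.
            longTerm.add(pendingSum / pendingCount);
            pendingSum = 0;
            pendingCount = 0;
        }
    }

    double[] recentValues() {
        return recent.values();
    }

    double[] longTermValues() {
        return longTerm.values();
    }
}
```

<p>Storage stays constant no matter how many samples arrive: in this sketch, seven slots cover the last four raw samples plus six samples&#8217; worth of averaged history.</p>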
<h2>User Interface</h2>
<p>By moving the user interface into the portlets, we were able to re-skin the fairly ugly native graphing capability that rrd4j provides with a more generic solution that looks good. For comparison, here&#8217;s an MRTG-style graph produced by rrd4j:</p>
<div class='postimg'><img src="https://rrd4j.dev.java.net/tutorial_files/speed4.gif" title='rrd4j sample graph' alt='rrd4j sample graph'/></div>
<p>And here are some graphs from our Monitoring Server (note the portlet UI components for controlling display of the graphs):</p>
<div class='postimg'><img src="http://blog.palantirtech.com/wp-content/uploads/2009/02/monitoringserverscreenshot.png" alt="Graphs from Monitoring Server" title="Graphs from Monitoring Server"/></div>
<p>While the difference is not that stark, our graphs are much easier on the eyes.</p>
<h2>Monitoring Server: present and future</h2>
<p>We recently released the monitoring system, and it&#8217;s already providing insights into our product&#8217;s behavior. We have more features planned: <b>eventing</b>, which will help us track system events such as a server restart or job completion; <b>generating</b> new time-series data from existing data (for example, a series of the rolling standard deviation of a metric, or the number of failure events in the past 24 hours); and Groovy <b>scripting</b> directly against the monitoring server. The last feature is particularly helpful when our engineering team can&#8217;t physically access a system due to security restrictions.</p>
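<p>As an illustration of deriving a new series from an existing one, a rolling standard deviation could be computed along the following lines. This is a sketch under stated assumptions rather than the Monitoring Server&#8217;s implementation: the <code>RollingStdDev</code> class is hypothetical, the window is measured in samples rather than time, and the population (not sample) standard deviation is used.</p>

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical helper: rolling population standard deviation over the last N samples. */
final class RollingStdDev {
    private final int window;
    private final Deque<Double> samples = new ArrayDeque<>();

    RollingStdDev(int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        this.window = window;
    }

    /** Adds a sample and returns the standard deviation of the current window. */
    double add(double value) {
        samples.addLast(value);
        if (samples.size() > window) {
            samples.removeFirst(); // drop the sample that fell out of the window
        }
        double mean = 0;
        for (double s : samples) {
            mean += s;
        }
        mean /= samples.size();
        double sumOfSquares = 0;
        for (double s : samples) {
            sumOfSquares += (s - mean) * (s - mean);
        }
        return Math.sqrt(sumOfSquares / samples.size());
    }

    /** Maps an input series to a derived series of the same length. */
    static double[] derive(double[] series, int window) {
        RollingStdDev rolling = new RollingStdDev(window);
        double[] out = new double[series.length];
        for (int i = 0; i < series.length; i++) {
            out[i] = rolling.add(series[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] derived = derive(new double[] {1, 3, 3}, 2);
        System.out.println(Arrays.toString(derived)); // [0.0, 1.0, 0.0]
    }
}
```

<p>Recomputing the mean on every sample keeps the sketch short and numerically simple. For large windows, an incremental update would avoid the repeated pass over the window.</p>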
<p>From an analysis perspective, we can now start to better understand our system&#8217;s behavior, which will help us identify problems before they occur and help steer our development energy going forward. Even the world&#8217;s best data analysis software needs a little analysis itself sometimes.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.palantirtech.com/2009/02/23/palantir-monitoring-server-where-build-beats-buy/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>

