Archive for the ‘enterprise engineering’ Category

Palantir: search with a twist (part two: realtime indexing and security)

October 27th, 2009 | Ari

[A number of weeks ago, we published a post on the search technology used by Palantir. That post covered raising the memory efficiency of a couple of operations. This is part two of that series.]

The most familiar use of search engines is to index documents made available on the Internet via the hypertext transfer protocol. Forgotten names like AltaVista, names not-yet-really-learned like Bing, and, of course, Google come to mind.

This one, massive use case has a couple of properties that I’d like to highlight:

  • Asynchronous indexing and querying – web search engines tend to use crawlers and indexers to build up an index of the web. After each crawl is finished, the new index is brought online for use by the query engine.
  • Lack of access controls – all the data in the index is available to any query. In fact, most queries are (from the standpoint of the index) completely anonymous.

Palantir: not a web search engine

Search technology is just one part of what makes up a Palantir system. For us, it’s a way to quickly retrieve Palantir objects in a Palantir system; it’s not the whole of the application.

I’d like to highlight a couple of differences from the web search engine case. A Palantir system needs the following properties:

  • Realtime indexing and querying – we need information to be available immediately as it changes in the system.
  • Leak-proof access controls – we need the search engine to help us make sure that we don’t have information leaking across access control boundaries.
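
To make these two properties concrete, here is a minimal sketch in plain Java. It is not Palantir's implementation and does not use Lucene's API: the class name `SecureIndex`, its methods, and the group-based permission model are all assumptions made for illustration. A document becomes searchable as soon as `add` returns, and every query is filtered against the caller's groups before any result is handed back.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not Palantir's or Lucene's API.
final class SecureIndex {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // document id -> groups that are allowed to see the document
    private final Map<Integer, Set<String>> acl = new HashMap<>();

    // Indexes a document in place; the next query sees it (no batch rebuild).
    synchronized void add(int docId, String text, Set<String> allowedGroups) {
        acl.put(docId, new HashSet<>(allowedGroups));
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Returns matching ids, keeping only documents the caller may see.
    synchronized List<Integer> search(String term, Set<String> callerGroups) {
        List<Integer> hits = new ArrayList<>();
        Set<Integer> matches =
                postings.getOrDefault(term.toLowerCase(), Collections.<Integer>emptySet());
        for (int docId : matches) {
            // Visible only if the caller shares at least one group with the document.
            if (!Collections.disjoint(acl.get(docId), callerGroups)) {
                hits.add(docId);
            }
        }
        return hits;
    }
}
```

The sketch checks permissions inside the index rather than in the caller, so a forgotten check elsewhere cannot leak a hit. It deliberately ignores updates and deletions of existing documents, which a real system would have to handle.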

Hit the link to read more about these topics.
Read the rest of this entry »

Palantir Finance Applied to Log4J Data

August 26th, 2009 | Andrew C.

In a previous post, Eric W. covered how we analyze polled system health information. Now we’ll look at pushed information, in the form of logging events.

Use Cases & Constraints

We decided on three kinds of questions we wanted to answer:

  • What is the health of the deployment?
    • Example: What errors have occurred in the last 24 hours?
  • Which parts of the platform are our users engaged with?
    • Example: How much time do users spend in each application?
  • How is our server performing over time?
    • Example: What is the average wait on a search query?

The chief constraint was that we build our platform on Log4J. We already use Log4J all over the project, so converting the logging was out of the question. Besides, Log4J provides a guideline for the kind of metadata our events should support, and Log4J makes it easy to record events to a database.

That left us with two problems to solve: how to store structured data with a Log4J message, and how to analyze the collected data.
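
As a rough illustration of the first problem, one option is to append machine-readable fields to the human-readable message text. The sketch below is an assumption, not the approach described in the full post: the class `StructuredMessage`, the ` ## ` separator, and the URL-encoded `key=value` layout are all hypothetical. It uses only the Java standard library, so the resulting string could be handed to any logger.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the separator and field layout are assumptions.
final class StructuredMessage {
    private static final String SEPARATOR = " ## ";

    // Appends URL-encoded key=value pairs to a human-readable message.
    static String format(String text, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder(text).append(SEPARATOR);
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
            first = false;
        }
        return sb.toString();
    }

    // Recovers the fields from a string produced by format().
    static Map<String, String> parse(String message) {
        Map<String, String> fields = new LinkedHashMap<>();
        // The encoded payload never contains '#', so the last separator is ours.
        int at = message.lastIndexOf(SEPARATOR);
        if (at < 0) {
            return fields;
        }
        String payload = message.substring(at + SEPARATOR.length());
        if (payload.isEmpty()) {
            return fields;
        }
        for (String pair : payload.split("&")) {
            int eq = pair.indexOf('=');
            if (eq >= 0) {
                fields.put(dec(pair.substring(0, eq)), dec(pair.substring(eq + 1)));
            }
        }
        return fields;
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    private static String dec(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

Because the fields travel inside the message string, existing logging calls and appenders would not need to change; the cost is that whoever reads the events back has to parse the payload.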

Analysis is the easy part: just use Palantir! After all, a sequence of logging events has a lot in common with a time series. The rest is explained below.

Read the rest of this entry »

Palantir: search with a twist (part one: memory efficiency)

August 13th, 2009 | Ari

A Palantir cluster seamlessly integrates many pieces of proven technology. One of them is our customized version of the venerable Java search engine, Lucene. Search engine technology tends to be optimized for the common use case of indexing web documents (or similar information architectures) where you have a few search terms in each query and many, many documents as results. We want to leverage the inverted index capabilities of Lucene, but our data access patterns are a bit different from the typical use case: we need things like pervasive range-querying, different types of relevance, and dynamic views of the data based on security constraints. So in building our data platform, we’ve run into some interesting challenges that are fairly unusual in the information retrieval realm, specifically:

  1. Raising memory efficiency
  2. Real-time indexing
  3. Preventing information leaks across access boundaries in an efficient manner

I’ll cover (1) in this post and (2) and (3) in a later post, due out in about two weeks.

Hit the link and we’ll delve into this topic.
Read the rest of this entry »

Palantir Config Server: lining up the ducks

March 6th, 2009 | Khan

At Palantir, we build distributed software. When deployed at a customer site, our platform consists of several servers running on, and distributed across, a cluster of machines. When I first joined the company, deploying and managing our platform was tedious and time-consuming. Need to install servers? One by one, log in to the machines where they need to go, lay down their requisite files and manually configure them such that they can work together. Have to bring down a deployment for scheduled maintenance? One by one, and in the correct order, log in to the machines where the servers reside and shut them down. Want to change the private keys and certificates used to secure communication between servers? Well, you get the point.

From a customer perspective, the complexity associated with the administration of distributed software represents a significant challenge. Not providing tools to help reduce that complexity hurt the overall usability of our platform. Furthermore, from a Palantir perspective, a non-trivial portion of our resources was being devoted to deploying and managing instances of our platform, both externally (by Forward Deployed Engineers working directly with our customers) and internally (by development, QA and support staff working to maintain and improve our product). Could we be more efficient? No doubt. Given our intense focus on customer satisfaction and the desire to grow and scale our business, action was necessary.

To see how we solved this problem, read on.
Read the rest of this entry »

Palantir Monitoring Server: where build beats buy

February 23rd, 2009 | Eric W.

Graph of CPU usage over time

Distributed systems are complex. Getting them right is hard, and when things don’t go right, it can be difficult to understand what went wrong. In an environment like ours, a good monitoring system isn’t just nice to have; it’s a critical component necessary for understanding behavior and diagnosing problems.

We had three primary goals for the initial monitoring system: graphing of time-series data, alerting on event triggers, and notifications to users. Furthermore, as a product company, we had a design goal of a simple, intuitive (yet powerful and flexible) solution.

Before starting, we did a quick survey of existing open-source packages. Unfortunately, nothing we found quite fit our needs, given our specific requirements of security, protocol, licensing, and integrability into our product. Given that, we made the decision to forge ahead and build our own; we try not to re-invent the wheel, but it seemed to make sense here.

For an in-depth look at the architecture of the Monitoring Server and components we used to build it, read on…

Read the rest of this entry »

