Archive for the ‘distributed systems’ Category

Bandwidth isn’t cheap. Disk isn’t cheap. CPU isn’t cheap.

May 22nd, 2009 | Bob


At Palantir, we work in Silicon Valley, read High Scalability, and think of web companies like Facebook and Google as our peers. Most of the time, this is exactly the right recipe for bringing disruptive innovation into the intelligence community. Sometimes, though, it’s misleading: in design discussions, it’s received knowledge that “disk is cheap” or “CPU is cheap.” For a web company with a deployment in a commercial data center (or its own data center), this received knowledge is correct. But for a company that ships distributed systems instead of hosting them, and whose deployment environment is the kind of locked-down server room in which classified data can reside, these assumptions couldn’t be more false.

At Palantir, we are almost never able to host our customers’ data – typically, as the data is very sensitive, we are not even allowed to see it! Our customers’ highly sensitive data has to reside in a Sensitive Compartmented Information Facility, or SCIF – a building constructed to resist attempts to access the information within, whether through active or passive measures. The network inside a SCIF is physically separated – “airgapped” – from the public Internet. As the entire rationale for such facilities is to prevent information leakage, moving information into or out of one is a tightly regulated process, almost always requiring a human to be in the loop.

Palantir Config Server: lining up the ducks

March 6th, 2009 | Khan

At Palantir, we build distributed software. When deployed at a customer site, our platform consists of several servers distributed across a cluster of machines. When I first joined the company, deploying and managing our platform was tedious and time-consuming. Need to install servers? One by one, log in to the machines where they need to go, lay down their requisite files, and manually configure them so that they can work together. Have to bring down a deployment for scheduled maintenance? One by one, and in the correct order, log in to the machines where the servers reside and shut them down. Want to change the private keys and certificates used to secure communication between servers? Well, you get the point.

From a customer perspective, the complexity of administering distributed software represents a significant challenge, and our failure to provide tools that reduce that complexity hurt the overall usability of our platform. Furthermore, from a Palantir perspective, a non-trivial portion of our resources was being devoted to deploying and managing instances of our platform, both externally (by Forward Deployed Engineers working directly with our customers) and internally (by development, QA, and support staff working to maintain and improve our product). Could we be more efficient? No doubt. Given our intense focus on customer satisfaction and our desire to grow and scale our business, action was necessary.

To see how we solved this problem, read on.

