Archive for the ‘softwarephilosophy’ Category

A rigorous friction model for human-computer symbiosis

June 2nd, 2010 | Asher

This is a response to Ari’s awesome post on human-computer symbiosis. Ari and I were chatting about the equation he developed, and I was wondering whether some further refinements were possible… let’s take a look:

We are attempting to understand the total analytic capability for a given task a of a human-computer team. Analytic capability in this case probably means:

[Equation (1)]

Where A is the answer to the analytic problem in question and tA is the time needed to arrive at the answer based on the inputs available. In the case of chess, A could be the optimum next move given all previous information and tA would be how long it takes to decide on this move.

Read on for a look at how this generalizes in human-computer symbiotic systems.

Friction in Human-Computer Symbiosis: Kasparov on Chess

March 8th, 2010 | Ari

As we build our platforms and applications following a human-computer symbiosis approach, we keep an ear to the ground for interesting examples that illuminate new techniques or validate our approach in some empirical way.

One of the areas we’re interested in is the overall friction of analysis systems. The systems that we build run on commodity hardware. We’re not building faster computers, and yet we can deliver orders-of-magnitude better performance on analysis tasks than existing solutions. How do we do this? By building software in such a way that it reduces the friction experienced at the boundaries between the computing power, the analyst, and the source data.

Chess as analysis laboratory

Chess is, at its heart, a predictive venture. The player attempts to anticipate their opponent’s moves, planning their own moves accordingly, with the straightforward goal of finding a sequence of piece moves that forces checkmate.

This game is, in its ideal form, analysis. (The moves made are the logical extension of the analysis.) The data are clean, the problem is well-defined and everyone plays by the same rules. There are even well-defined metrics for ranking chess players by skill — a better chess player is a better chess-game analyst.

When it comes to evaluating analysis systems, this is about as good as it gets for designing controlled experiments that compare the relative strengths of different systems.

Garry Kasparov, widely considered to be the greatest chess player of all time, recently wrote a review of Diego Rasskin-Gutman’s book, Chess Metaphors: Artificial Intelligence and the Human Mind.

The review is excellent and covers a lot of ground. However, one particular anecdote stood out as a very interesting example of human-computer symbiosis (emphasis added):

In 2005, the online chess-playing site Playchess.com hosted what it called a “freestyle” chess tournament in which anyone could compete in teams with other players or computers. Normally, “anti-cheating” algorithms are employed by online sites to prevent, or at least discourage, players from cheating with computer assistance. (I wonder if these detection algorithms, which employ diagnostic analysis of moves and calculate probabilities, are any less “intelligent” than the playing programs they detect.)

Lured by the substantial prize money, several groups of strong grandmasters working with several computers at the same time entered the competition. At first, the results seemed predictable. The teams of human plus machine dominated even the strongest computers. The chess machine Hydra, which is a chess-specific supercomputer like Deep Blue, was no match for a strong human player using a relatively weak laptop. Human strategic guidance combined with the tactical acuity of a computer was overwhelming.

The surprise came at the conclusion of the event. The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.

After the jump, we look at this finding in a more generalized way and map it onto the Palantir approach.

Palantir: like an operating system for data analysis

November 6th, 2009 | Ari

If you’ve taken the time to peruse the Palantir Government analysis blog, you’ve seen numerous examples of Palantir Government as applied to interesting problems; they are recorded screen captures of our analysis desktop client. It’s a showcase of useful, meaningful, and compelling visual and semantic tools being used to do analysis on a wide range of datasets.

What enabled this analysis? Aside from the obvious hard work of our UI and analysis tools teams, it’s the flexibility and power of the Palantir data platform. More than just a scalable datastore, the Palantir data platforms act as robust and clean abstractions on top of data.

One of the early architecture decisions that we made when building both Palantir Government and Palantir Finance was to separate the respective data platforms from the end-user applications used to actually perform analysis. More than just following the client-server model, this separation made the data servers in both products into generic intelligence infrastructure for analytic problems, with our clients acting as analysis applications on top of those platforms.

And so, one way to look at our data platform is as an operating system for analytic applications. In this post we’ll explore the history of operating systems, understand why they’re so important, and see how the Palantir data servers offer the same potential to revolutionize the writing of analysis software that operating systems did for the writing of general programs for computers.

Deploying a distributed system

October 7th, 2008 | Bob

[Figure: Distributed systems diagram]

At Palantir, we write software that gets deployed at each client, integrated across their sensitive data sets, and maintained and administered by that client’s in-house admins. Most deployed enterprise software is run on a single beefy box: consider wikis, blogging systems, bug tracking systems, or practically any client/server or web client software used today. On the other hand, most enterprise software that runs as a distributed system is hosted: Salesforce.com, Google Apps, or any approach that sells software as a service. What’s fairly unusual about our software is that it’s deployed as a distributed system at each client.

Distributed systems are hard to build and hard to maintain. As long as that distributed system is built and maintained in-house, however, you have a number of advantages:

  • The administrators are full-time product experts who are focused on the mission of keeping your system available and responsive.
  • The development organization can build internal tools for the administrators that only have to be “good enough” and can step in if necessary.
  • It’s easy to get feedback on how the system performs, because there are no sensitivity, privacy, or legal constraints.
  • A single, large deployment allows you to optimize your hardware purchasing and amortize installation headaches across a large number of machines.

This is all great, of course, and if you can host and maintain your distributed system yourself, I’d highly recommend it. Sometimes, however, it’s just not possible. At Palantir, the client data we work with is so sensitive that even we cannot see it, except under very strictly controlled circumstances. It’s also so large that the bandwidth limitations of pushing it into a system hosted by us would be prohibitive.

So suppose that you have to deploy your distributed system in a customer datacenter with external parties maintaining the system. What do you need to consider? In this post, I’ll go into a number of key points that we have faced and addressed at Palantir.

Palantir: so what is it you guys do?

December 4th, 2007 | Kevin

I often ask candidates if they’re familiar with what we do at Palantir. Most people think they are. “Oh, you’re that data viz. company,” or, worse, “You guys do data mining, right?” At least they’ve heard of us and at least they’re on the right track, but I cringe anyway. We aren’t just a “data visualization” company and we don’t do “data mining.” It’s almost impossible to convey the scope and complexity of what we do in a few short minutes—or to do so without taking the conversation to an eye-glazing level of abstraction.

The following is my attempt at describing what we do at a high level without oversimplifying. I hope that after reading this a candidate will ‘get’ what we’re about, or at least understand enough not to apply tiny labels to our expansive vision.

Palantir: embodying a 50-year-old vision of the future?

March 16th, 2007 | Ari

Here at Palantir, Charles Cooper’s recent piece on CNET about J. C. R. Licklider has struck us as a very timely piece of journalism.

Licklider was a very influential man, with Cooper even crediting him with the existence of Computer Science as a modern-day field:

Until Licklider began his work at ARPA, there were no Ph.D. programs in computer science at American universities. That changed after ARPA began handing out grants to promising students, a practice that convinced MIT, Stanford, UC Berkeley and Carnegie Mellon to start their own graduate programs in computer science in 1965. Maybe that should go down as Licklider’s most lasting legacy.

In the piece, Cooper references this influential and well-known work by Licklider: Man-Computer Symbiosis, by J. C. R. Licklider, published in IRE Transactions on Human Factors in Electronics, volume HFE-1, pages 4-11, March 1960.

That’s right, it was written almost 50 years ago. That said, it’s incredibly relevant today, perhaps more than ever.

Here’s the abstract:

Man-computer symbiosis is an expected development in cooperative interaction between men and electronic computers. It will involve very close coupling between the human and the electronic members of the partnership. The main aims are 1) to let computers facilitate formulative thinking as they now facilitate the solution of formulated problems, and 2) to enable men and computers to cooperate in making decisions and controlling complex situations without inflexible dependence on predetermined programs. In the anticipated symbiotic partnership, men will set the goals, formulate the hypotheses, determine the criteria, and perform the evaluations. Computing machines will do the routinizable work that must be done to prepare the way for insights and decisions in technical and scientific thinking. Preliminary analyses indicate that the symbiotic partnership will perform intellectual operations much more effectively than man alone can perform them. Prerequisites for the achievement of the effective, cooperative association include developments in computer time sharing, in memory components, in memory organization, in programming languages, and in input and output equipment.

The following passage from the paper is still a pretty accurate description of how most analysts (in any industry or field) go about their business:

Despite the fact that there is a voluminous literature on thinking and problem solving, including intensive case-history studies of the process of invention, I could find nothing comparable to a time-and-motion-study analysis of the mental work of a person engaged in a scientific or technical enterprise. In the spring and summer of 1957, therefore, I tried to keep track of what one moderately technical person actually did during the hours he regarded as devoted to work. Although I was aware of the inadequacy of the sampling, I served as my own subject.

About 85 per cent of my “thinking” time was spent getting into a position to think, to make a decision, to learn something I needed to know. Much more time went into finding or obtaining information than into digesting it. Hours went into the plotting of graphs, and other hours into instructing an assistant how to plot. When the graphs were finished, the relations were obvious at once, but the plotting had to be done in order to make them so.

Throughout the period I examined, in short, my “thinking” time was devoted mainly to activities that were essentially clerical or mechanical: searching, calculating, plotting, transforming, determining the logical or dynamic consequences of a set of assumptions or hypotheses, preparing the way for a decision or an insight. Moreover, my choices of what to attempt and what not to attempt were determined to an embarrassingly great extent by considerations of clerical feasibility, not intellectual capability.

This quote is an eerily accurate description of how trading strategies are formulated, back-tested, and implemented these days. As an analogy, it’s also an accurate reflection of the modern use of information processing in the intelligence space.

To wit:

It is to bring computing machines effectively into processes of thinking that must go on in “real time,” time that moves too fast to permit using computers in conventional ways. Imagine trying, for example, to direct a battle with the aid of a computer on such a schedule as this. You formulate your problem today. Tomorrow you spend with a programmer. Next week the computer devotes 5 minutes to assembling your program and 47 seconds to calculating the answer to your problem. You get a sheet of paper 20 feet long, full of numbers that, instead of providing a final solution, only suggest a tactic that should be explored by simulation. Obviously, the battle would be over before the second step in its planning was begun. To think in interaction with a computer in the same way that you think with a colleague whose competence supplements your own will require much tighter coupling between man and machine than is suggested by the example and than is possible today.

So how does this relate to what we do? In the finance world, much of what fund managers and analysts do in building strategies has to do with formulating trading models and then building spreadsheets that can back test or simulate the performance of those models.

Our finance tool obviates the need for this “clerical, mechanical” work, allowing strategists to spend more time making sense of the interconnections in the market and formulating nuanced trading strategies and less time doing model-building in Excel. We take the state of the art a quantum leap forward in terms of financial analysis: rather than just allowing analysts to quickly build models and back test trading strategies, we’ve built a tool that allows for a smooth flow from hypothesis to theory, with the software doing all the heavy lifting, data wrangling, and eye-candy-class presentation. New variables or market conditions can be incorporated on the fly without the need for a pause from high-level thinking to gather data or marshal it into the right format. Knowledge can be divined by asking questions about high-level concepts such as dynamic market conditions and meta-conditions like the volatility-of-volatility.

The question has traditionally been, “How do I effectively model this financial space?” With Palantir, we’re transforming that question into the core question asked in the finance industry, namely, “How can I better understand the interactions at work in today’s markets?” So the focus moves to the human-level questions while the software takes care of the data-level machinations.

In the intelligence space, the composite views of data that the government team creates save the analysts from having to painstakingly research and record correlations across multiple informational domains. Instead, the analyst can spend time divining the meaning behind the connections and correlations. Our take on perpetual analytics takes things a step further, alerting the analyst as relevant new information enters the system. And finally, we’re building workflows that allow analysts to quickly attach ‘handles’ to data, giving what has traditionally been unstructured data a seat at this table of computer-enhanced human analysis.

We’re speeding up the process of analysis by creating an analyst-computer symbiosis. No longer will people need to spend time doing menial data processing; the computers will do it for them, while the humans provide the spark of insight, semantics, and cognition that computers lack.

It’s conceptual analysis at the speed of thought.

This is why I’m excited to come to work every day: we’re building the software that embodies a broad vision of the future. This vision of human-computer symbiosis dates from five decades ago but is also apparent in every interaction we see with computers on the big and small screens (no, not our monitors). From Star Trek to 24, people want the computers to do the repetitive and time-consuming simple work but to leave people the final say on any complex decisions. As one of our customers told us when shown our application: this is the future.

Thoughts on security

February 14th, 2007 | Ari

A quick read on the pitfalls of designing computer security, The Six Dumbest Ideas In Security:

Let me introduce you to the six dumbest ideas in computer security. What are they? They’re the anti-good ideas. They’re the braindamage that makes your $100,000 ASIC-based turbo-stateful packet-mulching firewall transparent to hackers. Where do anti-good ideas come from? They come from misguided attempts to do the impossible – which is another way of saying “trying to ignore reality.” Frequently those misguided attempts are sincere efforts by well-meaning people or companies who just don’t fully understand the situation, but other times it’s just a bunch of savvy entrepreneurs with a well-marketed piece of junk they’re selling to make a fast buck. In either case, these dumb ideas are the fundamental reason(s) why all that money you spend on information security is going to be wasted, unless you somehow manage to avoid them.

A well-written piece that’s worth reading for anyone who’s implementing computer security, either at an operational level or as a software engineer.

