Archive for the ‘coding’ Category

JavaInvoke allows you to spawn additional Java VMs during testing

July 28th, 2009 | Ari Gesher

junit success

Here at Palantir we use test-driven development (or TDD for short). Integrated tools like Eclipse and JUnit simplify writing and running unit tests. However, once you need to test a broader swath of functionality, it’s time to write functional, integration, and system tests. Although these are technically not unit tests, the framework that JUnit provides is basically the same infrastructure that you want to leverage for writing these more involved kinds of tests.

When you’re developing enterprise software, functional testing often means getting your clients to talk to your servers. For the main Palantir Government product, we integrate the process of bringing the server up and down with the Ant scripts that run our automated unit tests: our testing tasks bring up the server, run the test suite, and then kill the server. This works great and produces nice results.

When I started working on our authentication server, the pattern that we had used before didn’t work for me. While the Palantir Government tests ran with a single, static configuration file, I needed to run the authentication server with multiple configurations in the course of running through all the different functional tests. I determined that I needed a way to programmatically bring the server up and down for testing. In JUnit parlance, I needed a way to launch the server component from my tests’ setUp() method and stop it in tearDown().

With my itch-to-scratch firmly in hand (or some other mixed metaphor), I set out to figure out how to invoke new Java processes from inside a unit test. The solution I came up with (with source code and examples) is after the jump.
Read the rest of this entry »
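
To make the idea concrete, here is a minimal sketch of launching and stopping a second JVM from test code. It is not JavaInvoke itself: the ChildJvm class and its method names are made up for illustration, and it assumes Java 9 or later (for ProcessBuilder.Redirect.DISCARD) and that the java binary under the java.home system property is usable.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not the JavaInvoke API): runs a command line in a separate JVM. */
final class ChildJvm {
    private final Process process;

    private ChildJvm(Process process) {
        this.process = process;
    }

    /** Starts a new JVM, e.g. start("-cp", classpath, "com.example.ServerMain"). */
    static ChildJvm start(String... jvmArgs) {
        // Reuse the java binary that is running the current test.
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> command = new ArrayList<>();
        command.add(javaBin);
        command.addAll(Arrays.asList(jvmArgs));
        try {
            // Discard the child's output so it can never block on a full pipe.
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
            return new ChildJvm(p);
        } catch (IOException e) {
            throw new IllegalStateException("could not start child JVM", e);
        }
    }

    boolean isRunning() {
        return process.isAlive();
    }

    /** Waits for the child to exit on its own and returns its exit code. */
    int waitForExit(long timeoutSeconds) {
        try {
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                throw new IllegalStateException("child JVM did not exit in time");
            }
            return process.exitValue();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    /** Stops the child: polite request first, forced kill after five seconds. */
    void stop() {
        process.destroy();
        try {
            if (!process.waitFor(5, TimeUnit.SECONDS)) {
                process.destroyForcibly();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            process.destroyForcibly();
        }
    }
}
```

In a JUnit test, setUp() would call ChildJvm.start(...) with the configuration for that test and tearDown() would call stop(), so each test can run against a differently configured server.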

The MultiSnake Challenge

July 6th, 2009 | Nick Miyake

multisnake game

“Freaking lag!” It had started to become a common refrain around the developer pit. Listed as a project on a candidate’s resume, MultiSnake was a game that we had started to play during our coding breaks. The game was really quite fun — it was easy to play, games were short, and its multi-player nature fostered great competition. The only real drawback was that we seemed to experience network lag. There was nothing more infuriating than having your long snake die by running straight into a completely avoidable wall because the game lagged and didn’t respond to your keyboard commands in time. During one of our particularly lag-heavy games, someone yelled out a gripe that would change our MultiSnaking days for good: “Man, we could totally write this game ourselves, in our app.”

Read the rest of this entry »

Data Model Change Eventing

May 27th, 2009 | Derek Cicerone

One of the early architectural challenges that we faced in building the Palantir Finance product was coming up with a good design for firing events from data models to their listeners. There are many different concepts in our product such as charts, portfolios, and indices which are all maintained by different developers. Initially, each developer had their own system for firing events when a data model changed. This quickly became a drag on development as tools became more integrated because we had to learn each other’s event methodologies and translate between the different systems.

The solution was to select a single event firing system. We wanted something that was easy-to-use yet powerful enough to express all the changes that might be made to a data model. Java’s Property Change Support (PCS) was a good fit because it can support arbitrary events in a very lightweight fashion.
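
For readers who have not used it, here is a small sketch of what firing events through java.beans.PropertyChangeSupport looks like. The Portfolio class and its "name" property are hypothetical and are not taken from the Palantir Finance code base.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

/** Hypothetical data model; the class and property names are illustrative only. */
final class Portfolio {
    static final String NAME_PROPERTY = "name";

    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String name;

    Portfolio(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void setName(String newName) {
        String oldName = this.name;
        this.name = newName;
        // PropertyChangeSupport skips the event when old and new values are equal and non-null.
        changes.firePropertyChange(NAME_PROPERTY, oldName, newName);
    }

    void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    void removePropertyChangeListener(PropertyChangeListener listener) {
        changes.removePropertyChangeListener(listener);
    }
}
```

A listener receives a PropertyChangeEvent carrying the property name plus the old and new values, which is what lets one mechanism describe arbitrary changes to a model.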

Read on for details of our implementation…
Read the rest of this entry »

Model-View-Adapter

April 20th, 2009 | Kevin Simler

I used to think I understood MVC. In undergraduate CS programs, MVC is taught as an off-the-shelf pattern, explained once and then ready for use in the real world. Wikipedia also makes it seem pretty simple:

Model–View–Controller (MVC) is an architectural pattern used in software engineering. Successful use of the pattern isolates business logic from user interface considerations, resulting in an application where it is easier to modify either the visual appearance of the application or the underlying business rules without affecting the other. In MVC, the model represents the information (the data) of the application; the view corresponds to elements of the user interface such as text, checkbox items, and so forth; and the controller manages the communication of data and the business rules used to manipulate the data to and from the model.

They go on to show the classic triangle diagram and how it’s baked into various GUI and web frameworks. There’s only one clause in the entire article that hints at something deeper: “Though MVC comes in different flavors…”

Different flavors indeed. In fact MVC is not just a pattern but a whole family of patterns: MVC, MVA, MVP, PAC, Model-Delegate…. It very quickly gets very hairy.

In this article I want to describe one of MVC’s lesser-known variants, the Model-View-Adapter (MVA) pattern, and talk about its advantages over traditional MVC in the context of a Java Swing application.
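
As a rough preview of the shape of MVA, here is a toy sketch in which the model and the view never reference each other and an adapter carries all traffic between them. The counter classes are invented for illustration and use plain Java rather than Swing so the example stays small; it is not code from the full article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

/** Model: holds state and notifies listeners; knows nothing about any view. */
final class CounterModel {
    private final List<IntConsumer> listeners = new ArrayList<>();
    private int value;

    int getValue() {
        return value;
    }

    void setValue(int newValue) {
        value = newValue;
        for (IntConsumer listener : listeners) {
            listener.accept(newValue);
        }
    }

    void addListener(IntConsumer listener) {
        listeners.add(listener);
    }
}

/** View: shows text and reports user input; knows nothing about the model. */
final class CounterView {
    private String text = "";
    private Runnable onIncrement = () -> { };

    void setText(String newText) {   // in Swing this might delegate to JLabel.setText
        text = newText;
    }

    String getText() {
        return text;
    }

    void setOnIncrement(Runnable handler) {
        onIncrement = handler;
    }

    void clickIncrement() {          // stands in for a button press
        onIncrement.run();
    }
}

/** Adapter: the only class that references both the model and the view. */
final class CounterAdapter {
    CounterAdapter(CounterModel model, CounterView view) {
        model.addListener(v -> view.setText("Count: " + v));
        view.setOnIncrement(() -> model.setValue(model.getValue() + 1));
        view.setText("Count: " + model.getValue()); // initial sync
    }
}
```

Because all of the wiring lives in CounterAdapter, either side can be swapped out or tested on its own without touching the other.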

Read the rest of this entry »

The Pokémon Problem: a new anti-pattern

March 19th, 2009 | John Carrino

Gotta catch 'em all!

It’s always fun to release a new piece of jargon into the wild. I’ve run into a number of bugs in our codebase that are caused by an anti-pattern I’d like to dub The Pokémon Problem.

Much like the game of Whac-a-Mole, this is a class of bugs where fixing every occurrence does not prevent the bug from returning in new code: it is easy for later code changes to re-introduce an instance of the bug into the code base. Even if you “catch ’em all”, nothing prevents someone else from introducing new Pokémon bugs later.

Not only is this bug easy to re-introduce, but it sometimes can be hard to find all currently existing instances of this pattern. Although tools like Eclipse make it easier to track down all the places that code is called, sometimes you’re looking for things that happen in a certain sequence (which tools like Eclipse don’t do a good job of searching for) and dynamic invocation mechanisms like Java Reflection can sometimes make it impossible to be exhaustive. This type of bug is also resistant to automated refactoring: changing the protocol of dealing with this corner of your code will require you to track down all places it was touched and manually refactor them. It generally signals a failure to use sufficient separation of concerns.

In general, this anti-pattern is a result of APIs that require the caller to be responsible for state management of resources that the API owns. This can include things like an object that requires the caller to have run an initialization method before calling any other method on the object. These bugs get even more insidious when a failure to do things in the right order does not cause a hard failure (like throwing an exception) but instead creates some sort of subtle corruption that may not be noticed or cause subsequent calls to fail unexpectedly.
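
Here is a small illustration of the initialize-before-use case, alongside one common way to design it away. Both classes are hypothetical, and the factory approach is an assumption about what a fix could look like rather than one of the strategies from the full post.

```java
import java.util.HashMap;
import java.util.Map;

/** Anti-pattern: every caller must remember to call init() before anything else. */
final class FragileCache {
    private Map<String, String> entries; // stays null until init() is called

    void init() {
        entries = new HashMap<>();
    }

    void put(String key, String value) {
        entries.put(key, value); // NullPointerException if init() was skipped
    }
}

/** Alternative: the class owns its own setup, so no uninitialized instance can exist. */
final class SafeCache {
    private final Map<String, String> entries;

    private SafeCache(Map<String, String> entries) {
        this.entries = entries;
    }

    /** The only way to obtain an instance; initialization happens here, once. */
    static SafeCache create() {
        return new SafeCache(new HashMap<>());
    }

    void put(String key, String value) {
        entries.put(key, value);
    }

    String get(String key) {
        return entries.get(key);
    }
}
```

With FragileCache, every new call site is a fresh chance to forget init(). With SafeCache there is nothing for the caller to forget, so there is no new instance of the bug to catch.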

Read on for some strategies on dealing with the Pokémon problem.
Read the rest of this entry »

Model Resolution in Palantir Finance: avoiding N²

February 2nd, 2009 | Andy Aymeloglu


N², with N = 8

One of the big challenges in Palantir Finance comes when integrating data from multiple data providers. When the server is launched, it needs to create a coherent model of the financial world based on data coming from potentially dozens of data providers. Each data provider defines a set of “models” that it supports. These models can be things like equities, currencies, futures, options, or even new types that the providers themselves define.

The major challenge occurs when multiple providers define models that represent the same real-world entity. Provider A might know about Google, have basic open/high/low/close data for the stock, and know its ticker, country, and ISIN. Provider B might also provide a Google model, have balance sheet data, and know its country, exchange, and ISIN. We want to expose only one Google model to the user, however, and so we need a means of resolving the two Googles together – recognizing that they’re the same instrument – and adding just one equity to the system that encompasses both.

Resolution logic can be fairly complicated. For equities, for example, there are several different ways in which resolution can take place. If two equities have identical ISINs, we can be pretty confident they match, since those identifiers are declared as globally unique. If two equities have the same ticker and the same country of exchange, we might also consider that a match, though perhaps of weaker quality. Two models resolve to each other if any form of resolution considers them equal (with errors being thrown if other forms of resolution contradict the form that considers them equal…i.e. provider A and provider B agree on an instrument’s ISIN but disagree on its ticker).
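
One way to avoid comparing every model against every other model is to index models by their resolution keys and merge on key collisions. The sketch below does that with a hash map and a small union-find. It is an assumption about how such a pass could look, not the implementation described in the full post: the ProviderModel and Resolver names are invented, the identifiers are placeholders, and the contradiction checks mentioned above are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical view of one provider's equity; any field except provider may be null. */
final class ProviderModel {
    final String provider;
    final String isin;
    final String ticker;
    final String country;

    ProviderModel(String provider, String isin, String ticker, String country) {
        this.provider = provider;
        this.isin = isin;
        this.ticker = ticker;
        this.country = country;
    }
}

final class Resolver {
    private Resolver() { }

    /** Returns, for each model, the smallest index of any model it resolved with. */
    static int[] resolve(List<ProviderModel> models) {
        int[] parent = new int[models.size()];
        for (int i = 0; i < parent.length; i++) {
            parent[i] = i;
        }
        // Maps each resolution key to the first model that produced it.
        Map<String, Integer> firstSeen = new HashMap<>();
        for (int i = 0; i < models.size(); i++) {
            for (String key : keysOf(models.get(i))) {
                Integer other = firstSeen.putIfAbsent(key, i);
                if (other != null) {
                    union(parent, other, i); // same key means same instrument
                }
            }
        }
        int[] group = new int[parent.length];
        for (int i = 0; i < group.length; i++) {
            group[i] = find(parent, i);
        }
        return group;
    }

    /** Each form of resolution contributes one key; a model may have several. */
    private static List<String> keysOf(ProviderModel m) {
        List<String> keys = new ArrayList<>();
        if (m.isin != null) {
            keys.add("isin:" + m.isin);
        }
        if (m.ticker != null && m.country != null) {
            keys.add("ticker:" + m.ticker + "@" + m.country);
        }
        return keys;
    }

    private static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]]; // path halving
            i = parent[i];
        }
        return i;
    }

    private static void union(int[] parent, int a, int b) {
        int rootA = find(parent, a);
        int rootB = find(parent, b);
        if (rootA != rootB) {
            // Attach the larger root under the smaller so the root is the minimum index.
            parent[Math.max(rootA, rootB)] = Math.min(rootA, rootB);
        }
    }
}
```

Each model is looked up against the index a fixed number of times instead of being compared with all N − 1 others, so the work grows roughly linearly with the number of models.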

Read on for the details of how we solve this seemingly n² problem with a linear solution.
Read the rest of this entry »

Deploying a distributed system

October 7th, 2008 | Bob McGrew

Distributed systems diagram

At Palantir, we write software that gets deployed at each client, integrated across their sensitive data sets, and maintained and administered by that client’s in-house admins. Most deployed enterprise software is run on a single beefy box: consider wikis, blogging systems, bug tracking systems, or practically any client/server or web client software used today. On the other hand, most enterprise software that runs as a distributed system is hosted: Salesforce.com, Google Apps, or any approach that sells software as a service. What’s fairly unusual about our software is that it’s deployed as a distributed system at each client.

Distributed systems are hard to build and hard to maintain. As long as that distributed system is built and maintained in-house, however, you have a number of advantages:

  • The administrators are full-time product experts who are focused on the mission of keeping your system available and responsive.
  • The development organization can build internal tools for the administrators that only have to be “good enough” and can step in if necessary.
  • It’s easy to get feedback on how the system performs, because there are no sensitivity, privacy, or legal constraints.
  • A single, large deployment allows you to optimize your hardware purchasing and amortize installation headaches across a large number of machines.

This is all great, of course, and if you can host and maintain your distributed system yourself, I’d highly recommend it. Sometimes, however, it’s just not possible. At Palantir, the client data we work with is so sensitive that even we cannot see it, except under very strictly controlled circumstances. It’s also so large that the bandwidth limitations of pushing it into a system hosted by us would be prohibitive.

So suppose that you have to deploy your distributed system in a customer datacenter with external parties maintaining the system. What do you need to consider? In this post, I’ll go into a number of key points that we have faced and addressed at Palantir.

Read the rest of this entry »

Time Chooser Components

April 8th, 2008 | Nick Miyake

Time Choosers Thumbnail
A montage of our time choosers. (click for webstart demo)

The notion of time is central to both of our products at Palantir, and there are many instances in which the user needs to specify a certain point in time. Although there are simple ways to create choosers (you could use a JSpinner that uses a SpinnerDateModel or simply use multiple JComboBox objects), I decided to experiment with writing some more visual time chooser components. These components are fairly experimental — they aren’t used in either product yet and I coded them up pretty quickly so I could get some feedback.
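
For comparison, the simple JSpinner approach mentioned above might look like the following sketch. The TimeChooserSketch class, its method names, and the hourly step and date pattern are illustrative choices, not part of the demo code.

```java
import java.util.Calendar;
import java.util.Date;
import javax.swing.JSpinner;
import javax.swing.SpinnerDateModel;

/** Hypothetical baseline chooser built from standard Swing pieces. */
final class TimeChooserSketch {
    private TimeChooserSketch() { }

    /** A date model with no lower or upper bound that steps one hour at a time. */
    static SpinnerDateModel hourlyModel(Date initial) {
        return new SpinnerDateModel(initial, null, null, Calendar.HOUR_OF_DAY);
    }

    /** Wraps the model in a spinner that shows the date and time down to the minute. */
    static JSpinner createSpinner(SpinnerDateModel model) {
        JSpinner spinner = new JSpinner(model);
        spinner.setEditor(new JSpinner.DateEditor(spinner, "yyyy-MM-dd HH:mm"));
        return spinner;
    }
}
```

This works, but the user only ever sees a formatted string and a pair of arrows, which is the limitation the more visual components try to address.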

You can see these choosers in action in our webstart demo. The source code is available in the JAR.

If anyone has any feedback or suggestions as to how these choosers could be improved (or any ideas on how to make a better time chooser altogether) please leave a comment and let me know!

Meanwhile, if you want to know a little bit more about these choosers and how I went about designing them, read on…

Read the rest of this entry »

SwingUtilities.invokeAndWait… doesn’t.

February 21st, 2008 | Carl Freeland

One of the most misunderstood aspects of multithreaded Swing applications is the care and feeding of SwingUtilities.invokeAndWait. Hans Muller and Kathy Walrath authored a nice article that includes an overview of when to use invokeLater or its slightly more risky sibling, invokeAndWait.

We often use worker threads for long-running processes, so we regularly run into two issues with SwingUtilities.invokeLater and invokeAndWait, and we have developed wrapper code to deal with them. The first issue is executing code that may be called from both worker threads and the Event Dispatch Thread (EDT): invokeAndWait throws an error if it is called on the EDT, and invokeLater, although legal there, queues the job instead of running it immediately. Second, invokeAndWait isn’t guaranteed to wait: an interruption on the calling thread will resume execution before the job is finished. The remainder of this post shows the code we used to solve these issues.
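
To give a sense of what such a wrapper can look like, here is a minimal sketch that handles both issues. It is not the code from the full post: the EdtRunner class is made up for illustration, and it swaps invokeAndWait for invokeLater plus a CountDownLatch so that an interrupt cannot cut the wait short.

```java
import java.util.concurrent.CountDownLatch;
import javax.swing.SwingUtilities;

/** Hypothetical wrapper; the class and method names are illustrative. */
final class EdtRunner {
    private EdtRunner() { }

    /**
     * Runs the job on the EDT and returns only after it has finished,
     * whether the caller is a worker thread or the EDT itself.
     */
    static void runAndWait(Runnable job) {
        if (SwingUtilities.isEventDispatchThread()) {
            // Already on the EDT: invokeAndWait would throw an Error here.
            job.run();
            return;
        }
        CountDownLatch done = new CountDownLatch(1);
        SwingUtilities.invokeLater(() -> {
            try {
                job.run();
            } finally {
                done.countDown(); // release the caller even if the job throws
            }
        });
        boolean interrupted = false;
        while (true) {
            try {
                done.await();
                break;
            } catch (InterruptedException e) {
                interrupted = true; // remember the interrupt, but keep waiting
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt(); // restore the flag for the caller
        }
    }
}
```

Two caveats apply to this sketch. An exception thrown by the job surfaces on the EDT rather than in the caller, and a caller that holds a lock the job needs will deadlock, just as it would with invokeAndWait.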

Read the rest of this entry »

Writing JUnit tests for memory leaks

November 6th, 2007 | Nick Miyake

Leak

Memory leaks are no fun: to find them, you usually have to run through a workflow while using a profiler like YourKit, force garbage collection, capture a memory snapshot, and then manually go through the snapshot looking for objects that you expected to be gone but are still there.

Even after you’ve finally fixed the memory leak, how can you make sure that the issue doesn’t resurface later? Every good developer knows that, if you fix a bug, you should probably also have a JUnit test for that bug so that it doesn’t happen again. But how can you test for memory leaks programmatically?

Find out after the jump…
Read the rest of this entry »
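
One widely used way to check for a leak in a test is to hold only a WeakReference to the object and see whether the garbage collector clears it. The sketch below takes that approach. It may or may not match the technique in the full post, the LeakCheck helper is hypothetical, and System.gc() is only a hint to the JVM, so the check retries a few times.

```java
import java.lang.ref.WeakReference;

/** Hypothetical helper; System.gc() is a request, not a guarantee, hence the retries. */
final class LeakCheck {
    private LeakCheck() { }

    /** Returns true if the referent was garbage collected within a few GC attempts. */
    static boolean isCollected(WeakReference<?> ref) {
        for (int attempt = 0; attempt < 10 && ref.get() != null; attempt++) {
            System.gc();
            try {
                Thread.sleep(10); // give the collector a moment to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }
}
```

In a JUnit test you would run the workflow, drop your own strong references to the object under suspicion, and then assert that LeakCheck.isCollected(ref) is true. If something else still holds the object, the reference is never cleared and the test fails.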

