## Guess the plots!

• June 3rd, 2009
• 5:30 pm

What do these depict?

Here are two others. Different data source, different point in time, but what are they?

They are all linked in pairs – one coloured and one black linked together. They are not sports related. And they are taken from real world data. The colours are relevant and constitute a hint in their own right.

### 19 People had this to say...

• PCA
• June 3rd, 2009
• 20:26

Someone dicking around with PCA

Great idea for a puzzle!

I’ve no idea what they’re plots of. Give us a clue.

Ok, very willing to guess but could we get some tips/clues at least?

• Michi
• June 3rd, 2009
• 21:57

@PCA: Well, yes. It’s one first step in a project I have to compare dimension reduction algorithms on datasets from one specific domain.

@dysfunctor: The coloration in the first plot will help somewhat.

@Cetin: more clues will come as time goes by.

• Steve I
• June 4th, 2009
• 12:07

We’re in baseball season – well these are from baseball.
My guess at the first plot: blue indicates the location where a ball stopped after a strike or a ball. Red indicates the final location (or first bounce) of balls that got a run or put someone on base.

Second plot: if “south” on the plot is home plate, then this is a plot of where the ball was when someone was declared out. This explains the large # of points near 1st, 2nd, and 3rd base. Also, if someone is tagged out, that explains the number of points between 1st, 2nd, and 3rd base.

• Michi
• June 4th, 2009
• 13:11

Good guess, Steve I, but wrong domain.

Next hint: the first diagram is a dimension reduction of 442 vectors in an 1187-dimensional space, and the second diagram reduces 1187 vectors in a 442-dimensional space.

• John P
• June 4th, 2009
• 21:19

From the colouring it looks like political parties to me.

The 1st pair might be members of congress and votes held?

The 2nd shows the right mix of colours for the parties in the UK parliament.

• Michi
• June 4th, 2009
• 21:23

And that brings us to the right subject!
Voting records form a vector, once we choose numerical values for the votes. Done right, each MP or representative is a point in some very high dimensional vector space – so we can use dimension reduction methods to get a geometrical view of the politics.

In the US, it is extremely linear. The left-right axis of the two dominant parties is almost the only feature relevant in the point set. In fact, the vertical axis in the red/blue plot is about two orders of magnitude less relevant than the horizontal axis.
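For anyone who wants to try this at home, here is a minimal sketch of the idea, not the code behind the plots. It assumes ayes are encoded as +1 and noes as -1, and it runs on synthetic data: two hypothetical parties that follow a party line, with about 5% of votes rebelling. The function name `pca_2d` and all variable names are made up for illustration.

```python
import numpy as np

def pca_2d(votes):
    """Project the rows of `votes` (members x votes) onto two principal components."""
    X = np.asarray(votes, dtype=float)
    X = X - X.mean(axis=0)                       # centre each vote (column)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    coords = U[:, :2] * s[:2]                    # 2D coordinates per member
    explained = s**2 / np.sum(s**2)              # share of variance per axis
    return coords, explained

# Synthetic stand-in for real voting records (assumption, not real data).
rng = np.random.default_rng(0)
n_members, n_votes = 40, 100
party = np.repeat([1.0, -1.0], n_members // 2)        # two hypothetical parties
party_line = rng.choice([-1.0, 1.0], size=n_votes)    # party A's position per vote
votes = np.outer(party, party_line)                   # party B votes the opposite way
rebels = rng.random((n_members, n_votes)) < 0.05      # roughly 5% of votes rebel
votes[rebels] *= -1

coords, explained = pca_2d(votes)
print("variance share of first two axes:", explained[:2])
```

With data this polarised, the first axis should carry most of the variance and split the two parties by sign, which is the same kind of near-linear picture described for the US plot.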

Now, though, what are the black plots?

Sweden?

• Michi
• June 5th, 2009
• 9:20

John: No, though I’m working on getting hold of Swedish data. A Swedish plot would have been appropriately colorized.

Remember I mentioned that the black plots in some sense belonged paired up with the party plots.

• Dimitry
• June 7th, 2009
• 16:33

France!

• Michi
• June 7th, 2009
• 18:41

Dimitry: Nope.

The black plots are not merely New Countries, they are qualitatively different.

• Malcolm
• June 9th, 2009
• 8:07

The black plots are some kind of Fourier transform of the coloured plots? But I’m having trouble imagining what that might mean visually.

• Michi
• June 9th, 2009
• 9:07

Malcolm: I don’t actually know enough about Fourier transforms to tell whether this is the case. There is, however, a sense of duality to the black points vis-à-vis the coloured ones.

• Colin
• June 27th, 2009
• 20:37

Is it the views of their constituents?

• Michi
• June 27th, 2009
• 21:05

Nope.

It is the corresponding plots for the issues in the vector space spanned by the MPs.
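In code, this duality amounts to transposing the vote matrix before reducing it. The sketch below is only an illustration under assumptions: the matrix is random placeholder data with the 442 × 1187 shape from the earlier hint, PCA via SVD stands in for whichever reduction was actually used, and `project_2d` is a hypothetical helper name.

```python
import numpy as np

def project_2d(X):
    """Centre the columns of X and project its rows onto two principal axes."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :2] * s[:2]

# Placeholder data (assumption): rows are members, columns are votes,
# entries are -1 (no), 0 (absent), +1 (aye).
rng = np.random.default_rng(1)
votes = rng.choice([-1.0, 0.0, 1.0], size=(442, 1187))

member_points = project_2d(votes)     # one point per member: the coloured plots
issue_points = project_2d(votes.T)    # one point per vote: the black plots

print(member_points.shape, issue_points.shape)
```

Random data has no party structure, so these particular point clouds are shapeless; the point is only that the same routine yields one plot per orientation of the matrix.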

• gao yan
• June 30th, 2009
• 2:51

Hi Michi, I am unable to read the full article of your post http://blog.mikael.johanssons.org/archive/2009/05/1-manifolds-and-curves/. The figure and the rest of the article after it do not show up. Can you take a look or send a copy of the full article to my email? Thanks!

Hi Michi,

This is really interesting. Can you tell me where/how you got your source data? Also, if you have time, a little more information on what you’re doing with the data would be great!

Thanks
Chris (@crntaylor on twitter)

• Michi
• December 8th, 2009
• 18:52

The data for these was downloaded from The Public Whip for the UK data, and embarrassingly, I don’t remember where I grabbed the Canadian data – but I think it might have been off of How’d they vote.

The Canadian parliament now has their data up online, queryable at least.

I’ve a running idea about writing a paper, comparing different analysis techniques on political data – PCA vs. non-linear methods such as LLMAP or Laplacian eigenmaps. Not much is happening with the project right now – but that’s why I try to aggregate and analyze voting data.