
Michi’s blog » archive for 'R'

 Mapping zipcodes in R

  • May 13th, 2009
  • 3:51 pm

I started fiddling around with R again, and ended up playing with a zipcode database.

So, first I downloaded the zipcode database at Mapping Hacks, and unpacked the zipfile in my working directory.

Then I loaded the data into R:

> zips <- read.table("zipcode.csv",sep=",",quote="\"",header=TRUE)
> names(zips)
[1] "zip"       "city"      "state"     "latitude"  "longitude"
[6] "timezone"  "dst"      
 

So, now I have an R data frame containing a lot of US cities, their geographical coordinates, and their zip codes, and we can start playing with the plot command! After rooting around a bit, I settled on the smallest plotting symbol I could make R produce, by setting pch=20 in the plot options. Hence, I ended up with a command basically like this:
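A minimal sketch of such a plot, assuming longitude goes on the x-axis and latitude on the y-axis, might look like the following. The three-row data frame is a hypothetical stand-in for the real zips, with rough illustrative coordinates, so that the snippet runs without zipcode.csv; it is not necessarily the exact command used for the post.

```r
# Hypothetical stand-in for the data frame read from zipcode.csv.
# Column names follow the names(zips) output; the rows are illustrative only.
zips <- data.frame(
  zip       = c("10001", "60601", "94105"),
  latitude  = c(40.75, 41.89, 37.79),
  longitude = c(-73.99, -87.62, -122.39)
)

# One small filled dot (pch = 20) per zip code,
# longitude across and latitude up.
plot(zips$longitude, zips$latitude, pch = 20,
     xlab = "longitude", ylab = "latitude")
```

With the full database loaded, the same plot call should work unchanged on the real zips frame.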

 R and topological data analysis

  • August 23rd, 2008
  • 4:38 pm

This is extremely early playing around. It touches on things I’m going to be working with at Stanford, but at this point, I’m not even up to toy level.

We’ll start by generating a dataset. Essentially, I’ll take the trifolium, sample points on the curve, and then perturb each point ever so slightly.

idx <- 1:2000
theta <- idx*2*pi/2000
a <- cos(3*theta)
x <- a*cos(theta)
y <- a*sin(theta)
xper <- rnorm(2000)
yper <- rnorm(2000)
xd <- x + xper/100
yd <- y + yper/100
cd <- cbind(xd,yd)
 

As a result, we get a dataset that looks like this:
[Figure: Trifolium data]

So, let’s pick a sample from the dataset. What I’d really want to do now is the witness complex construction, but I haven’t figured out enough about how R ticks to do quite that. So we’ll pick a sample and then build the 1-skeleton of the Rips-Vietoris complex using Euclidean distance between points. This means we’ll draw a graph on the dataset with an edge between two sample points whenever they are within ε of each other.
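A rough sketch of that construction in base R, under a few assumptions of mine: the sample size k, the threshold eps, and the seed are arbitrary illustrative choices, and the dataset is regenerated inside the snippet so that it runs on its own. This is one straightforward way to build the graph, not necessarily the one the post goes on to use.

```r
set.seed(1)  # assumed seed, only so that reruns give the same sample

# Rebuild the perturbed trifolium dataset described above.
n <- 2000
theta <- (1:n) * 2 * pi / n
a <- cos(3 * theta)
cd <- cbind(xd = a * cos(theta) + rnorm(n) / 100,
            yd = a * sin(theta) + rnorm(n) / 100)

# Assumed parameters: number of sample points and the distance threshold.
k <- 100
eps <- 0.2

# Pick k points uniformly at random, without replacement.
pts <- cd[sample(n, k), ]

# Full k-by-k matrix of pairwise Euclidean distances.
d <- as.matrix(dist(pts))

# One row (i, j) with i < j for every pair of points within eps of each other.
edges <- which(d <= eps & upper.tri(d), arr.ind = TRUE)

# Draw the sample points, then one line segment per edge of the 1-skeleton.
plot(pts, pch = 20, asp = 1)
segments(pts[edges[, 1], 1], pts[edges[, 1], 2],
         pts[edges[, 2], 1], pts[edges[, 2], 2])
```

The full distance matrix takes k² entries, which is fine for a sample of this size but would get expensive if the graph were built on all 2000 points.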