Michi’s blog » read post

R and topological data analysis

  • August 23rd, 2008

This is extremely early playing around. It touches on things I’m going to be working with in Stanford, but at this point, I’m not even up on toy level.

We’ll start by generating a dataset. Essentially, I’ll take the trefolium, sample points on the curve, and then perturb each point ever so slightly.

idx <- 1:2000
theta <- idx*2*pi/2000
a <- cos(3*theta)
x <- a*cos(theta)
y <- a*sin(theta)
xper <- rnorm(2000)
yper <- rnorm
xd <- x + xper/100
yd <- y + yper/100
cd <- cbind(xd,yd)
 

As a result, we get a dataset that looks like this:
Trifolium data

So, let’s pick a sample from the dataset. What I’d really want to do now would be to do the witness complex construction, but I haven’t figured enough out about how R ticks to do quite that. So we’ll pick a sample and then build the 1-skeleton of the Rips-Vietoris complex using Euclidean distance between points. This means, we’ll draw a graph on the dataset with an edge between two sample points whenever they are within ε from each other.

So we pick a sample from this sample. Every 31 points might be good. (number arrived at by guessing wildly, and drawing the resulting images until they looked pretty enough)

csamp <- cd[seq(1,dim(csamp)[1],31),]
 

We’d get, from this, the following result:
Trifolium sample

Now, we’ll want to build the corresponding skeleton. Let’s do it for a few different εs to demonstrate the difference.

d <- function(x,y,z,w) { sqrt((x-z)^2+(y-w)^2) }
par(mfrow=c(2,2))
eps <- c(0.05,0.1,0.15,0.2)
cols <- c("cyan","green","yellow","red")
for (ei in 1:length(eps)) {
  plot(cd,col="gray")
  points(csamp,col="blue")
  title(eps[ei])
  for (i in 1:dim(csamp)[1]) {
    for (j in 1:dim(csamp)[1]) {
      x <- csamp[i,1]; y <- csamp[i,2]
      z <- csamp[j,1]; w <- csamp[j,2]
      e <- eps[ei]
      if(d(x,y,z,w) < 2*e) {
        segments(x,y,z,w,col=cols[ei])
      } else {
      }
    }
  }
}
 

The result is:
Trifolium skeleta

We notice that as the radius we observe grows, we connect all the loops, but by the time the loops are completely connected, there are also cross connections forming toward the middle. However, with any luck, these cross connections will be so short-lived, in terms of the radii we use, so that the homology classes we extract from the Rips-Vietoris complexes are noticably more persistent.

Doing the corresponding computation relies on me figuring enough out to use the Plex software suite, or writing my own, and thus will be subject of a much later blog post. This one was mainly “Look – I use R” and “Look – pretty pictures”. Enjoy.

Want your say?

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

post navigation
about
Michi is a recent PhD working in homological algebra and applied algebraic topology. This blog is his outlet for texts with some manner of thought put into them. Over at his LiveJournal intimate details and streams of consciousness might be found.
Not all here is mathematics. All here, though, are my personal thoughts and opinions. Please read the about page (linked above) for more details.
This blog uses statcounter.com for logging and traffic analysis. In order to identify return visitors, this site will issue a cookie on viewing the blog.
RSS Travel plans
Recent Comments
Tags
Categories
Blogroll
Family
Mathematician blogs
Archives
the rdc* theme