Satellite trail identification in GalaxyZoo images

Last year I attended the .dotastro 5 conference in Boston (and I will be at the successor, .dostastro 6 this year in Chicago). It was a really fun conference to attend, since astronomy is a field I’m trying to eventually be able to contribute to in a non-amateurish way. Coming from the computational side of the house, what better way to jump in than to find some interesting computational problems that are relevant to the astronomy community!

On hack day at the conference, I joined in with the group working on the ZooniBot – a program developed the previous year to aid the users of the Zooniverse forums by automatically answering questions without the need for moderators to manually jump in and respond. Early in the day on Hack Day, I asked people in the room what would be useful things for ZooniBot to be able to do that it couldn’t already do. While I didn’t spend much time on it on Hack Day, one of the problems that I really liked and has followed me since then was suggested by Brooke Simmons and is intended to address a common question that comes up on the GalaxyZoo site.  Consider the following image:

A satellite trail
A satellite trail

To the eye of a Zooniverse user who is new to the Galaxy Zoo, it is quite likely that this would stand out as unusual. What is the streak? These common artifacts are caused by satellites zipping along in the field of view while the image was being captured. A satellite appears as a bright streak that flashes across the image, and due to its speed, frequently traverses the frame while one color filter is engaged – which is why the bright streaks tend to look green. Of course, a few lucky satellites wandered past while other filters were engaged.

A red trail
A red trail

In some cases, two artifacts crossed the field of view resulting in trails of different colors.

Two trails in one image with different colors.
Two trails in one image with different colors.

How do we build an automatic system that can inform curious GalaxyZoo users that these images contain satellite trails? Given that the image data set is huge and unannotated with metadata about what kinds of artifacts that they contain, such a question requires the automatic system to look at the image and perform some kind of image analysis to make a guess as to whether or not the image contains an artifact like a trail. At first blush, this seems relatively straightforward.

For this specific kind of artifact, we can make a couple of observations:

  1. The artifacts appear as straight lines.
  2. The artifacts commonly appear in one color channel.

The natural building block for an automated detector is the Hough transform. The basic concept behind the Hough transform is that we can take an image and compute its corresponding Hough image. For example, in the example of the two lines above, the corresponding Hough image is:

Hough image
Hough image

In the source image, we have a set of pixel values that have an (x,y) location as well as an intensity value in each of the color channels (r,g,b). Before applying the Hough transform, we map each pixel to a binary value indicating whether or not the pixel is bright enough to be considered on or off. This is achieved by applying some form of image segmentation that maps the RGB image to a binary image. In this example, I used Otsu’s method for computing the best threshold for each image.  Once the binary image is available, the Hough transform looks at every line that goes through the image by varying the angle of the line over [0,pi], and for every offset from the upper left corner of the image to the upper right corner.  The results is the Hough image we see above, where the X axis corresponds to the angle, and the Y axis corresponds to the line offset.  The intensity for each angle/offset combination is the number of pixels that were set to 1 in the binary image along the corresponding line.  As we can see, there are two bright spots.  Looking more closely at the lines that correspond to those bright spots (known as “Hough peaks”), we see that they match relatively well to the two lines that we see in the blue and green channels.

Hough transform results on an image with two trails.
Hough transform results on an image with two trails.

Once we’ve applied the Hough transform to identify the lines in an image, we can then extract out the pixel values in the original image along the detected lines.  For example, from the image above where we had two lines (one blue, one green), we can see the intensity in each of the three color channels in the plots below.

Blue line extracted from the image with two trails.
Blue line extracted from the image with two trails.
Green line from the image with two trails.
Green line from the image with two trails.

Once we have extracted lines, we are left with a set of candidate lines that we’d like to test to see if they meet the criteria for being a trail. The important thing to do is filter out line-like feature that appear that aren’t actually trails, like galaxies that are viewed edge-on. For example, the following image will yield a strong line-like feature along the center of the galaxy.

A galaxy viewed edge on that is detected as a line.
A galaxy viewed edge on that is detected as a line.

A simple filter is achieved by taking all pixels that lie upon each detected line and computing some basic statistics – the mean and standard deviation of the intensity along the line. If our goal is to find things that are dominant in one color channel, then we can ask the following: is the mean value for one color channel significantly higher than the other two channels? The heuristic chosen in the filter currently tests if the channel with the highest mean is at least one standard deviation (computed on its intensities) from the others.

Unfortunately, the data set has curveballs to throw at this test. For example, sometimes trails are short and don’t span the full image.

A trail that spans only part of the image.
A trail that spans only part of the image.

These trails are harder to detect with a basic heuristic on the mean and standard deviation of intensities along the line since the portion of the detected line that covers the region of the image where the trail is missing drag the mean down and push the standard deviation up. Even worse, there are images that for whatever reason have saturation in a single channel all over the image, meaning that any line that is detected ends up passing the heuristic test.

An image with high saturation everywhere in a single color channel.
An image with high saturation everywhere in a single color channel.

Clearly something a bit more discerning than heuristics based on basic summary statistics is necessary.  This work is ongoing, and will hopefully eventually lead to something of value to the GZ talk community.  In the meantime, I’ve put the code related to this post up on github for folks curious about it.  If this topic if of interest to you, feel free to drop me an e-mail to see what the status is and what we’re currently up to with it.  I’m eager to see what new problems like this that .dotastro 6 will send my way.



Basic astronomy databases with FSharp

This is a short post about a small experiment migrating some Python code for working with astronomical databases to FSharp.  The basic task that I want to perform is to take an object identifier for an object from the Sloan Digital Sky Survey (SDSS) DR10 and look up what it is known to be. For example, say I am presented an image like the following example from the GalaxyZoo image set:


Given just the image and its object identifier (1237645941297709234), we might be curious to learn a few things about it:

  • Where in the sky did it come from?
  • What kind of object is it?

Answering these questions requires a bit of work.  First, we need to query the SDSS database to retrieve the (ra, dec) coordinates of the object.  Once we have this, it is possible to then go to another database like SIMBAD to learn if it is a known astronomical object, and if so, what kind of object it is.

Both the SDSS and SIMBAD databases are accessible via HTTP queries, making programmatic access easy.  In this post I’m using their basic interfaces.  SDSS offers a number of access methods, so there is likely to be a cleaner one than I’m using here – I’m ignoring that for the moment.  SIMBAD on the other hand presents a relatively simple interface that seems to predate modern web services, so instead of well structured JSON or some other format, dealing with its response is an exercise in string parsing.

To start off, I defined a few types that are used to represent responses from SIMBAD.

type SimbadObject =
  | S_Galaxy 
  | S_PlanetaryNebula 
  | S_HerbigHaro 
  | S_Star 
  | S_RadioGalaxy
  | S_GalaxyInGroup 
  | S_GalaxyInCluster 
  | S_Unknown of string

type SimbadResponse  = 
  | SimbadValid of SimbadObject
  | SimbadError of string
  | SimbadEmpty

Interpreting the SIMBAD object type responses was a simple exercise of matching strings to the corresponding constructors from the SimbadObject discriminated union.

let interpret_simbad_objstring s =
  match s with
  | "PN" -> S_PlanetaryNebula
  | "HH" -> S_HerbigHaro
  | "Star" -> S_Star
  | "RadioG" -> S_RadioGalaxy
  | "Galaxy" -> S_Galaxy
  | "GinGroup" -> S_GalaxyInGroup
  | "GinCl" -> S_GalaxyInCluster
  | _ -> S_Unknown s

Before moving on, I needed a few helper functions. The first two exist to safely probe lists to extract either the first or second element. By “safely”, I mean that in the case that a first or second element doesn’t exist, a reasonable value is returned. Specifically, I make use of the usual option type (Maybe for the Haskell crowd).

let getfirst l =
  match l with
  | [] -> None
  | (x::xs) -> Some x

let getsecond l =
  match l with
  | [] -> None
  | (x::xs) -> getfirst xs

I could roll these into a single function, “get_nth”, but as I said, this was a quick exercise and these functions play a minor role in things so I didn’t care much about it. Another utility function that is required is one to take a single string that contains multiple lines and turn it into a list of lines, excluding all empty lines. This function should also be agnostic about line terminators: CR, LF, CR-LF all should work.

let multiline_string_to_lines (s:string) =
  s.Split([|'\r'; '\n'|])
  |> Array.filter (fun s -> s.Length > 0)
  |> Array.toList

With these helpers, we can finally write the code to query SIMBAD. This code assumes that the FSharp.Data package is available (this is accessible via Nuget in VS and under mono on Mac/Linux). Given a coordinate (ra,dec), we can define the following function:

let simple_simbad_query (ra:float) (dec:float) =
  let baseurl = ""
  let script = "format object \"%OTYPE(S)\"\nquery coo "+string(ra)+" "+string(dec)
  let rec find_data (lines:string list) = 
    match lines with
    | [] -> SimbadEmpty
    | (l::ls) -> if l.StartsWith("::data::") then
                    match getfirst ls with
                    | None -> SimbadEmpty
                    | Some s -> SimbadValid (interpret_simbad_objstring s)
                 elif l.StartsWith("::error::") then
                    match getsecond ls with
                    | None -> SimbadEmpty
                    | Some s -> SimbadError s
                    find_data ls

  |> multiline_string_to_lines 
  |> find_data

The first two lines of the function body are related to the SIMBAD query – the base URL to aim the request at, and the script that will be sent to the server to execute and extract the information that we care about. The script is parameterized with the ra and dec coordinates that were passed in. Following those initial declarations, we have a recursive function that spins over a SIMBAD response looking for the information that we wanted. When all goes well, at some point in the SIMBAD response a line that looks like “::data::::::::” will appear, immediately followed by a line containing the information we were actually looking for. If something goes wrong, such as querying for a (ra,dec) that SIMBAD doesn’t know about, we will have to look for an error case that follows a line starting with “::error::::::”. In the error case, the information we are looking for is actually the second line following the error sentinel.

In the end, the find_data helper function will yield a value from a discriminated union representing SIMBAD responses:

type SimbadResponse = 
  | SimbadValid of SimbadObject
  | SimbadError of string
  | SimbadEmpty

This encodes valid responses, empty responses, and error responses in a nice type where the parameter represents the relevant information depending on the circumstance.

With all of this, the simple_simbad_query function body is formed from a simple pipeline in which an HTTP request is formed from the base URL and the query script. This is fed into the function to turn a multiline string into a string list, and then the recursive find_data call is invoked to scan for the data or error sentinels and act accordingly. Nothing terribly subtle here. What is nice though is that, in the end we get a well typed response that has been interpreted and brought into the FSharp type system as much as possible. For example, if an object was a galaxy, the result would be a value “SimbadValid S_Galaxy”.

A similar process is used to query the SDSS database to look up the (ra, dec) coordinates of the object given just its identifier.

let simple_sdss_query (objid: string) =
    let sdss_url = ""
    let response = Http.RequestString(sdss_url,
                                             "cmd","select * from photoobj where objid = "+objid],
    let jr = JsonValue.Parse(response)

    let elt = jr.AsArray() |> Array.toList |> List.head

    let first_row = elt.["Rows"].AsArray() |> Array.toList |> List.head
    let ra, dec = (first_row.["ra"].AsFloat()) , (first_row.["dec"].AsFloat())
    ra, dec

As before, we form the request to the given URL. Fortunately, SDSS presents a reasonable output format – instead of a weird textual representation, we can ask for JSON and take advantage of the JSON parser available in FSharp.Data. Of course, I immediately abuse the well structured format by making a couple dangerous but, in this case, acceptable assumptions about what I get back. Specifically, I immediately turn the response into a list and extract the first element since that represents the table of results that were returned for my SQL query. I then extract the rows from that table, and again collapse them down to a list and take the first element since I only care about the first row. What is missing from this is error handling for the case when I asked for an object ID that doesn’t exist. I’m ignoring that for now.

Once we have the row corresponding to the object it becomes a simple task of extracting the “ra” and “dec” fields and turning them into floating point numbers. These then are returned as a pair.

Given this machinery, it then becomes pretty simple to ask both SDSS and SIMBAD about SDSS objects. Here is a simple test harness that asks about a few of them and prints the results out.

let main argv = 
    let objids = ["1237646585561219107"; "1237645941297053860"; "1237646586102349867"; "1237646588244918500"; "1237646647297376587"; "1237660558135787607"; "1237646586638827702"; "1237657608571125897";

    for objid in objids do
        let ra, dec = simple_sdss_query objid
        let objtype = simple_simbad_query ra dec
        let objstring = match objtype with
                        | SimbadValid s -> "OBJTYPE="+(sprintf "%A" s)
                        | SimbadError s -> "ERROR="+s
                        | SimbadEmpty   -> "Empty Simbad response"
        printfn "RA=%f DEC=%f %s" ra dec objstring


The resulting output is:

RA=77.538556 DEC=-0.946330 OBJTYPE=S_Galaxy
RA=62.098838 DEC=-1.075872 OBJTYPE=S_Galaxy
RA=87.319430 DEC=-0.591992 ERROR=No astronomical object found :
RA=76.117486 DEC=1.164739 ERROR=No astronomical object found :
RA=71.760602 DEC=0.066689 OBJTYPE=S_GalaxyInCluster
RA=69.348799 DEC=25.042414 OBJTYPE=S_PlanetaryNebula
RA=86.456666 DEC=-0.086512 OBJTYPE=S_HerbigHaro
RA=122.733827 DEC=36.829334 OBJTYPE=S_RadioGalaxy
RA=345.357080 DEC=-8.465958 OBJTYPE=S_Galaxy

A few closing thoughts. My goal with doing this was to take advantage of the type system that FSharp provides to bring things encoded as strings or enumerated values into a form where the code can be statically checked at compile time. For example, we have three possible SIMBAD responses: valid, error, or empty. Using discriminated unions allows me to avoid things like untyped Nil values or empty strings, neither of which capture useful semantics in the data. I’ve also isolated the code that maps the ad-hoc string representations used in the database responses in specific functions, outside of which the string-based nature of the response is hidden such that the responses can be consumed in a type safe and semantically meaningful manner. An unanticipated response will immediately become apparent due to an error in the string interpretation functions, instead of potentially percolating out into a function that consumes the responses leading to hard to debug situations.

Of course, there are likely better ways to achieve this – either better FSharp idioms to clean up the code, or better interfaces to the web-based databases that would allow me to use proper WSDL type providers or LINQ database queries. I’m satisfied with this little demo though for a two hour exercise on a Saturday night.

Updates coming for out-of-date public code

For anyone who has read any of my blog posts from the last couple years and tried to run the code unsuccessfully, this is my fault.  I didn’t include proper cabal files with each program to ensure that you had the right versions of the libraries they depend on when you try to build them.  As such, nothing is there to stop you from unsuccessfully attempting to build them against the latest and greatest library versions on Hackage, a few of which have changed in API-incompatible ways.  This will be repaired in the not-too-distant future.  In the meantime, the safest way to make sure they build is to look at the blog posting date, and install the version of the library from hackage that was current around that time.  As far as I know, the only major problem dependency is the Gloss visualization library which underwent a few minor API changes that broke a number of my examples.

2012 Summer School on Geometry and Data

This summer in Moscow, Idaho, I will be participating in a summer school focused on mathematical methods in data analysis from the geometric perspective.  The title is the “2012 Summer School on Geometry and Data.”  Specifically, from the description of the program:

The topic of the summer school will be research at the intersection of geometric measure theory and geometric analysis with data analysis, especially in the presence of uncertainty.

This should be very interesting, as it is often the case that finding structure and information in a large pile of data is a question of understanding its fundamental geometric properties.  I’ll be participating as one of the people who comes at the problem from the computational side.  Check out the page for the summer school here, which has links to the poster for the session as well as a document that summarizes the topics to be covered.  Students can find information on applying on the page as well.

Simulation, Dynamics, and the FPU Problem

I recently wandered by Powell’s Technical Books (one of the many perks of working in Portland is having this nearby) and left with the book “The Genesis of Simulation in Dynamics: Pursuing the Fermi-Pasta-Ulam Problem“. I think it’s a pretty interesting book relative to others that I’ve picked up on dynamics and chaotic systems. The book revolves around one problem and follows its development since it was first posed in the 1950s as the field of dynamics unfolded around it. I figured I’d give a quick description of the problem that the book discusses here and possibly interest others to grab the book.

The model that the book discusses is known as the Fermi-Pasta-Ulam (or FPU) problem. It is quite simple. Consider a 1D string that is represented as a sequence of line segments connected at n vertices. The connections between vertices are basically springs that obey Hooke’s law. In the model, the focus is on the displacement of these vertices over time. For the case where all vertices are equally spaced, the string doesn’t move at all since it is in an equilibrium state. When they are not uniformly spaced, interesting things start happening. The following video shows the case where the vertices are displaced based on a sine wave.

(Note: the slowing and speeding up of the video is artificial and not part of the model, but is due to variations in compute time per frame on my computer since I recorded this as a screen capture).

The model is defined by tracking the second derivative of the position of the vertices. The unperturbed linear model is described as:

\ddot{q}_j = (q_{j+1}-q_j) + (q_{j-1}-q_j)

Adding in a perturbation that is nonlinear is what introduces the interesting behavior:

\ddot{q}_j = (q_{j+1}-q_j) + (q_{j-1}-q_j) + \alpha \left[(q_{j+1}-q_j)^2 - (q_j - q_{j-1})^2\right]

The parameter alpha allows us to control the amount that this nonlinear term contributes to the evolution of the system. In my example movies, I let alpha be somewhere around 0.5. In Matlab, this is easily written as:

function fpu(n,alpha,s,dt,iters)
  % initial positions for n points on the line 
  % and two additional points for the fixed ends
  q = (0:n+1)./(n+1);
  qrest = (0:n+1)./(n+1);

  % velocities start as zero
  qvel = zeros(1,n+2);
  % perturbation added with a sinusoid
  q = q + 0.1.*sin(s.*q .* pi);

  % make sure ends are fixed at 0 and 1
  q(n+2) = 1; q(1) = 0;
  for i=1:iters
    % first term of q"
    qterm1 = q(3:end)-q(2:end-1);
    % second term of q"
    qterm2 = q(1:end-2)-q(2:end-1);

    % qacc is q".  note that (a-b)^2 = (-(b-a))^2
    qacc = qterm1 + qterm2 + alpha .* (qterm1.^2 - qterm2.^2);

    % velocity is velocity + acc
    qvel(2:end-1) = qvel(2:end-1) + qacc;
    % position is updated by velocity * time step
    q(2:end-1) = q(2:end-1) + qvel(2:end-1).*dt;

Adding a few strategic plot commands at the end of that loop lets you visualize it and see the plot I showed above.

The goal of the model originally was to look at how the energy in the system moved between the different vibrational modes of the system. As can be seen pretty clearly in the video above, over time, the single low frequency sinusoid that the system starts with evolves into a more complex motion where higher frequency modes gain energy and contribute more to the state of the system.

We can look at the energy in each mode by writing the Hamiltonian of the system as:

H_0(a,\dot{a}) = \frac{1}{2}\sum_{k=1}^n(\dot{a}_k^2 + \omega_k^2a_k^2)


\omega_k = 2 sin \left(\frac{k \pi}{2(n+1)}\right)

So, if we compute each component of the sum and look at them individually (instead of looking at the overall energy of the system represented by H_0), we can see how the different modes contribute to the overall energy of the system. The code above can be easily modified to add this computation after the update to the q vector:

    for k=1:n
        a(i,k) = (sqrt(2/(n+1))) * sum(q(2:end-1).*sin((1:n)*k*pi/(n+1)));
        ap(i,k) = (sqrt(2/(n+1))) * ...
    h(i,:) = 0.5 .* (ap(i,:).^2 + omega.^2 .* a(i,:).^2);

Here is a look at the energy of modes 2-8 for the first 30000 iterations of the Matlab code:

As we can see, over time some of the higher frequency modes begin to gain more energy. This is apparent in the video as well.

Just for fun, we can also look at an instance of the problem where we intentionally start off with a perturbation. In this case, we start off with initial conditions that cause the right-most link in the chain to be stretched further than the rest, leading to a wave that moves back and forth down the string. The impact of this on the energy distribution is interesting as well. This video was created using the code above, but instead of passing a nice integer value for the parameter s (like 1.0), I passed in a value like s=1.3.

Looking at the plot of the energy in each mode, we see that now things are less regular. For example, look at the plot for mode 3 – instead of being a nice smooth peak, the shape of the peaks over time changes. This plot spans the first 3000 iterations of the model for this case, allowing us to look a bit more carefully at the behavior at a fine time scale.

Overall, I think this problem is an interesting one to play with and use to learn a little about dynamical systems. I’m not done with the book yet, but so far it is pretty interesting. Fortunately, these days we have computers far faster than those that Fermi and friends had, so it is possible to casually play with problems like this that were quite challenging to work on back in the earliest days of computing.

Harmonic Motion Simulation

(Edit: This article has been revised to fix a problem with the original in which the model didn’t properly reproduce the video of the real system. That’s all fixed now.)

The model described in this post is pretty simple. Recently I found a pointer to this page that demonstrated harmonic motion with a collection of pendulums of different lengths. The apparatus shown here is based on a system described by Richard Berg in the article “Pendulum waves: A demonstration of wave motion using pendula” from the American Journal of Physics, Volume 59, Issue 2, 1991. Here is the video from the Harvard page showing the apparatus at work:

It seemed like an interesting and simple physical system, so I decided to see if I could replicate it computationally. This is fairly basic physics – this system is probably early undergrad level physics and math, right around the time when you first encounter non-linear partial differential equations.

To start, let’s remember what the equation of motion is for a damped pendulum (using Newton’s notation for derivatives):

\ddot{\theta} + \gamma \dot{\theta} + \frac{g}{l} sin(\theta) = 0

In this equation, \gamma is the damping coefficient, g is gravitational acceleration, and l is the length of the pendulum. In the video shown at the end, we assume that the damping coefficient is zero, so our little computational pendulum exists in a vacuum. If you download the code and set it to a non-zero value, you will see a nice damping out of the motion over time as one would expect in reality due to air resistance.

In the simulation, the variable we are concerned with is \theta:

The second order partial differential equation that describes this system isn’t one that is easily solvable directly, so we resort to methods from numerical analysis to attack it. In this case, it is common to employ a fourth order Runge-Kutta scheme (see this document for a derivation of the Runge-Kutta scheme from the second order PDE). To start, we rewrite the equation in two parts:

\omega = \dot{\theta}
\dot{\omega} = -\gamma \omega - \frac{g}{l}sin(\theta)

Now that we have the system in terms of simultaneous first order equations, we can use the Runge-Kutta scheme to define a sequence of \theta and \omega values that are easily computable as:

\theta_{n+1} = \theta_n + \frac{1}{6}(k_{1a} + 2k_{2a} + 2k_{3a} + k_{4a})
\omega_{n+1} = \omega_n + \frac{1}{6}(k_{1b} + 2k_{2b} + 2k_{3b} + k_{4b})

Where we have the four coefficients for each defined for a time step \Delta t as:

k_{1a} = \omega_n \Delta t
k_{2a} = (\omega_n + \frac{k_{1b}}{2}) \Delta t
k_{3a} = (\omega_n + \frac{k_{2b}}{2}) \Delta t
k_{4a} = (\omega_n + k_{3b}) \Delta t


k_{1b} = f(\theta_n, \omega_n, t) \Delta t
k_{2b} = f(\theta_n + \frac{k_{1a}}{2}, \omega_n + \frac{k_{1b}}{2}, t + \frac{\Delta t}{2}) \Delta t
k_{3b} = f(\theta_n + \frac{k_{2a}}{2}, \omega_n + \frac{k_{2b}}{2}, t + \frac{\Delta t}{2}) \Delta t
k_{4b} = f(\theta_n + k_{3a}, \omega_n + k_{3b}, t + \Delta t) \Delta t


f(\theta, \omega, t) = -\gamma \omega - \frac{g}{l}sin(\theta)

At this point, writing the code is really straightforward. First, define f. One change that we will make is that we will also specify the length of the pendulum, l, as a parameter. This won’t impact the derivations above though – it just makes it easier later on to compose together the set of individual pendulums of different lengths to mimic what we see in the video.

g :: Double
g = 9.80665 -- gravitational acceleration = 9.80665m/s^2

f :: Double -> Double -> Double -> Double -> Double
f theta omega t l = (-gamma)*omega - (g/l)*sin(theta)

Note that the t value isn’t used. We keep it for completeness, given that f as formulated formally is a function of \ddot{\theta}, \dot{\theta}, \theta, and t. Once we have this, we can then define the code that implements one step of the Runge-Kutta solver for a given time step dt:

dt :: Double
dt = 0.01

solve :: (Double, Double, Double, Double) -> (Double, Double, Double, Double)
solve (theta, omega, t, l) =
  (theta + (1/6)*(k1a + 2*k2a + 2*k3a + k4a),
   omega + (1/6)*(k1b + 2*k2b + 2*k3b + k4b),
   t', l)
    t' = t + dt
    k1a = omega * dt
    k1b = (f theta omega t' l) * dt
    k2a = (omega + k1b/2) * dt
    k2b = (f (theta + k1a/2) (omega + k1b/2) (t' + dt/2) l) * dt
    k3a = (omega + k2b/2) * dt
    k3b = (f (theta + k2a/2) (omega + k2b/2) (t' + dt/2) l) * dt
    k4a = (omega + k3b) * dt
    k4b = (f (theta + k3a) (omega + k3b) (t' + dt) l) * dt

Now, the original video that inspired me to spend my Friday evening playing with this model included a set of pendulums of different lengths. The Berg paper describes the lengths as follows: the longest pendulum is intended to achieve X oscillations over a time period of T seconds. Each shorter pendulum should achieve one additional oscillation over that same time period. Now, we know that the time for one oscillation of a pendulum of length l is given by:

t = 2 \pi \sqrt{\frac{l}{g}}

From this, we can derive an equation for the length of pendulum n (where n runs from 1 to N, for N pendula with the Nth longest):

l_n = g\left(\frac{T}{2 \pi (X+(N-n))}\right)^2

In the Berg paper, X is 60 and T is 54 seconds, which we can then implement:

lengths :: Int -> Double -> Double -> [Double]
lengths n t l =
  let thelen curn = (t/((l+((fromIntegral n)-curn)) * 2.0 * pi))**2 * g
      [thelen (fromIntegral i) | i <- [1..n]]

theta0 = pi / 8
omega0 = 0
npendu = 24
starts = map (\i -> (theta0, omega0, t, i)) (lengths npendu 54.0 60.0)

So, every pendulum starts at \theta = \frac{\pi}{8}, and we have 24 pendulums that take on lengths corresponding to those in the paper.

In order to watch the code run and replicate (as best as possible with this simple simulation) the video, I wrapped this code with Gloss to add simple visualization capabilities. The code is pretty simple: I set up some constants to define the window size, and then convert the angle into (x,y) coordinates via (x,y) = (l sin(\theta), -l cos(\theta)) where l is the length of the pendulum being rendered.

-- window parameters
winWidth :: Int
winWidth  = 600

winHeight :: Int
winHeight = 600

-- circle radius
cradius   = 5

-- colors for the pendulum and the string
clr = makeColor 1.0 1.0 0.0 1.0
lineclr = makeColor 0.3 0.3 0.3 0.5

-- render one pendulum
renderPendulum :: (Double, Double, Double, Double) -> Picture
renderPendulum (theta, omega, t, l) =
  let x = double2Float $ l*sin theta
      y = double2Float $ -(l*cos theta)
      twidth = ((fromIntegral winWidth) / 2)-15
      theight = ((fromIntegral winHeight) / 2)-15
    Pictures $ [
      Color lineclr $
        Line [(0,0), (x*twidth,y*theight)],
      Color clr $
        Translate (x*twidth) (y*theight) $
        Circle cradius ]

-- render the list of pendulums
renderPendulums ps = Pictures $ map renderPendulum ps

This is then called from the main function that sets up the initial conditions for each, and then uses the Gloss simulateInWindow function to do the animation. Note that we don’t use the time parameter of the simulateInWindow function, and control the time stepping using the dt value defined before.

main :: IO ()
main = do
  let niter = 1000
      theta0 = pi / 8
      omega0 = 0
      t = 0
      npendu = 15
      starts = map (\i -> (theta0, omega0, t, i)) (lengths npendu 54.0 60.0)
    (winWidth, winHeight)
    (1, 1)
    (greyN 0.1)
    (\vp f m -> map solve m)

The result of this is shown in the video below.

If you are interested in playing with this code, you can find it in my public github repository. Enjoy playing with the pendula!

Numerical optimization informed by computational cost for simulation tuning

Here’s another short post that refers interested people to a recently accepted paper for publication.

M. A. Abramson, T. J. Asaki, J. E. Dennis, Jr., R. Magallanez, and M. J. Sottile. “Efficently solving computationally expensive optimization problems with CPU time-related functions”, to appear in Structural and Multidisciplinary Optimization.

A preprint lives here. The abstract is:

In this paper, we characterize a new class of computationally expensive optimization problems and introduce an approach for solving them. In this class of problems, objective function values may be directly related to the computational time required to obtain them, so that, as the optimal solution is approached, the computational time required to evaluate the objective is significantly less than at points farther away from the solution. This is motivated by an application in which each objective function evaluation requires both a numerical fluid dynamics simulation and an image registration process, and the goal is to find the parameter values of a predetermined reference image by comparing the flow dynamics from the numerical simulation and the reference image through the image comparison process. In designing an approach to numerically solve the more general class of problems in an efficient way, we make use of surrogates based on CPU times of previously evaluated points, rather than their function values, all within the search step framework of mesh adaptive direct search algorithms. Because of the expected positive correlation between function values and their CPU times, a time cutoff parameter is added to the objective function evaluation to allow its termination during the comparison process if the computational time exceeds a specified threshold. The approach was tested using the NOMADm and DACE MATLAB software packages, and results are presented.

The basic problem that this paper is related to is parameter optimization for simulations, specifically those that involve fluid dynamics models. The figure above shows a snapshot of a fluid model for the simple lid-driven cavity problem that we used in the paper. In a number of problems that I encountered in my work at LANL, we were faced with the question of finding the optimal set of parameters for a simulation to match data that was experimentally obtained. Often this included algorithmic parameters, like mesh and grid resolutions, along with solver parameters (e.g., convergence criteria), and physical parameters. What is interesting is that a number of parameters have a direct impact on compute time. For example, resolving a grid to a finer resolution will result in a higher per-iteration compute time. The question that we posed was, what can one do if attacking the problem of both tuning parameters based on observed experimental data and identifying parameters that minimize compute time. That’s what this paper is all about.