Random sciency or techy thoughts.

Over a year ago I analyzed restaurant reviews in Budapest and came up with a complementary way to rank businesses, based on the 4- and 2-star reviews rather than the average score. I followed up by repeating this analysis for Toronto. Below are the results (in particular a comparison of the two cities) and some new thoughts.

Walking along long, mildly busy streets in Budapest, I noticed that *3-body* encounters were surprisingly common. A 2-body encounter is when two people walking in opposite directions meet, or when one walking in the same direction passes the other. These seemed rare enough (only a few per minute on my daily route), so I got curious why a relatively large fraction of them involved a third person walking independently. The full mathematical description of the problem can be found here, and a simple applet that calculates the encounter rate using a toy model can be found here.

In more detail, a 3-body encounter is defined as the moment in time at which the largest pairwise separation among the three bodies is smallest, and such an encounter is considered “interesting” if this separation is below some threshold *ε*. In the paper, I calculated the rate of 3-body encounters undergone by a reference body with velocity *v*_{0}, given that all bodies move at constant speeds in one dimension, their density is *n* per unit length, and the probability density function of their velocities is *f*(*v*). The result is that the rate depends linearly on *ε* and quadratically on *n*.

The figure shows the 2- and 3-body encounter rates as a function of the reference body’s speed in a toy model representing pedestrian movement. The other bodies’ velocity distribution is made up of two rectangular functions of width 2 km h^{-1} centered at ±5 km h^{-1}; additionally, *n* = 120 km^{-1} represents a mildly busy street in Budapest, and *ε* = 0.5 m represents a lower limit on comfortable personal space.
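The 2-body rate in such a 1D model is easy to check numerically: a reference body at speed *v*_{0} meets others at rate *n*·⟨|*v* − *v*_{0}|⟩. A minimal sketch, assuming the two-rectangle velocity distribution and the parameter values from the figure (the function names are mine, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_velocities(size):
    """Draw speeds [km/h] from two rectangular (uniform) components
    of width 2 km/h centred on +5 and -5 km/h, with equal weight."""
    centres = rng.choice([-5.0, 5.0], size=size)
    return centres + rng.uniform(-1.0, 1.0, size=size)

def two_body_rate(v0, n=120.0, samples=100_000):
    """2-body encounter rate [1/h] for a reference body at speed v0:
    density n [km^-1] times the mean relative speed |v - v0|."""
    v = sample_velocities(samples)
    return n * np.mean(np.abs(v - v0))
```

For a stationary observer (*v*_{0} = 0) the mean relative speed is 5 km h^{-1}, so the rate is roughly 600 encounters per hour at this density.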

Read more... (PDF, 475 kB)

I frequently encounter the problem of having to find the probability density function (*pdf*) of some quantity which is a function of several independent random variables with known distributions. For some cases, such as the addition of two variables, there exists a relatively simple formula (the new pdf is just the convolution of the two), but there is no general magic formula.

In more mathematical language, I need a procedure to find the pdf of a function \(y\) where \(y : \mathbb{R}^n \rightarrow \mathbb{R}\). If \(n=1\), this is quite easy and involves only inverse functions and derivatives. To find an analytical solution when \(n>1\), one generally has to obtain an expression for the cumulative distribution of \(y\), and whether this is possible analytically depends on how complicated the function is.
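The addition case is easy to verify numerically: sample the two variables, histogram the sum, and compare against the convolution of the two pdfs. A minimal sketch with two unit uniforms (chosen only for illustration; their convolution is the triangular pdf on [0, 2]):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two independent U(0, 1) variables; their sum should follow the
# triangular pdf, i.e. the convolution of the two uniform pdfs.
x = rng.uniform(0.0, 1.0, 1_000_000)
y = rng.uniform(0.0, 1.0, 1_000_000)

hist, edges = np.histogram(x + y, bins=50, range=(0.0, 2.0), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])

# Analytic convolution: f(s) = s on [0, 1], 2 - s on [1, 2].
triangular = np.where(centres <= 1.0, centres, 2.0 - centres)
```

With a million samples the histogram matches the analytic triangle to better than a percent per bin.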

I enjoy good food, and Budapest has a lot to offer in terms of eating out. Choosing a restaurant is a difficult task, and the choice is often based on a single number: the score of the restaurant in an online list. This score is usually a weighted arithmetic mean of customers’ one- to five-star reviews, as is the case with TripAdvisor. But how reliable is this number? Let’s find out. In this figure I plot the “standard” average score versus an alternative (complementary) definition.
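For concreteness, the kind of score meant here can be sketched as follows (the star counts and the function name are made up for illustration, not taken from TripAdvisor):

```python
def weighted_mean_score(counts):
    """counts[k] = number of k-star reviews, k = 1..5.
    Returns the weighted arithmetic mean of the star ratings."""
    total = sum(counts.values())
    return sum(stars * n for stars, n in counts.items()) / total

# Hypothetical restaurant: mostly 5-star reviews plus a tail of 1-star ones.
example = {5: 120, 4: 40, 3: 10, 2: 5, 1: 25}
score = weighted_mean_score(example)
```

A handful of 1-star reviews barely moves this single number, which is part of why a complementary statistic is interesting.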

If you dig straight down, you reach the center of the Earth in 6400 km, but what happens if you dig at a shallow but constant angle? Intuitively the result should be a spiral, and it’s easy to calculate: we just need to assume some constant digging “speed” and a constant ratio of the radial and tangential velocity components, given by the tangent of the angle. In radial coordinates,
\begin{align*} \dot{r} & =-v\sin\beta\\ r\dot{\phi} & =v\cos\beta. \end{align*}
The first equation is trivial to integrate:
\[ r(t)=r_{0}-vt\sin\beta. \]
The second is also pretty easy:
\begin{align*} \dot{\phi} & =\frac{v\cos\beta}{r_{0}-vt\sin\beta}\\ \phi(t) & =\phi_{0}-\cot\beta\log\left(1-\frac{v\sin\beta}{r_{0}}t\right). \end{align*}
We can now also show that it’s a logarithmic spiral by isolating \(t\) from \(r(t)\) and substituting into \(\phi(t)\):
\begin{align*} \phi(r) & =\phi_{0}-\cot\beta\log\left(\frac{r}{r_{0}}\right)\\ r(\phi) & =r_{0}\exp\left[(\phi_{0}-\phi)\tan\beta\right]. \end{align*}
The length element is just \(v\,\mathrm{d}t\), and from here it’s clear that the length of the tunnel to the center of the Earth is
\[ L=\frac{r_{0}}{\sin\beta}. \]
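The closed-form length is easy to sanity-check numerically: trace the spiral from the formulas above and sum the chord lengths. A sketch assuming β = 10° (the angle and the speed value are arbitrary choices for illustration; the speed cancels out):

```python
import numpy as np

r0 = 6400.0                # km, starting radius (Earth's radius)
beta = np.radians(10.0)    # constant digging angle below the horizontal

# Closed form: tunnel length to the centre.
L = r0 / np.sin(beta)

# Numerical check: sample r(t) and phi(t) on a fine grid up to the
# time the centre is reached, and sum the straight-line segments.
v = 5.0                                  # km per unit time, arbitrary
t_end = r0 / (v * np.sin(beta))          # time at which r = 0
t = np.linspace(0.0, t_end, 1_000_001)
r = r0 - v * np.sin(beta) * t
# clip guards the log at the final point, where its argument hits zero
phi = -np.log(np.clip(1.0 - v * np.sin(beta) * t / r0, 1e-12, None)) / np.tan(beta)
x, y = r * np.cos(phi), r * np.sin(phi)
L_numeric = np.sum(np.hypot(np.diff(x), np.diff(y)))
```

At β = 10° the tunnel is already about 36,900 km long, nearly six times the straight-down route, and the numerical chord sum agrees with \(r_0/\sin\beta\) to a fraction of a percent.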

Long-haul flights can be confusing. You start in one time zone and hours later find yourself in another, but for the duration of the flight you exist in time-zone limbo. Airlines serve breakfast, lunch, or dinner at somewhat arbitrary times, and the Sun’s position in the sky is not always helpful (flights between East Asia and North America pass within the Arctic Circle, where the Sun can be up at midnight).

Sometimes there’s no deep meaning behind a beautiful picture. This image is just an illustration of numerical error. I used my code to calculate the gravitational force in a triaxial ellipsoid; the error distribution when using single vs. double precision is bimodal, and the blue and red dots are points around the two peaks.
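As a toy illustration of the general kind of single- versus double-precision discrepancy involved (not the actual ellipsoid code, which is not reproduced here): accumulate many inverse-square contributions in both precisions and compare.

```python
import numpy as np

rng = np.random.default_rng(2)

# Random field points; a softened inverse-square sum stands in for a
# generic force-like accumulation.
pts = rng.uniform(-1.0, 1.0, size=(100_000, 3))
r2 = np.sum(pts**2, axis=1) + 1e-4   # softening avoids division by zero

f64 = np.sum(1.0 / r2)                                   # double precision
f32 = np.sum(1.0 / r2.astype(np.float32), dtype=np.float32)  # single precision

rel_err = abs(float(f32) - f64) / f64
```

The relative error lands near float32 machine precision; in the real calculation, where large contributions nearly cancel, the error distribution can be far richer, hence the bimodal structure in the image.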

This is part of a gravitational *N*-body simulation done with the NBODY6++ code, stored using the BTS scheme and visualized
with Blender. The simulation has 16,000 point particles, but I show only
50 of them as large spheres with random colors (so the video is quite misleading). The effective
gravitational force points toward the center, since that is the combined pull of so many particles in a spherical configuration.
The reason that at some point they all suddenly change speed and (mostly) move toward the center is that the
rendering software has its own clock (the frame count), which may or may not match the data output from the
simulation software. If in a certain frame the position of a particle is unknown, Blender guesses the position
by interpolation (which here is linear but could be more sophisticated). To test that this interpolation
works, I took data with a large gap, and the result is what you would expect: if particles move only a little
between snapshots, the interpolation is fine, but after this large gap the positions are more or less
randomized, so in the video all particles move in straight lines to their new and rather arbitrary positions.
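Schematically, the interpolation Blender performs between two snapshots amounts to this (a minimal sketch, not Blender’s actual implementation):

```python
import numpy as np

def interpolate_position(t, t0, t1, x0, x1):
    """Linearly interpolate a particle position at time t between
    two snapshot positions x0 (at t0) and x1 (at t1), t0 <= t <= t1."""
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * np.asarray(x0) + w * np.asarray(x1)
```

With closely spaced snapshots the guessed positions track the true curved orbits; across a large gap the two endpoints are nearly unrelated, so every particle appears to fly in a straight line, which is exactly the artifact in the video.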

I made this visualization using medical data. The CD-ROM from the radiology lab had the CT scan in
a rather odd data format called DICOM, which is apparently
a standard for medical imaging. I used a Python package called pydicom to read it into a 3D numpy array representing the
density (or opacity to X-rays, or something like that). The array’s shape was 512x512x465 and the values were
integers from 0 to 4095. The scan resolution was about 30.2 dpi (which for a 2D document would be extremely
poor). Strangely, along the *z*- (spinal) axis the resolution is slightly different, 33.9 dpi. I used
MayaVi to plot an isosurface to my liking. I had to
play a lot with the contour value: if too low, too many soft tissues show up; if too high, the costal
cartilages (the cartilage that connects each rib to the sternum) don’t show. Unfortunately,
even with this careful choice, the 3D model had quite a lot of soft tissue, which would have been very
difficult to exclude in the Python script. So I exported the model and used Blender to remove the soft tissue manually. This was quite a hassle,
but once that was done it was relatively easy to set up this rotation animation in Blender and render it in high
quality.
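For reference, the dpi figures above are just 25.4 mm per inch divided by the voxel spacing in mm; the spacing values below are inferred from the quoted dpi, not read from the file:

```python
MM_PER_INCH = 25.4

def spacing_to_dpi(spacing_mm):
    """Convert a voxel spacing in millimetres to 'dots per inch'."""
    return MM_PER_INCH / spacing_mm

# An in-plane spacing of ~0.84 mm gives the ~30.2 dpi quoted above;
# a slice spacing of ~0.75 mm gives ~33.9 dpi along the z-axis.
```

In DICOM terms these would come from the pixel-spacing and slice-spacing attributes of the dataset.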

This is a simple script with two functions that translate between AQI and PM_{2.5} in
μg/m^{3}. See Gareth’s investigation
into pollution in Beijing and how it affects visibility.

```
#!/usr/bin/env python
import numpy as np

def PiecewiseLinear(X, Y, x):
    """Map x through the piecewise-linear function defined by the
    breakpoints (X, Y); out-of-range input is returned unchanged."""
    try:
        i = np.argwhere(x <= X)[0, 0]
    except IndexError:
        return x
    if X[i] == x:
        return Y[i]
    return Y[i-1] + (Y[i] - Y[i-1])/(X[i] - X[i-1])*(x - X[i-1])

# Breakpoints: PM2.5 concentrations [ug/m^3] and the matching AQI values.
C = np.array([0, 12, 35.5, 55.5, 150.5, 250.5, 350.5, 500])
I = np.array([0., 50, 100, 150, 200, 300, 400, 500])
AQI2ugmc = lambda x: PiecewiseLinear(I, C, x)
ugmc2AQI = lambda x: PiecewiseLinear(C, I, x)

if __name__ == '__main__':
    from pylab import *
    Palette = ['#00e400', '#ffff00', '#ff7e00', '#ff0000', '#99004c', '#7e0023', '#7e0023']
    for i in range(len(I) - 1):
        plot(I[i:i+2], C[i:i+2], color=Palette[i], lw=2)
    xlabel(r'$\mathrm{AQI}$')  # $...$ for font consistency
    ylabel(r'$\mathrm{PM_{2.5}\ [\mu g \, m^{-3}]}$')
    savefig('AQI.svg')
    show()
```

I gave a KIAA pizza talk about Mars in January 2015. This was an overview of topics related to Mars that were being discussed at the time. I curated much of the information from various Wikipedia articles and NASA press releases.

The focus is the Martian atmosphere, transient detection of methane, and the Martian meteorite ALH84001.

I was a core member of the local organizing committee of the 312^{th} Symposium of the International
Astronomical Union (IAU), *Star Clusters and Black Holes in Galaxies across Cosmic Time* (see link), which took place at the National Science Library of the
Chinese Academy of Sciences in Beijing in summer 2014. While I didn’t choose this mouthful of a name for the
conference, I designed the lovely poster and was chief editor of the conference
proceedings (hardcover by Cambridge University Press, ISBN-13: 9781107078727).

The poster design was done mostly in Inkscape. I ended up liking the result, but it was quite a hassle, and I had to try to satisfy too many people on the committee; I will certainly think twice before volunteering for such a task again. The proceedings editing was done in LaTeX, with a lot of Python scripting to administer the many proceedings papers and assemble them into a single work.