This is apparently becoming a series on making various parts of my reference-managing toolchain play nicely together.  One of the great features of Mendeley is that it can automatically sync up your library with a BibTeX file.  One of the annoying things about Mendeley is that you can’t control what fields it puts in that file.  Why is this annoying?  Because apacite overloads the @article entry type to cover both journal articles and newspaper/magazine articles.  These have different APA citation formats, though: one should include the month of publication, while the other shouldn’t.  The only way to control which format is produced is to include or omit the month in the BibTeX entry.

At this point you might be wondering why I haven’t just switched to something a little more sensible for doing APA citations in LaTeX.  I probably will.  But in the meantime, here’s the solution I’ve cobbled together: monitor the library.bib file that Mendeley exports for changes, and then remove all the month lines.  The monitoring can be accomplished very easily on Mac OS X using launchd, and the removing is done with a shell script:

cd /Users/dkleinschmidt/Documents/papers
# remove month and url lines in bib entries
egrep -v '^(month|url)' library.bib > library-clean.bib
# write a message to the log
/usr/bin/logger library.bib modified and cleaned
# this whole solution was adapted from
# you'll have to change the hard-coded locations to match wherever you put these
# files and wherever Mendeley exports the .bib file.
# install the watcher using launchctl:
launchctl load /Users/dkleinschmidt/Documents/papers/clean-library-auto.plist
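The watcher plist itself isn’t shown above, but a minimal launchd job might look something like this (a sketch, not the original file: it assumes the shell script above is saved as an executable clean-library.sh in the same papers directory, and the job label is made up):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <!-- hypothetical label; any unique reverse-DNS string works -->
    <key>Label</key>
    <string>com.example.clean-library</string>
    <!-- run the cleanup script... -->
    <key>ProgramArguments</key>
    <array>
        <string>/Users/dkleinschmidt/Documents/papers/clean-library.sh</string>
    </array>
    <!-- ...whenever Mendeley rewrites library.bib -->
    <key>WatchPaths</key>
    <array>
        <string>/Users/dkleinschmidt/Documents/papers/library.bib</string>
    </array>
</dict>
</plist>
```

Once it’s loaded with launchctl, launchd will fire the script every time the watched file changes; you can check the system log for the “modified and cleaned” message to confirm it’s working.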

Latexdiff is a script that produces a word-by-word diff of two versions of a latex file that can be compiled to mark up additions and deletions.  It generally works well for me except for one important caveat: it doesn’t recognize the citation command syntax from apacite, the package I use to get APA-style citations.  Luckily, latexdiff is open-source, so I could just fork the repository and fix it.

Latexdiff uses a latex-aware difference algorithm that treats latex commands as single “words”. The problem is that apacite allows citations of the form

\cite<text before citation>{AuthorYear}

Normally, a \cite command is only followed by arguments in […] or {…}. Latexdiff sees the < and thinks that the command is over and that it can safely split things there.  That separates the \cite command from its arguments in the diff, which then refuses to compile.
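For concreteness, here’s the kind of citation that trips it up (a sketch with a hypothetical citation key; in apacite, the angle-bracket text becomes a prenote inside the parentheses):

```latex
% hypothetical key Smith2010; renders roughly as "(see Smith, 2010)"
\cite<see>{Smith2010}
```

Latexdiff splits this snippet right after \cite, so the marked-up diff ends up with a bare \cite command and orphaned <see>{Smith2010} text, and the compile fails.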

The fix is easy and should be incorporated into the next version that’s released. But in the meantime, if you’re suffering as I was you can clone my fix:

git clone
cd latexdiff
git checkout anglebracket-cite-command
ln -s "$PWD/latexdiff" /usr/local/bin/latexdiff   # or wherever you like

At long last, here is a reasonably complete demo version of the online experiment paradigm I’ve been using.

HLP/Jaeger lab blog

I’ve developed some JavaScript code that somewhat simplifies running experiments online (over, e.g., Amazon’s Mechanical Turk). There’s a working demo, and you can download or fork the source code to tinker with yourself. The code for the core functionality which controls stimulus display, response collection, etc. is also available in its own repository if you just want to build around that.

If you notice a bug, or have a feature request, open an issue on the issue tracker (preferred), or comment here with questions and ideas. And, of course, if you want to contribute, please go ahead and submit a pull request. Everything’s written in HTML, CSS, and JavaScript (+JQuery) and aims to be as extensible as possible. Happy hacking!

If you find this code useful for your purposes, please refer others to this page. If you’d like to cite something to acknowledge this code or your…


I finally updated my academic website after, like, a year. One thing that I had been meaning to do was make little drop-down things for the projects/topics I’m working on or interested in. It’s pretty straightforward with JQuery, but I couldn’t find a ready-made solution for what I wanted to do, so I figured it might be useful to post it here.

What I wanted was a title for each little section, which could be clicked to expand or collapse that individual section. The most obvious solution is to toggle the visibility of each section’s div every time it’s clicked on, but I didn’t want to futz with fancy CSS to make the headline for each section remain visible when the rest was hidden. Instead, I just toggle the visibility of the siblings of the headline, which is everything else contained in the same section as the headline.

Here is the JavaScript:

$(document).ready(function() {
    $(".sliding > h3.slidingtitle").click(function() {
        $(this).siblings().toggle();
    });
});

The “.sliding > h3.slidingtitle” selector identifies the h3 elements that have class “slidingtitle” and live inside something with class “sliding” (that last part probably not strictly necessary). Here’s what each section’s HTML looks like:

<div class="sliding">
    <h3 class="slidingtitle">Headline goes here</h3>
    <p>Content goes here</p>
    <p>...yet more content</p>
</div>

That was easy, eh? JQuery sure is nice.

My tribe—the data nerds—is feeling pretty smug right now, after Nate Silver’s smart poll aggregation totally nailed the election results. But we’re also a little puzzled by the cavalier way in which what Nate Silver does is described as just “math”, or “simple statistics”.  There is a huge amount of judgement, and hence subjectivity, required in designing the kind of statistical models that 538 uses. I hesitate to bring this up because it’s one of the clubs idiots use to beat up on Nate Silver, but 538 does not weight all polls equally, and (correct me if I’m wrong) the weights are actually set by hand using a complex series of formulae.

The point is that the kind of model-building that Nate Silver et al. do is not just “math”, but science. This is why I don’t really like that XKCD comic that everyone has seen by now. Well, I like the smug tone, because that is how I, a data scientist, feel about 538’s success. That is right on. But we’ve known that numbers work for a long time. Nate Silver and 538 are not just about numbers, about quantifying things. Pollsters have been doing that for a long time. It is about understanding the structured uncertainty in those numbers, the underlying statistical structure, the interesting relationships between the obvious data (polling numbers) and the less obvious data (economic activity, barometric pressure, etc.) and using that understanding to combine lots of little pieces of data into one, honkin’, solid piece of data. It is about teasing apart the Signal and the Noise. There are an infinity of ways to combine all the polling numbers that 538 aggregates, and let’s just say there is another infinity’s worth of ways to take all that data and make predictions about what will happen in the space of variables that we ultimately care about (like, “who is President in 2014”). It’s not like Nate Silver just sits at his desk with his TI-83 and types in percentage after percentage.

In fact, Joseph Fruehwald makes this point clearly and elegantly, by quantitatively comparing the 538 predictions with a simple average of the very same polls that 538 aggregates to make those predictions. The 538 prediction is something like twice as good (in RMSE terms), and is especially good where either candidate outperformed the polls, meaning that Nate Silver’s “special sauce” contributes something substantial. Nate Silver isn’t some kind of prophet; there are other poll aggregators who did comparably well. But this whole enterprise is about a lot more than just “using numbers to determine which of two things is bigger”.

I think a good analogy can be made with the whole Sabermetrics trend in baseball (which Nate Silver was involved in, of course). There are lots of ways that a baseball player can be quantified: height, total biomass of body hair, red blood cell count, RBI, slugging percentage, etc. Some of these are very useful in quantifying the individual contribution of a player to the team’s success—and hence their monetary value—while others are not. Knowing which numbers to put into your model, and how, is a step beyond just having the numbers, and that takes some knowledge about the domain—what the numbers mean.

Now that it is November (complete with glorious lake effect snow/rain/snain in Rochester), I’m thinking about the holidays, and, most pertinently, the associated foodstuffs. Today I’m making my mom’s cinnamon swirl bread, which I think started as a special Christmas treat but has expanded to fill the surrounding months, which are cold and generally in need of cheer.

It’s a pretty great yeast bread recipe, and has some sugar in it so it’s pretty hard to mess up. Also, each slice has a swirl of cinnamon-sugar through it (hence, you know, the name), so it looks super fancy without being too technically difficult.

Here is the recipe, straight from mom:

Orange Cinnamon Swirl Bread

6 cups bread or all-purpose flour, approximately
2 packages dry yeast
1/3 cup nonfat dry milk
½ cup granulated sugar
1 ½ teaspoons of salt
1 ¼ cups of hot water
½ stick butter, room temperature, or softened in the microwave
1 tablespoon grated orange peel
¾ cup orange juice
1 egg, room temperature

1 tablespoon ground cinnamon mixed with ½ cup sugar
2 teaspoons of water

Measure 2 cups flour into a large mixing or mixer bowl and add the dry ingredients. Pour in the hot water and stir vigorously to blend into a thin batter. Add the butter, orange peel, orange juice, and egg.

Add flour ¼ cup at a time, stirring with strong strokes after each addition until the dough becomes a rough shaggy mass that can be turned out onto a floured work surface.

Knead for 8 minutes. Add a bit more flour if the moisture works through the surface and sticks to the work counter.

Place the dough in a greased bowl, turning the dough to be certain it is filmed on all sides. Cover the bowl tightly with plastic wrap and put aside until the dough has doubled in bulk (approximately 1 hour).

Punch down the dough. Turn onto a floured surface, and divide in two. Cover with wax paper and let rest 10 minutes. (I don’t do this!)

Roll each piece into a 15” by 7” rectangle. Each will be about ½” thick. Spread each piece with the cinnamon-sugar mixture, and sprinkle with 1 teaspoon of water. Smooth with a spatula. Roll up from the narrow side. Seal the edges securely by pinching tightly along the seams. Tuck in the ends and place seam side down in the pans.

Cover the pans with wax paper and let stand until the dough has doubled in bulk, about 45 minutes.

Preheat the oven to 375˚ 20 minutes before baking.

Bake for 10 minutes, then reduce the heat to 325˚ and bake for 30 minutes more, or until the loaves are nicely browned and sound done when tapped on the bottom with a forefinger: the sound will be hard and hollow. (I actually cover the loaves with a large piece of aluminum foil after the first 10 minutes to keep them from burning on top, and remove it during the last ten minutes.) Remove from pans and cool.

If you know me at all you know that I love LaTeX.  It lets you specify the content and logical structure of your document and takes care of making it look nice, including tricky mathematical expressions, yadda yadda yadda.

One thing that LaTeX is really bad at, though, is font support.  This isn’t a problem for the most part, and I’ve actually come to prefer the look of Computer Modern (the only real font ever created using Knuth’s Metafont language, and instantly recognizable to LaTeX geeks the world over).  But if you need to, say, typeset something in Arial (shudder), there isn’t exactly an easy way to do it, and god forbid that you might want to use a font that’s not freely available (like, just an arbitrary example, Times New Roman).

Enter our hero, XeTeX (and its big sibling, XeLaTeX).  XeLaTeX extends the low-level typesetting engine of TeX to use modern font/typography technology.  This includes support for Unicode, and for super-slick typography technologies like OpenType and Apple Advanced Typography (AAT), which allow the typesetting of scripts with complicated rules for combining symbols (like Tibetan) and different writing directions (e.g. right-to-left scripts such as Arabic and Hebrew).

The really great thing about XeTeX, for my current purposes, is that it allows you to typeset almost anything using any font installed natively on your computer.  That is, XeTeX essentially adds that drop-down font-selection menu that every other text editor has.

As an example, I will show you how stupid-easy it is to typeset a document in 12 pt. Times New Roman (with 1-inch margins, not that it matters):

\usepackage{fontspec}
\setmainfont[Mapping=tex-text]{Times New Roman}

Yes, folks, it really is that easy: just two lines of code (above and beyond the usual documentclass/geometry combo), and nothing whatsoever that needs to be converted using FontForge, etc.  The only trick is that, instead of running latex, you run xelatex (which is easy to automate using emacs and AUCTeX).

As you can probably tell from the snippet above, the package you want to use is fontspec, which is the LaTeX interface for XeTeX’s font-specification system, and its documentation has lots of good examples.
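Putting the pieces together, a minimal complete document might look like this (a sketch: it assumes Times New Roman is installed on your system, and it must be compiled with xelatex rather than latex):

```latex
\documentclass[12pt]{article}
\usepackage[margin=1in]{geometry}
% fontspec is the LaTeX interface to the fonts installed on your system
\usepackage{fontspec}
% tex-text keeps the classic TeX ligatures (--, ---, quotes) working
\setmainfont[Mapping=tex-text]{Times New Roman}
\begin{document}
Hello, Times New Roman.
\end{document}
```

Swap in any font name your system’s font book shows, and xelatex will pick it up without any conversion step.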

Dork out

my ride, freshly dorkified in anticipation of winter by Dave Kleinschmidt

It’s funny how a set of fenders, a tool bag, a couple of lights, and a water bottle can take a bike from almost-cool to completely dorky.

Look out for updates on living in Rochester, grad school, etc. in the near future. Or just read my tweets.

(Link: photo from flickr)

Circles by Dave Kleinschmidt

What better to do when snowed in than make thick, delicious barley soup? This recipe is from the Moosewood New Classics cookbook and is really hearty. You can use just about any vegetables you want as long as you end up with 6-7 cups—I used kale, potatoes, sweet potatoes, peppers, and carrots.


  • 1/2 c unhulled raw barley (rinsed and picked over)
  • 7 c water
  • 3 T olive oil
  • 2 c chopped onions
  • 1/4 t salt
  • 1 1/2 c cubed white potato
  • 1/2 c diced celery
  • 1 c diced red or yellow bell peppers
  • 1 c peeled and diced carrots
  • 1 c cut green beans (1-inch pieces)
  • 1 c cubed yellow or green summer squash
  • 1 c chopped mushrooms
  • 1/4 t dried marjoram
  • 1/2 t dried thyme
  • 2 T dry sherry
  • 3 T barley miso (I just used regular miso)
  • ground black pepper to taste
  • 1/3 c chopped fresh parsley
  • chopped scallions (for garnish)

Rinse the barley and boil it in 3 c of the water until it’s tender, which should be about 1 1/4 to 1 1/2 hours. When it’s done, drain the barley. About a half hour before it’s done, start the rest of the stuff going:

Heat the oil in your soup pot and cook the onions and salt until tender and just beginning to brown (eight-ish minutes). While the onions cook, heat the other four cups water to a simmer in another pot.

Stir all the veggies into the onions until everything’s good and covered with oil. Add the herbs and sherry and cook for a couple of minutes, stirring.

Pour the simmering water into the veggies. Mix 1/2 c of the hot water with the miso in a small bowl until you have a smooth paste, and then pour the paste into the soup pot. Add pepper to taste, cover, and simmer until the veggies are tender (about 15 minutes). Add the (drained) barley and parsley, and cook for about 5 minutes more. Top with scallions and serve.

(Link: photo from flickr)


It's coming down by Dave Kleinschmidt

In case you’ve been living under a rock (or, you know, don’t obsessively follow weather-related news), DC got absolutely clobbered with about two feet of snow this weekend. Well, okay, maybe more like 18″ (at National), but that’s enough to rank among something like the fourth or fifth biggest snow storms in recorded history down here.

Fun fact, the last time I heard the phrase “potentially historically significant” being tossed around regarding a snow storm, they were forecasting six to eight feet. But that’s New England.

Anyway, I’ve been absolutely over the moon, what with all the shoveling and tromping around and midnight snow-biking (see below). This weather seems to bring out all the Northerners in DC, and to whip them into a little bit of a giddy frenzy. You can identify them by their goofy grins and non-beleaguered looks, or, you know, by the fact that they’re jogging down 13th St carrying a pair of cross-country skis (and exclaiming “good choice” over the six-pack of Bell’s you’re carrying). My downstairs neighbor (an Alaskan) and I bonded over how much we love shoveling out the three-house-long stretch of sidewalk for our building (which he did in December, when we got 16″).

I do have to admit, DC weather is pretty nice. Right now it’s spring, as far as I can tell: the sun is warm, temperatures hover around freezing, and there are occasional snow storms. I’ve even started to see robins here and there. Having an actual spring (instead of the muddy mess that passes for spring in New England) will be great, too.

But nothing compares to getting absolutely walloped by a couple of feet of snow and all of the shoveling, trudging (biking?) fun that ensues. Part of the fun is undeniably seeing the absolute panic that this weather sends people from less snowy climates into. There are literally runs on the supermarkets around here at the threat of snow, with people—I shit you not—buying up toilet paper, milk, and canned food. As one Wisconsinite on NPR put it, this kind of weather brings out the survivalist tendencies in people who don’t have to deal with it on a regular basis, which is pretty entertaining for people who know that life will go on, a little messier and a little more fun.

(Link: photo from flickr)
