Author Archives: mattjw

Outreach Talk at Birmingham Pint of Science Festival

Posted by mattjw in news

I will be giving a talk at the Tackling Epidemics Face On evening that's part of the Birmingham Pint of Science Festival. The talk will be on Tuesday 19th May at the Jekyll & Hyde. Find out more via this post on the University of Birmingham blog, which I've co-written with the evening's co-presenters.

Update: The epidemics evening is now sold out! Please keep an eye out for next year's festival.

ICWSM 2015 paper accepted – Privacy and the city: User identification and location semantics in location-based social networks

Posted by mattjw in news

Our paper with Luca Rossi, Christoph Stich, and Mirco Musolesi has been accepted at ICWSM 2015. PDF to follow. See Research page.

Cardiff NHS Hack Day 2015

Posted by mattjw in Uncategorized

Team HEW at work. Credit: Paul Clarke.

This weekend I helped build a health data visualisation webapp at the Cardiff NHS hackathon! It was put together with Will Webberley, Martin Chorley, Glyn Mottershead, and a bunch of talented Cardiff Computational Journalism students. The students have also been live blogging throughout the weekend.

Visit the working webapp here. Find the code and data on GitHub.

Coverage elsewhere:

  • Post on the Cardiff Computational Journalism blog.
  • Post by Dyfrig Williams from the Welsh Audit Office's Good Practice Exchange.
  • Photos by Paul Clarke: Sat and Sun.
  • Cardiff Computational Journalism live blog.
  • Martin's blog post.

Attending NHS Hack Day Cardiff

Posted by mattjw in news

I'll be at this year's NHS Hack Day Cardiff to build something cool with Martin Chorley, Will Webberley, and some Computational Journalism students.

NHS Hack Day Cardiff
Sat-Sun 24-25 January 2015
[Website] [Registration]



LASAGNE Multilayer Network Translation Framework (LMTF)

Posted by mattjw in talks

Presentation of a webapp (the LMTF) I developed with LASAGNE research project partners. The LMTF allows translation of popular multilayer network datasets into the LASAGNE data format.

[Slides on SpeakerDeck]

Program Committee ICWSM 2015 (Oxford, UK)

Posted by mattjw in news

I am a member of the Technical Program Committee for the next ICWSM. Please consider submitting.

ICWSM 2015
Oxford, UK
Abstracts Due: January 18, 2015
Full Papers Due: January 23, 2015
Main Conference: May 26-29, 2015

Attending SINS 2014

Posted by mattjw in news

I'll be attending SINS'14 (Social Impact through Network Science) at Lake Como, Italy on 14-17 October 2014.

Attending Workshop on Computational Models of Social Interaction

Posted by mattjw in news

I'll be attending the Workshop on Computational Models of Social Interaction at the University of Birmingham, to be held on 9th October 2014.

Talk: "Cheating at rock-paper-scissors – meta-programming in Python" Django Weekend 2014

Posted by mattjw in talks

I recently gave a talk at the inaugural Cardiff Django Weekend. It was a very successful weekend, and my first experience of a developer conference. The talk itself was on meta-programming in Python, showing some of the powerful built-in features Python has for reflection and (to an extent) self-modification. It's inspired by a virtual rock-paper-scissors competition that a fellow lecturer, Stuart Allen, runs for our first-year undergraduate students.

Abstract as follows:
In this talk we’ll explore Python’s meta-programming capabilities by building a program, GellerBot, that cheats at a virtual rock-paper-scissors tournament. The talk will demonstrate some of the neat (and occasionally awful) meta-programming features of Python, and introduce how Python represents its live execution and how a program can be inspected and manipulated on the fly. The talk is aimed at those with Python experience who are curious about meta-programming but have not explored it in depth.

[Slides on SpeakerDeck] [Demo code on GitHub]

Academic Genealogy

Posted by mattjw in Uncategorized

I compiled and designed an academic genealogy graphic and had it printed and framed as a gift. Here's how I did it. Materials for you to do your own, including scripts and an example design, are available in this repository on GitHub. You'll likely need some familiarity with Python syntax.

The framed academic genealogy.


Supervisor Family Trees

Supervisor family trees are a fun bit of academic self-indulgence. These are like real family trees, but instead of depicting parent-child associations between individuals, a supervisor family tree depicts supervisor-student associations. Exactly what constitutes supervision is an open question -- the most obvious definition is the supervision of a student's doctoral thesis, but this is a bit too narrow. The current conception of a supervised, research-based PhD degree only originated in the 19th century, but the tradition of academic mentorship is far older (e.g., we could go back at least as far as Socrates and Plato). With a wider definition we can build an academic genealogy that goes back many centuries.

Having recently completed my own PhD, I took the opportunity to explore my own academic ancestry and put together a genealogy that I could give to my supervisors as a gift. (By researching my supervisors' ancestry I'd also be researching my own -- the perfect combination of self-indulgence and altruism!)


Design of the genealogy. Full size PDF.

These academic family trees are nice because nearly all academics will be able to trace their ancestry back to at least a few notable scientists or mathematicians, in the same way that most western Europeans can trace their familial ancestry back to Charlemagne. Marin Mersenne, Isaac Newton, and Galileo Galilei are all ancestors of mine. In addition to direct ancestors, we can also look at individuals with whom one shares a common ancestor. For example, Alan Turing and Peter Hilton, both code-breakers at Bletchley Park during the Second World War, can be regarded as academic cousins as they both share Oswald Veblen as the supervisor of their respective doctoral supervisors.

The Data

The big challenge of compiling a genealogy is of course gathering the history of mentor-student relationships for those involved. Fortunately, the Mathematics Genealogy Project (MGP) has done a lot of the work for us. The MGP has mapped over 175,000 academics and their students. Although it is predominantly focused on mathematicians, it also includes academics who have made contributions in other fields, including physics, computer science, chemistry, and biology. A few people have written scripts and libraries that access this database to build a visualisation of an individual's academic genealogy. The best I've found is David Alber's Geneagrapher, which is written in Python. These scripts, however, only attempt to show an individual's direct ancestors and descendants, not any interesting academics that they may share a common ancestor with.

Including common ancestors in the genealogy is a lot more challenging. The number of individuals that have a shared common ancestor with a typical living academic is going to be huge, resulting in a lot of queries to the MGP and producing an unwieldy visualisation. Instead, we want some way of selecting a few interesting individuals to see if their ancestry can be connected to the person we're building a genealogy for (I'll call this person the focal academic for short) and then building the visualisation around that, possibly culling a few unwanted branches of the tree in the process.
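The seed-based selection described above can be sketched with networkx. This is only an illustration, not the script's actual code: the function names `shared_ancestors` and `connect` are my own, and the toy graph is built by hand here rather than crawled from the MGP. Edges point from supervisor to student.

```python
import networkx as nx


def shared_ancestors(tree, focal, seed):
    """Return the academics who are ancestors of both `focal` and `seed`."""
    return nx.ancestors(tree, focal) & nx.ancestors(tree, seed)


def connect(tree, focal, seeds):
    """Keep the focal academic's ancestry, plus the ancestry of every seed
    that shares at least one ancestor with the focal academic."""
    keep = nx.ancestors(tree, focal) | {focal}
    for seed in seeds:
        if shared_ancestors(tree, focal, seed):
            keep |= nx.ancestors(tree, seed) | {seed}
    return tree.subgraph(keep).copy()


# Toy data: the Turing/Hilton cousins from earlier, plus an unrelated pair
# (hypothetical names) that should be dropped.
tree = nx.DiGraph()
tree.add_edges_from([
    ("Oswald Veblen", "Alonzo Church"),
    ("Alonzo Church", "Alan Turing"),
    ("Oswald Veblen", "J. H. C. Whitehead"),
    ("J. H. C. Whitehead", "Peter Hilton"),
    ("Unrelated Mentor", "Unrelated Student"),
])

common = shared_ancestors(tree, "Alan Turing", "Peter Hilton")
genealogy = connect(tree, "Alan Turing", ["Peter Hilton", "Unrelated Student"])
```

Only seeds that actually connect to the focal academic contribute nodes, which is what keeps the number of queries and the size of the final picture manageable.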

Some Scripts

The Script-GenealogyMiner directory contains a Python script that crawls the MGP for a given focal academic, attempting to connect him/her to other academics. It's configured through another Python file (specified as a command-line argument) that contains configuration options. I've provided an example configuration with Alan Turing as the focal academic. To run this example, download the Script-GenealogyMiner directory and run the script with the example configuration file as its argument.


For demonstration purposes, the configuration only has a few seed academics (see SEED_ID_LIST). Seeds are academics the script will attempt to find shared ancestry with. Crawling can take a while, depending on the number of individuals to be crawled, the number of ancestors they have, and the response time of the MGP servers. The crawl with the 14 example seed academics should take less than four minutes.
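For the curious, loading a Python file as configuration can be done along the following lines. This is a sketch using the standard library's importlib, not necessarily how the script itself does it; `load_config`, the module name `genealogy_config`, and the temporary file name are made up for the example.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_config(path):
    """Import the Python file at `path` and return it as a module object."""
    spec = importlib.util.spec_from_file_location("genealogy_config", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Write a tiny stand-in configuration file so the example runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "example_config.py"
    config_path.write_text("ID_FOCAL_NODE = 8014\nSEED_ID_LIST = []\n")
    config = load_config(config_path)

print(config.ID_FOCAL_NODE)
```

In a real command-line script the path would come from `sys.argv` rather than a temporary file.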

GraphViz rendering of the Turing example with all demo seeds.

The output is a plain-text dot file describing the genealogy (as a list of nodes and edges, including some formatting instructions such as text and arrow colours) that can be imported into other applications (I used OmniGraffle) so you can do further design work. dot is a popular graph description format and is fairly well supported. If you have GraphViz installed on your system, you can have it generate a rendering via:

dot -T png > turing.png

The GraphViz rendering isn't production-quality -- for the final graphic I imported the dot file into OmniGraffle -- but it's useful when you're tweaking the crawl configuration. It takes a bit of guesswork to determine which academics might be reachable from the focal node. The configuration file allows you to specify a few different features which you'll need to play around with, so having GraphViz on hand to do quick renderings of the resulting genealogy is useful.
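If you haven't met dot before, here is roughly what such a file contains. The sketch below builds the text by hand; `to_dot` and the colour choices are purely illustrative and are not taken from the script, which may lay out its output differently.

```python
def to_dot(edges, colours):
    """Build dot text: one coloured node per academic, one edge per
    supervisor -> student relationship."""
    lines = ["digraph genealogy {"]
    for name, colour in sorted(colours.items()):
        lines.append(f'    "{name}" [fontcolor="{colour}"];')
    for mentor, student in edges:
        lines.append(f'    "{mentor}" -> "{student}";')
    lines.append("}")
    return "\n".join(lines)


edges = [
    ("Oswald Veblen", "Alonzo Church"),
    ("Alonzo Church", "Alan Turing"),
]
# Hypothetical colour scheme: focal academic in red, direct ancestors in blue.
colours = {
    "Alan Turing": "red",
    "Alonzo Church": "blue",
    "Oswald Veblen": "blue",
}

dot_text = to_dot(edges, colours)
print(dot_text)
```

Because the format is plain text, it is easy to inspect or hand-edit between crawls.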

I should note that the script builds on Geneagrapher (already included in the Script-GenealogyMiner directory), which it uses to query the online MGP database.

Taking a look in the example configuration file shows how we can configure the crawler:

  • The focal academic (ID_FOCAL_NODE): The MGP identifier of the academic for whom we are generating the genealogy. This is given in the Mathematics Genealogy Project URL for a particular academic; e.g., Turing's page ends in ...?id=8014.
  • Prospective connections (SEED_ID_LIST): A list of individuals in the MGP, again identified by their MGP identifier. The script will try to find common ancestry between the focal node and these individuals. So, given the Turing example configuration, the script will look to see if Richard Feynman and Alan Turing have a common ancestor, and if so, it will include both their ancestries in the genealogy.
  • Tree pruning (CULL_AND_ABOVE and ERASE_INDIVIDUAL): Including a particular academic can introduce a large ancestry and produce an ungainly genealogy. These two parameters (the cull list and the erasure list) allow us to prune the tree. Culling (CULL_AND_ABOVE) will remove an individual and his/her entire ancestry. Erasure (ERASE_INDIVIDUAL) will remove a particular individual but leave his/her ancestors untouched. Culling an individual (as opposed to erasing them) will also insert an ellipsis above that individual's child nodes to indicate that part of the tree was removed there.
  • Colour scheme: Colour individuals in the genealogy based on their relationship with the focal academic. This includes colouring based on whether the individual shares a common ancestor, is a direct ancestor, is a direct descendant, and so on. Colour instructions are included in the output dot file; most applications (e.g., GraphViz and OmniGraffle) should be able to interpret these instructions.
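Put together, a configuration module using the parameters above might look something like this. Treat it as a sketch: only Turing's identifier (8014) comes from the MGP URL mentioned above, every other number is a made-up placeholder rather than a real MGP ID, and I'm assuming the lists hold plain integer identifiers.

```python
# Hypothetical configuration module for the genealogy crawler.

# MGP identifier of the focal academic (Alan Turing, from ...?id=8014).
ID_FOCAL_NODE = 8014

# Academics to test for shared ancestry with the focal academic.
# Placeholder values only; substitute real MGP identifiers.
SEED_ID_LIST = [
    111111,
    222222,
]

# Remove these individuals together with their entire ancestry.
CULL_AND_ABOVE = [
    333333,
]

# Remove these individuals but leave their ancestors in place.
ERASE_INDIVIDUAL = [
    444444,
]
```

A typical workflow is to start with a handful of seeds, render the result, and then grow the cull and erasure lists as unwanted branches show up.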

You'll likely need a bit of familiarity with Python syntax to get the most out of the script. Loading a Python module for configuration is a bit of a taboo, but is convenient enough for what this script needs to do. I've used networkx to make manipulating the genealogy (which is, more formally, a directed acyclic graph) simpler, since the script needs to handle joining and splitting of subgraphs, culling of disconnected components, and some traversals for node colouring.
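To give a flavour of the graph manipulation involved, here is a minimal networkx sketch of culling, erasure, and dropping disconnected components. The function names and the way the ellipsis marker is represented are my own assumptions for illustration; the script's internals may differ. Edges point from supervisor to student.

```python
import networkx as nx

ELLIPSIS = "..."


def cull_and_above(graph, person):
    """Remove `person` and all of their ancestors, leaving an ellipsis
    marker above each of their children."""
    children = list(graph.successors(person))
    graph.remove_nodes_from(nx.ancestors(graph, person) | {person})
    for child in children:
        graph.add_edge((ELLIPSIS, child), child)


def erase_individual(graph, person):
    """Remove only `person`; their ancestors stay in the graph."""
    graph.remove_node(person)


def keep_focal_component(graph, focal):
    """Drop everything no longer connected to the focal academic."""
    component = nx.node_connected_component(graph.to_undirected(), focal)
    return graph.subgraph(component).copy()


# Toy genealogy with hypothetical single-letter academics.
genealogy = nx.DiGraph()
genealogy.add_edges_from([
    ("A", "B"), ("B", "C"), ("C", "Focal"),
    ("E", "D"), ("D", "C"),
])

cull_and_above(genealogy, "D")    # removes D and E, marks C with an ellipsis
erase_individual(genealogy, "B")  # removes B, stranding A
genealogy = keep_focal_component(genealogy, "Focal")
```

After these steps only C, the focal node, and the ellipsis marker above C remain.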

The script also takes a list of scientific prize winners. If any of these appear in the genealogy, they will be given a special colour, as per the colour scheme. Which prizes you wish to include is up to you. The Script-FindPrizeWinners directory contains a crude script that will compare the names of academics (which can be copied and pasted from a dot file for convenience) to prize winners and return any matches, so you can figure out who in your genealogy is a winner. I've included a few text files containing lists of winners (up to 2013) for various scientific prizes; namely, the Abel, Cole, Fields, Turing, and Wolf prizes. It does fuzzy string matching, since the Wikipedia lists of winners (from where the names are sourced) might have slightly different spellings to those in the MGP, so it will likely produce false positives -- please use as a starting point only.
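As an illustration of the fuzzy matching idea, the sketch below uses the standard library's difflib as a stand-in; the actual script may use a different similarity measure. The function names and the 0.85 threshold are arbitrary choices for the example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity score between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_prize_winners(academics, winners, threshold=0.85):
    """Return (academic, winner, score) triples whose names are similar
    enough to be worth checking by hand."""
    matches = []
    for name in academics:
        for winner in winners:
            score = similarity(name, winner)
            if score >= threshold:
                matches.append((name, winner, score))
    return matches


# Names as they might appear in a dot file versus a list of winners.
academics = ["Michael Atiyah", "Oswald Veblen"]
winners = ["Michael F. Atiyah", "Jean-Pierre Serre"]

matches = find_prize_winners(academics, winners)
```

Lowering the threshold catches more spelling variants at the cost of more false positives, so the output still needs a manual check.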


After a few iterations of generating a dot file, checking its GraphViz rendering, and tweaking the original configuration (e.g., adding more seeds, culling unwanted subtrees, and erasing some nodes), I went on to import the file into OmniGraffle to do a prettier design. For anyone who wants somewhere to start, I've included the final design for one of my PhD supervisors, Roger Whitaker (who, interestingly, connects to my other supervisor, Stuart Allen, through William Hopkins), in the Designs directory. It's in A-paper ratio (1:√2) but will need resizing to whatever print size is required. I had it printed on glossy A3 paper and put it in this John Lewis picture frame.

(N.b.: Kudos to Stuart for kicking off this idea by stumbling on one of the older MGP genealogy scripts.)