Author Archives: mattjw
I recently gave a talk at the inaugural Cardiff Django Weekend. It was a very successful weekend, and my first experience of a developer conference. The talk itself was on meta-programming in Python, showing some of the powerful built-in features Python has for reflection and (to an extent) self-modification. It's inspired by a virtual rock-paper-scissors competition that a fellow lecturer, Stuart Allen, runs for our first-year undergraduate students.
Abstract as follows:
In this talk we’ll explore Python’s meta-programming capabilities by building a program, GellerBot, that cheats at a virtual rock-paper-scissors tournament. The talk will demonstrate some of the neat (and occasionally awful) meta-programming features of Python, and introduce how Python represents its live execution and how a program can be inspected and manipulated on the fly. The talk is aimed at those with Python experience who are curious about meta-programming but have not explored it in depth.
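To give a flavour of the kind of trick the abstract hints at, here is a minimal sketch of a bot that peeks at its opponent's move by inspecting the caller's stack frame. This is not the GellerBot code from the talk; the names referee, honest_bot, cheating_bot, and move_a are all invented for illustration, and the sketch assumes the referee stores the first move in a local variable before calling the second bot.

```python
import inspect

# Which move beats which: key loses to value.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


def honest_bot():
    """A bot that plays fair (and predictably)."""
    return "rock"


def cheating_bot():
    """Peek at the referee's local variables via the call stack."""
    referee_frame = inspect.currentframe().f_back
    # Assumption: the referee has already stored the opponent's move
    # in a local variable called move_a.
    opponent_move = referee_frame.f_locals.get("move_a")
    # Fall back to an arbitrary move if there is nothing to peek at.
    return BEATS.get(opponent_move, "rock")


def referee(bot_a, bot_b):
    """A hypothetical, naive referee that asks each bot in turn."""
    move_a = bot_a()
    move_b = bot_b()
    return move_a, move_b


print(referee(honest_bot, cheating_bot))
```

Frame inspection like this relies on CPython implementation details, which is part of what makes it both neat and awful.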
I compiled and designed an academic genealogy graphic and had it printed and framed as a gift. Here's how I did it. Materials for you to do your own, including scripts and an example design, are available in this repository on GitHub. You'll likely need some familiarity with Python syntax.
Supervisor Family Trees
Supervisor family trees are a fun bit of academic self-indulgence. These are like real family trees, but instead of depicting parent-child associations between individuals, a supervisor family tree depicts supervisor-student associations. Exactly what constitutes supervision is an open question -- the most obvious definition is the supervision of a student's doctoral thesis, but this is a bit too narrow. The current conception of a supervised, research-based PhD degree only originated in the 19th century, but the tradition of academic mentorship is far older (e.g., we could go back at least as far as Socrates and Plato). With a wider definition we can build an academic genealogy that goes back many centuries.
Having recently completed my own PhD, I took the opportunity to explore my own academic ancestry and put together a genealogy that I could give to my supervisors as a gift. (By researching my supervisors' ancestry I'd also be researching my own -- the perfect combination of self-indulgence and altruism!)
These academic family trees are nice because nearly all academics will be able to trace their ancestry back to at least a few notable scientists or mathematicians, in the same way that most western Europeans can trace their familial ancestry back to Charlemagne. Marin Mersenne, Isaac Newton, and Galileo Galilei are all ancestors of mine. In addition to direct ancestors, we can also look at individuals with whom one shares a common ancestor. For example, Alan Turing and Peter Hilton, both code-breakers at Bletchley Park during the Second World War, can be regarded as academic cousins as they both share Oswald Veblen as the supervisor of their respective doctoral supervisors.
The big challenge of compiling a genealogy is of course gathering the history of mentor-student relationships for those involved. Fortunately, the Mathematics Genealogy Project (MGP) has done a lot of the work for us. The MGP has mapped over 175,000 academics and their students. Although it is predominantly focused on mathematicians, it also includes academics who have made contributions in other fields, including physics, computer science, chemistry, and biology. A few people have written scripts and libraries that access this database to build a visualisation of an individual's academic genealogy. The best I've found is David Alber's Geneagrapher, which is written in Python. These scripts, however, only attempt to show an individual's direct ancestors and descendants, not any interesting academics that they may share a common ancestor with.
Including common ancestors in the genealogy is a lot more challenging. The number of individuals that have a shared common ancestor with a typical living academic is going to be huge, resulting in a lot of queries to the MGP and producing an unwieldy visualisation. Instead, we want some way of selecting a few interesting individuals to see if their ancestry can be connected to the person we're building a genealogy for (I'll call this person the focal academic for short) and then building the visualisation around that, possibly culling a few unwanted branches of the tree in the process.
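The core test, whether a seed shares an ancestor with the focal academic, can be sketched in a few lines. This is an illustration rather than the actual genealogy_miner.py logic: the function names are made up, and the supervisor map is typed in by hand (using the Turing/Hilton example from earlier) instead of being crawled from the MGP.

```python
# Map each academic to their supervisor(s). Hand-entered for illustration;
# the real script obtains this information from the MGP.
SUPERVISORS = {
    "Alan Turing": ["Alonzo Church"],
    "Alonzo Church": ["Oswald Veblen"],
    "Peter Hilton": ["J. H. C. Whitehead"],
    "J. H. C. Whitehead": ["Oswald Veblen"],
}


def ancestors(person, supervisors):
    """Return every academic reachable by following supervisor links upwards."""
    seen = set()
    stack = list(supervisors.get(person, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(supervisors.get(current, []))
    return seen


def common_ancestors(focal, seed, supervisors):
    """Academics who are ancestors of both the focal academic and the seed."""
    return ancestors(focal, supervisors) & ancestors(seed, supervisors)


print(common_ancestors("Alan Turing", "Peter Hilton", SUPERVISORS))
```

A seed with an empty result would simply be left out of the genealogy.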
The Script-GenealogyMiner directory contains a Python script, genealogy_miner.py, that crawls the MGP for a given focal academic, attempting to connect him/her to other academics. It's configured through another Python file (specified as a command line argument) that contains configuration options. I've provided an example, config_turing.py, with Alan Turing as the focal academic. To run this example, download the Script-GenealogyMiner directory and execute:
python genealogy_miner.py config_turing.py turing.dot
For demonstration purposes, the configuration only has a few seed academics (see SEED_ID_LIST). Seeds are academics the script will attempt to find shared ancestry with. Crawling can take a while, depending on the number of individuals to be crawled, the number of ancestors they have, and the response time of the MGP servers. The crawl with the 14 example seed academics should take less than four minutes.
The output is a plain-text dot file (turing.dot) describing the genealogy (as a list of nodes and edges, including some formatting instructions such as text and arrow colours) that can be imported into other applications (I used OmniGraffle) so you can do further design work. dot is a popular graph description format and is fairly well supported. If you have GraphViz installed on your system, you can have it generate a rendering via:
dot -T png turing.dot > turing.png
The GraphViz rendering isn't production-quality -- for the final graphic I imported the dot file into OmniGraffle -- but it's useful when you're tweaking the crawl configuration. It takes a bit of guesswork to determine which academics might be reachable from the focal node. The configuration file allows you to specify a few different features which you'll need to play around with, so having GraphViz on hand to do quick renderings of the resulting genealogy is useful.
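For readers who haven't seen the dot format, the following sketch shows roughly how a list of supervisor-student edges can be turned into dot text. It is not the output code of genealogy_miner.py; the to_dot function and its colours argument are invented for illustration, and the real file contains more formatting instructions than this.

```python
def to_dot(edges, colours=None):
    """Render (supervisor, student) pairs as a dot digraph string."""
    colours = colours or {}
    lines = ["digraph genealogy {"]
    # Declare each node once, with an optional text colour.
    for name in sorted({person for edge in edges for person in edge}):
        colour = colours.get(name, "black")
        lines.append(f'    "{name}" [fontcolor="{colour}"];')
    # One arrow per supervisor -> student relationship.
    for supervisor, student in edges:
        lines.append(f'    "{supervisor}" -> "{student}";')
    lines.append("}")
    return "\n".join(lines)


edges = [("Oswald Veblen", "Alonzo Church"), ("Alonzo Church", "Alan Turing")]
print(to_dot(edges, colours={"Alan Turing": "red"}))
```

Saving the printed text to a file gives something the dot command above can render.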
I should note that the script builds on Geneagrapher (already included in the Script-GenealogyMiner directory), which it uses to query the online MGP database.
Taking a look in config_turing.py shows how we can configure the crawler:
- The focal academic (ID_FOCAL_NODE): The MGP identifier of the academic for whom we are generating the genealogy. This is given in the Mathematics Genealogy Project URL for a particular academic; e.g., Turing's page ends in
- Prospective connections (SEED_ID_LIST): A list of individuals in the MGP, again identified by their MGP identifier. The script will try to find common ancestry between the focal node and these individuals. So, given the Turing example configuration, the script will look to see if Richard Feynman and Alan Turing have a common ancestor, and if so, it will include both their ancestries in the genealogy.
- Tree pruning (CULL_AND_ABOVE and ERASE_INDIVIDUAL): Including a particular academic can introduce a large ancestry and produce an ungainly genealogy. These two parameters (the cull list and the erasure list) allow us to prune the tree. Culling (CULL_AND_ABOVE) will remove an individual and his/her entire ancestry. Erasure (ERASE_INDIVIDUAL) will remove a particular individual but leave his/her ancestors untouched. Culling (as opposed to erasing) an individual will also insert an ellipsis above its child nodes to indicate that part of the tree was removed there.
- Colour scheme: Colour individuals in the genealogy based on their relationship with the focal academic. This includes colouring based on whether the individual shares a common ancestor, is a direct ancestor, is a direct descendant, and so on. Colour instructions are included in the output dot file; most applications (e.g., graphviz and OmniGraffle) should be able to interpret these instructions.
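The difference between culling and erasure is easier to see in code. The sketch below works on a plain dictionary mapping each academic to a list of supervisors, with placeholder names A to D; it is an illustration of the two behaviours described above, not the script's networkx-based implementation. In particular, it assumes that erasing a node simply drops it without reconnecting its students to its supervisors, and it leaves out the ellipsis marker.

```python
def ancestors(person, supervisors):
    """Everyone reachable by following supervisor links upwards."""
    seen = set()
    stack = list(supervisors.get(person, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(supervisors.get(current, []))
    return seen


def cull_and_above(supervisors, person):
    """Remove an individual together with his/her entire ancestry."""
    doomed = ancestors(person, supervisors) | {person}
    return {
        student: [s for s in sups if s not in doomed]
        for student, sups in supervisors.items()
        if student not in doomed
    }


def erase_individual(supervisors, person):
    """Remove only the individual; the ancestors stay in the graph."""
    return {
        student: [s for s in sups if s != person]
        for student, sups in supervisors.items()
        if student != person
    }


# Placeholder genealogy: A supervised B and D, and B supervised C.
tree = {"C": ["B"], "B": ["A"], "D": ["A"]}
print(cull_and_above(tree, "B"))    # A disappears too, so D loses its link
print(erase_individual(tree, "B"))  # A survives, still linked to D
```

Note that culling B also removes A from D's ancestry here, which is why a shared ancestor is worth checking for before culling.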
You'll likely need a bit of familiarity with Python syntax to get the most out of the script. Loading a Python module for configuration is a bit of a taboo, but is convenient enough for what this script needs to do. I've used networkx to make manipulating the genealogy (which is, more formally, a directed acyclic graph) simpler, since the script needs to handle joining and splitting of subgraphs, culling of disconnected components, and do some traversals for node colouring.
The script also takes a list of scientific prize winners. If any of these appear in the genealogy, they will be given a special colour, as per the colour scheme. Which prizes you wish to include is up to you. The Script-FindPrizeWinners directory contains a crude script that will compare the names of academics (which can be copied and pasted from a dot file for convenience) to prize winners and return any matches, so you can figure out who in your genealogy is a winner. I've included a few text files containing lists of winners (up to 2013) for various scientific prizes; namely, Abel, Cole, Fields, Turing, and Wolf prizes. It does fuzzy string matching, since the Wikipedia lists of winners (from where the names are sourced) might have slightly different spellings to those in the MGP, so it will likely produce false-positives -- please use as a starting point only.
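As an idea of what such fuzzy matching can look like, here is a small sketch using difflib from the standard library. It is not the code in Script-FindPrizeWinners; the function names, the 0.85 cutoff, and the example spellings are all assumptions chosen for illustration.

```python
import difflib
import unicodedata


def normalise(name):
    """Lower-case, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def find_prize_winners(genealogy_names, winner_names, cutoff=0.85):
    """Return (genealogy name, winner name) pairs whose spellings are close."""
    matches = []
    for name in genealogy_names:
        for winner in winner_names:
            score = difflib.SequenceMatcher(
                None, normalise(name), normalise(winner)
            ).ratio()
            if score >= cutoff:
                matches.append((name, winner))
    return matches


# Illustrative spellings only; real lists would be read from text files.
genealogy = ["Michael F. Atiyah", "Alan Turing"]
winners = ["Michael Atiyah", "Jean-Pierre Serre"]
print(find_prize_winners(genealogy, winners))
```

Lowering the cutoff catches more spelling variants at the cost of more false positives, so the results still need checking by hand.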
After a few iterations of generating a dot file, checking its graphviz rendering, and tweaking the original configuration (e.g., adding more seeds, culling unwanted subtrees, erasing some nodes, etc.) I went on to import the file into OmniGraffle to do a prettier design. For anyone that wants somewhere to start, I've included the final design for one of my PhD supervisors, Roger Whitaker (who, interestingly, connects to my other supervisor, Stuart Allen, through William Hopkins), in the Designs directory. It's in A-paper ratio (1:√2) but will need resizing to whatever print size is required. I had it printed on glossy A3 paper and put it in this John Lewis picture frame.
(N.b.: Kudos to Stuart for kicking off this idea by stumbling on one of the older MGP genealogy scripts.)
One of the toys that the Computing Club has to play with is an AR.Drone 1.0. This is a pre-built WiFi-enabled quadrocopter manufactured by Parrot. There are official iOS and Android applications for remotely controlling the quadrocopter. The AR.Drone also streams a live video feed from its onboard camera to the controller. Flying the drone around from an app is fun enough, but where things get really interesting for the Computing Club is programming it to do things! Over the last few months undergraduates have been tinkering with the drone, making it do various things using the open-source javadrone API.
Kirill Sidorov and I, organisers of the Computing Club this academic year, were asked to prepare a demo for an upcoming School of Computer Science & Informatics Open Day. The aim of these open days is to enthuse A-Level students who are considering study in Computer Science. We needed something that was interactive and fun, but also allowed us to highlight some of the concepts of computer science and what makes it interesting. We decided on a motion-tracking AR.Drone demo. We'd use the on-board camera to have the drone follow an individual holding a target. There's some neat computer science here – control and computer vision in particular – and it also demonstrates the power of using software to program real-world devices. Furthermore, it also meant we could build on the work done by Computing Club students and bring them in to chat to visitors at the Open Day.
Conveniently, a few days before the first Open Day (17 April) was the two-day "Open Sauce" Hackathon. Kirill and I were attending anyway to help with the student-organised event, so we took advantage of the fruitful combination of hackathon ambience, energy drinks, and free food to build the demo over those two days. The repository is hosted on GitHub. The original output from the Hackathon is in this branch (warning: gnarled, hackathon-quality code). This was tweaked and (slightly) refactored over the following days in preparation for the Open Day, resulting in this.
The target we used during the hackathon was a ping-pong paddle wrapped in an A4 sheet of paper coloured with pink highlighter. In hindsight, the lighting conditions of the venue were very consistent, making it a favourable test environment. Kirill prototyped some image-by-image video processing in MATLAB to extract the target, and then translated it to native Java. I handled the interaction with the AR.Drone and the control loop. We also implemented a fairly crude but useful GUI to view the raw and processed image streams, debug some control parameters, and initiate take-off and landing (emergency, typically). The javadrone API made controlling the drone straightforward, and even allowed us to implement some nifty features like changing the drone's LED colours when the target is lost.
The image component outputs the location (a pixel coordinate) and extent (a measure proportional to the target's size in view) of the target in the camera's view. This information is used to handle our three control variables:
- Forward/back tilt for moving forwards and backwards to maintain a particular distance from the target.
- Left/right rotation to keep the target horizontally centred.
- Vertical ascent/descent to keep the camera and target at the same height.
We didn't have much time to fully explore the handling of the drone with respect to these control variables, but experimenting with a few simple linear controllers and a PID or two resulted in decent tracking, as undergraduate George Sale demonstrates in this video:
(As shown in the video, as well as this other one, pretty much every flight ended up with a haywire drone and me initiating a forced landing.)
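To make the control side a little more concrete, here is a rough Python sketch of how the target's location and extent could be turned into errors for the three control variables, plus a textbook PID controller. The demo itself was written in Java and this is not its code: the names, sign conventions, frame size defaults, desired extent, and gains are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """A textbook PID controller; gains here are placeholders, not tuned."""
    kp: float
    ki: float = 0.0
    kd: float = 0.0
    _integral: float = 0.0
    _prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        self._integral += error * dt
        if self._prev_error is None:
            derivative = 0.0  # no previous sample on the first update
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


def clamp(value, low=-1.0, high=1.0):
    """Keep a command inside the range the drone is assumed to accept."""
    return max(low, min(high, value))


def tracking_errors(x, y, extent, width=320, height=240, desired_extent=0.05):
    """Turn the vision output into one normalised error per control variable.

    Assumed conventions: positive yaw error means the target is right of
    centre, positive climb error means it is above centre, and positive
    pitch error means it looks too small (too far away).
    """
    yaw_error = (x - width / 2) / (width / 2)
    climb_error = (height / 2 - y) / (height / 2)
    pitch_error = (desired_extent - extent) / desired_extent
    return yaw_error, climb_error, pitch_error


yaw_pid = PID(kp=2.0)
yaw_error, climb_error, pitch_error = tracking_errors(x=240, y=120, extent=0.05)
print(clamp(yaw_pid.update(yaw_error, dt=0.1)))
```

Each control variable would get its own controller instance, with the clamped output passed on to whatever command the drone API expects.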
That was the hackathon; the Open Day proved much more challenging. In our hackathon experiments, the specificity of our target detection was excellent. Specificity was our primary concern, since a false-positive target detection puts bystanders wearing unfortunately coloured clothing on the receiving end of multi-bladed drone fury. The Open Day venue had very uneven lighting, with patchy artificial lights, and a large window in one corner that would temporarily flood the camera depending on the drone's angle. This caused the colour profile of the paddle to change drastically depending on the angle of the drone, the location of the target, and the location of the drone.
To deal with this, our first trick was to change the target. Significant variation in light reflection between dimly lit and brightly lit areas meant large changes in the target's brightness and hue. By switching to a backlit target we could ensure fairly consistent brightness, irrespective of ambient light. Using a bike light, a home-made filter (highlighted A4 paper), a diffuser (coffee filter paper), and filter assembly (polystyrene cup), we hacked together the following target:
(Yes, we effectively built a cheap Playstation Move controller.)
The resulting target had very consistent and distinct appearance. After this there were just a few camera-related issues to tackle; in particular:
- Although the camera resolution is 640x480, the drone only streams 320x240 back to the laptop. Nothing much to say here, except it's surprising (802.11g is capable of the bandwidth and latency) and inconvenient.
- Either the camera hardware or drone firmware was doing some unwanted brightness auto-adjustment which we had to un-adjust back on the laptop.
- The lens quality is poor. We had to discard everything outside a centre 320px-wide circle to cull corner artefacts.
And, then, finally, we were left with a superb signal and negligible false-positive rate.
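The general shape of that processing (discard pixels outside a centred circle, keep pixels that look like the target, then report a location and extent) can be sketched as follows. This is a simplified pure-Python illustration, not Kirill's MATLAB or Java code: the colour thresholds are invented, the brightness correction is left out, and the extent is assumed to be the fraction of the frame covered by target pixels.

```python
import math


def is_target_colour(pixel):
    """Hypothetical threshold for a bright pink (r, g, b) pixel."""
    r, g, b = pixel
    return r > 200 and g < 120 and b > 120


def locate_target(frame, radius, matches=is_target_colour):
    """Return ((x, y), extent) for the target, or None if it is not visible.

    frame is a list of rows, each row a list of (r, g, b) tuples. Pixels
    further than radius from the image centre are ignored.
    """
    height, width = len(frame), len(frame[0])
    cx, cy = (width - 1) / 2, (height - 1) / 2
    xs, ys = [], []
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if math.hypot(x - cx, y - cy) > radius:
                continue  # outside the circle: likely a lens artefact
            if matches(pixel):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    location = (sum(xs) / len(xs), sum(ys) / len(ys))
    extent = len(xs) / (width * height)
    return location, extent


# Tiny synthetic frame: one pink pixel in a corner, one near the centre.
pink, black = (255, 60, 180), (0, 0, 0)
frame = [[black] * 10 for _ in range(10)]
frame[0][0] = pink
frame[5][5] = pink
print(locate_target(frame, radius=3))
```

For the drone's 320x240 stream, a radius of 160 would correspond to the 320px-wide circle mentioned above.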
The control still needs a lot of work, but the drone flies and reacts well. It's enjoyable watching people have a go at it. Initially people are very tentative. This is unsurprising; the drone's forward/back lunging can be vicious at first (although it usually stabilises before quite reaching the volunteer). After a few goes, they're eventually able to start taking it on tours around the demo area, almost like walking a dog; albeit a dog that is noisier, less well behaved, and hovering in mid air.
Last weekend's "Open Sauce" Hackathon was a big success. In addition to the funding I mentioned in my previous post, GitHub also got in touch a day before the event to bolster each prize category with one-year bronze and silver accounts.
There's a write-up and more photos at the CSCF website, so please navigate there for more information. I'll also maintain a list of other individuals' posts below.
While at the event Kirill Sidorov and I, Computer Club co-organisers, also took the opportunity to write the software for a motion-tracking quadrocopter demo we'd been asked to prepare for the School's upcoming Open Day. Write-up to follow on this blog.
Thanks to all the judges, undergraduate organisers, sponsors, and attendees for making it a great event!
Last year I attended the inaugural School of Computer Science & Informatics "Open Sauce" Hackathon as a participant. It was a hugely successful event, and good fun to work with Mark and Chris in building Motion Kitty Pi, a prototype Spotify home music streaming service for Raspberry Pi (with motion-triggered playback!). Not only was the event a success, it was superbly organised by undergraduates in the School of Computer Science & Informatics's Computer Club. Click here for a report on last year's event.
Now being a lecturer and co-runner of the Computer Club I get to assist the undergraduates in organising this year's Hackathon, and it's shaping up to be even better than last year's! They've done an excellent job of organising and promoting the event, with over 40 attendees already registered. Among these are undergraduate students from Cardiff University and other institutions, PhD students, staff members, and local professionals.
As with last year, the School is supporting the event with facilities and a contribution to the prize fund. What makes this year even more impressive is the amount of external sponsorship the students have secured. Box UK are very kindly providing the ever-important food and (energy) drinks for the two-day event, and a total of £500 worth of prizes is being contributed by Linode, DigiStump, and eysys. On top of that, John Greenaway (Cardiff University Information Services), Richard Gaywood, Stuart Allen (Cardiff University School of CS&I), and Humphrey Sheil (eysys) will be on-hand to judge the final projects.
So: free food, free drink, big prizes, and, importantly, building something cool with friends. What more could you want in an event? Get more information or sign up if you haven't! And well done to Joe, Henry, Geraint, James, and all the organisers!
I'm co-organising the "DigiSocial" hackathon – an exciting attempt to adapt the hackathon format to scientific research and bring together a number of Cardiff University schools, including Social Sciences and Computer Science & Informatics, to stimulate interdisciplinary research. The event is targeted at postgraduate researchers at Cardiff University and is receiving primary funding from the Cardiff University Graduate College's (UGC) postgraduate interdisciplinary initiative, with additional support from the School of Social Sciences (SocSci) and School of Computer Science & Informatics (CS&I).
Our motivation for organising this hackathon is to bring together the complementary skills of Social Scientists and Computer Scientists to carry out research that lies at the intersection of these two disciplines. We do not restrict ourselves to participants from only these two schools, however, as there are many researchers from other schools, such as Psychology and Journalism, that have an interest in this area of research.
The general workflow we anticipate – formulation, experiment, and analysis – requires a slightly different format to a typical hackathon. We want to allow the option for teams to collect new data from their own novel experiments, which may require more time than is available at a typical one- or two-day hackathon. Thus, we're splitting the hackathon over two weekends, separated by two weeks. On the first weekend the attendees form interdisciplinary teams and devise projects to work on. If this collaboration were a TCP connection, this would be the establish phase. Most of the implementation happens over these first two days.
Then, the interdisciplinary connection is maintained for the intervening 10 days, with the aim of teams collecting their data. The data collection method depends on the research question, and could, for example, be a publicly accessible web experiment. The teardown phase on the final weekend is for teams to analyse and evaluate their data and experiments, and then present their idea and findings on the final afternoon.
The establish weekend is 15th - 16th (Sept), and the teardown weekend is 29th - 30th (Sept). More information and a registration form can be found at the hackathon web page. Will Webberley and Wil Chivers are the lead representatives from CS&I and SocSci (respectively), and Chris Gwilliams and I are co-organising from the CS&I side.
Guest lecture given to "Problem Solving with Python" first-year undergraduate students. [Slides PDF 1MB]
I spent last Sunday at Box UK's "For the Social Good" hackathon. It was a very successful event and big thanks to Box UK for putting it on and providing a venue, food, and prizes. Over five teams hacked together apps on the broad theme of "social good" (something of benefit to the local community) in eight hours. Check out this post on Box UK's blog for more information.
Mark Greenwood, Martin Chorley, and I formed the "Cardiff University PhD students" team and built 'Gritly', a winter road condition maps mashup (more information below). Cardiff University did very well at the event, with Computer Science undergraduates winning the runner-up team and individual hacker prizes, and our own Gritly winning the first prize! It was great to see the apps everyone had built. Here are a few write-ups from elsewhere:
- John Greenaway was one of the judges and covers the whole event.
- Henry Hoggard talks about his team's app, 'NoteSlide'.
- Craig Marvelley on building 'Explore Cardiff'.
Mark, Martin, and I decided to put a team in and spent an hour last week brainstorming some ideas for a project. We'd decided on using a data.gov.uk dataset in some way and needed an idea for something that could benefit the local community. I'd had the phrase "Winter is Coming" rattling around in my head for a while because, well, Winter is coming. Also, and more relevantly, it's the time of year where UK news outlets do the usual winter weather doomforecasting. (OK, Game of Thrones too.) From this we edged towards the idea of trying to manage winter weather road hazards (i.e., ice and snow) by either notifying the council of areas that need more gritting (hence the name 'Gritly') or at least warning drivers of roads that may be hazardous during very cold spells.
We found a very fine-grained data.gov.uk dataset of a few years' road traffic accidents which includes information on the road and weather conditions at time of incident, and so decided on using this to build a Google Maps mashup that would plot the locations of road accidents where ice or snow were factors. The dataset also provides the date and severity of an incident (from 'low' up to 'fatal'), which the webapp can display. Since the data.gov.uk data on accidents is only provided annually we wanted to also include a realtime component. For this we took inspiration from the #UKSnow twitter mashup and decided that people could submit current road hazards by tweeting a post code and warning along with the hashtag #UKIce.
On the day, Mark handled the Twitter realtime component, Martin designed and implemented the web frontend, and I built the backend and data endpoints and handled deployment. I also did minor processing on the accidents dataset to extract the cold-weather related events and prepare it to be served. The code is available on GitHub here. The backend is written in Python and runs on Django. The realtime component uses tweepy to grab the relevant tweets and geopy to geocode the post codes. The front end is bootstrap, Google's maps JS library, and a bit of jQuery.
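As an illustration of the realtime idea, the sketch below pulls a post code and a warning out of the text of a tweet tagged #UKIce. It is not the Gritly code: it does not call tweepy or geopy, the function name and return format are made up, the post code pattern is a deliberately simplified approximation of UK post codes, and the example tweet is invented.

```python
import re

HASHTAG = re.compile(r"#ukice\b", re.IGNORECASE)
# Simplified UK post code pattern (an approximation, not a full validator).
POSTCODE = re.compile(
    r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b", re.IGNORECASE
)


def parse_hazard_tweet(text):
    """Return {'postcode', 'warning'} for a #UKIce tweet, else None."""
    if not HASHTAG.search(text):
        return None
    match = POSTCODE.search(text)
    if match is None:
        return None
    postcode = f"{match.group(1)} {match.group(2)}".upper()
    # Whatever is left after removing the post code and hashtag is the warning.
    remainder = HASHTAG.sub("", POSTCODE.sub("", text, count=1))
    warning = " ".join(remainder.split())
    return {"postcode": postcode, "warning": warning}


print(parse_hazard_tweet("Black ice on the hill CF24 4AG #UKIce"))
```

The extracted post code is the part that would then be handed to a geocoder to get map coordinates.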
You can visit Gritly at http://gritly.nomovingparts.net/.
It was great fun to build and we managed the project very well, finishing with a relieving 30 minutes to spare. We were also pleased to take the first-place group prize at the event -- thanks to Box UK for the Amazon vouchers!
Having attended and organised a few hackathons recently, I've been impressed by their ability to glue together the local tech community, university students, and local companies. Hackathons work best when they have some sponsorship behind them; this lends some status to the event and, of course, cash to incentivise participants with food and prizes. As a company, you may be approached by a third party organising a hackathon. Here are some tips on how to get the most out of the event. You may even wish to organise a public hackathon yourself; the same applies.
Here's what you do as an employer. First do your due diligence -- check the organisers are legit, how many people they can genuinely pull in, whether they've secured an appropriate venue, and so on. If all is well, you then put some money into the hackathon to support it in some way. This could be covering the catering (e.g., lunch on each day), sponsoring a prize, or simply handing over an amount for the organisers to use as they want. (A note to newbie organisers here: sponsors will really appreciate a mention on the web page for your event. If you don't have a web page yet, make one! It's a good idea to ask how they want their name and/or logo to appear.)
Go to the event and bring a few employees; if possible, developers would be best! Put a few flyers for your company around the venue. See if you can even convince some employees to get stuck in and participate. For those not participating: interact with the groups, find out what they're doing, how they're doing it, what technologies they're familiar with, and so on. If you're sponsoring a prize, then you'll likely also be acting as a judge. This gives you yet more opportunities to quiz the teams and find out more about the members, and a chance to highlight their achievements at the end.
It's important to not make it a recruiting drive. The participants' primary focus is their projects; they'll likely become alienated by heavy-handed recruiting. They already know you're a company and possibly looking to hire some talented developers, so no need to remind them of it.
At the end-of-event round-up, briefly mention your company, what it does, and how/where to find out about job opportunities there. Being a sponsor of a prize helps here since you'll naturally have the floor for the moment while you announce the winners.
There are many benefits to sponsoring a hackathon, for both company and participant. First, you get a feel for the prospective employees. You've seen them working on a practical project over two days. You have an impression of their personality, how they work in a group, and their experience with various technologies. In fact, at the end of the event, for some of the participants you'll already have answers to the behavioural questions that might be asked during an interview. This is especially true if you (or your employees) participated in a few teams.
Your company's profile will be raised, and with key people in the local tech community. Typically, hackathon participants are very active in tech. They keep up with trends, they love Twitter, and, importantly for you, are well connected in the community. Even if you don't get a direct hire out of the hackathon, word will spread among the acquaintances and friends of participants.
Participants will (hopefully) leave the event with a positive, lasting impression of your company, your employees, and the culture within your company. Traditional advertising will not buy you that level of engagement. (Participants will also, of course, be very grateful for the free food and/or prizes.) The organisers will be very appreciative, and will gladly mention your contribution to the event.