Since my arrival at the ONS 6 months ago, I have received a good deal of training on the sort of stuff you’d expect at a big, responsible organisation – anti-bullying and harassment training, equality and diversity training, health and safety, responsibility with information, and a raft of ‘digital awareness’ modules. I think this training is of real value, but a lot of people might see the legal requirement for it as an undue burden on business, especially for small and medium-sized enterprises (SMEs). I would argue that government bodies like the ONS can maximise the value of their work and reduce some of the perceived burden on SMEs by applying Open Data philosophy to all their resources, pushing beyond the common misunderstanding that ‘open data’ is just the information that can be found in spreadsheets.
Open Data Day 2016

Open Data Day has come around again! Hundreds of events with thousands of attendees took place across 6 continents – what a community of developers, hackers, data wranglers and designers there is out there: talk about the Digital Revolution! I was lucky enough to attend the London event and take part in an excellent project on the gender of London’s street names.

The project was all the more interesting because it was based on another project by hackers from Montevideo in Uruguay. They had collected their city’s street names from Open Data sources and then used a system called Genderize and a lot of manual curation to identify all the streets named after women. They’d then plotted this on a map on their project site, A-tu-nombre.

We decided to do the same thing for London. It was interesting to see how differently we approached the same project. Our assumption was that it was intended to highlight gender disparity, so we were concerned with plotting men vs women on our map. However, a big part of the focus in Uruguay had been to highlight the women and link to their Wikipedia pages so people could learn more about them: an approach that is more about cultural history and a bit less adversarial.

Other differences became obvious in the challenge itself. For example, street naming in Montevideo often uses the full name of the person, whereas in the UK we tend to use a surname or title, which makes it much harder to automate the identification. This meant we didn’t bother with automated links to Wikipedia and just stuck with the war of the sexes (see how that looked at the end).
This is user engagement

Getting stuck in at a hackathon was a great way to build relationships with developers and Open Data users who wouldn’t normally fall into our ‘User Experience’ surveys and seminars, as well as with obvious groups like Open Knowledge. I was impressed to be working alongside local council employees and, after discovering they have lots of opinions on ONS Open Data, I’ll be going to visit them to hear the experience of their whole team.

Another exciting hookup was with Data Campfire who are prototyping a platform that lets data users promote their projects and link to the publishers of the data they’ve used. It’ll be so much easier to learn from our wider data users if we can get a ping from that platform whenever someone posts a new use of our data.

Perhaps the best linkup was with the original team from Uruguay who were at their own Open Data Day event and happy to give us pointers and encouragement over the course of the event. Open Data is global and it’s great to have the opportunity to engage with potential users on another continent.
For anyone who’s thinking, ‘but I don’t have the skills to go to one of these things’, I can report that it was a hugely diverse group with bloggers, designers, journalists and activists alongside the obvious programmers and data geeks. You can definitely join in and contribute at an event like this.
See our project
Open Source goes hand in hand with Open Data so check out the gender assignment code over at GitHub. Or check out the CartoDB map of London’s streets with gender.

You might notice that although it’s ‘quite good’, it’s not perfect. Long Acre is considered female for example and we had to manually intervene to stop all the lanes being genderised because Lane is a legitimate name. However, there is a reason the Open mantra is ‘release early, release often’. Rather than sit on the project until the system is perfect – many, many months from now – we can post our code and share our ideas and hopefully inspire the community straight away, just as we were inspired by the team in Uruguay.
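To give a flavour of why these slips happen, here is a minimal Python sketch of the kind of logic involved. It is not the code from our GitHub repository: the name table, the list of street types and the manual override list are all made up for illustration, and a real run would query a service like Genderize rather than a hard-coded dictionary.

```python
# Hypothetical stand-in for a name-to-gender service such as Genderize.
# The entries are illustrative only.
NAME_GENDER = {
    "victoria": "female",
    "elizabeth": "female",
    "lane": "female",  # 'Lane' is a legitimate given name, hence the false positives
    "albert": "male",
    "george": "male",
}

# Street-type words that should not be read as a person's name
# when they appear at the end of a street name (assumed list).
STREET_TYPES = {"street", "road", "lane", "avenue", "square", "place", "way"}

# Manual curation: streets we know the automated guess gets wrong.
# None means 'no gender assigned'.
MANUAL_OVERRIDES = {"long acre": None}


def guess_street_gender(street_name):
    """Return 'female', 'male' or None for a street name."""
    key = street_name.strip().lower()

    # Manual overrides win over any automated guess.
    if key in MANUAL_OVERRIDES:
        return MANUAL_OVERRIDES[key]

    words = key.split()
    # Drop a trailing street-type word so the 'Lane' in 'Mill Lane'
    # is not looked up as a given name.
    if len(words) > 1 and words[-1] in STREET_TYPES:
        words = words[:-1]

    # Return the gender of the first word found in the name table.
    for word in words:
        gender = NAME_GENDER.get(word)
        if gender is not None:
            return gender
    return None


if __name__ == "__main__":
    for street in ["Victoria Street", "Albert Road", "Mill Lane", "Long Acre"]:
        print(street, "->", guess_street_gender(street))
```

Even in a toy version like this you can see the trade-off: every rule that removes one false positive is a bit more manual curation, which is exactly why it pays to share early and let others improve on it.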

Update: Gregor Boyd over at the Data Donkey blog has copied/extended this project for Edinburgh’s streets using a different data source and a different mapping system – check out Edinburgh streets by gender too. If you repeat/extend this project for your neighbourhood, please do comment to let us know!

Open Data is the new oil that fuels society

If you want to be followed on social media, just add #bigdata or #datascience to your posts. These are the buzzwords of the century so far and with the aid of geek chic have brought computer tech to its greatest prominence since #dotcombubble. We’re all going to get rich off Big Data or so the story goes – data is the ‘new oil’ (or information is the new oil) and Data Scientist is the ‘sexiest job of the 21st Century’. These ideas have been endlessly rebutted and reinforced over the last couple of years but regardless of how much might be hype, data is definitely the big thing of the moment.

Arguably the poor cousin of Big Data is Open Data. This is probably because venture capitalists hate the idea of just giving away their IP, USP and other acronyms, but also possibly because, outside of the nascent hacker communities, not many people get too excited about having machine-readable access to bus timetables or waste management data.
And yet, Open Data has been getting a lot of loving attention from governments, especially in the aftermath of the global financial crash and the ubiquitous drive to cut costs via efficiency savings and perhaps even increase economic returns from government assets.
This government-sponsored open movement is incredibly timely and important, in part because the Open Source and Open Data movements are really priming the pump of the Data Science industry (or Digital Economy), but also because it offers to increase public trust in government, something that appears a lot in the UK Statistics Authority’s Code of Practice. It also promises more globally linked-up monitoring, evaluation and strategising, which is surely required for tackling global social challenges like climate change, food security and our ageing populations.
The UK has been at the forefront of Open Data for a few years, only just being pipped to pole position by Taiwan in this year’s Open Data Index. The Office for National Statistics has been leading that charge and currently has 1213 datasets and over 20 thousand reference tables available via the ONS website – and yet there is so much further to go in opening up and unleashing the full potential of Open Data for “UK Plc” and our society.
I have been in my new role as Open Data Lead at the ONS for 3 weeks so far. It’s still early days but I’ve been excited to see the developments underway – with a new website almost finished beta testing, an API that’s also in late stage beta, and a pilot project for a Linked Data portal/API just kicking off (watch this space).

A big part of my role is community engagement and advocacy and I’ll be hoping to create a dialogue both on this blog and on social media (@bobbledavidson) on how the ONS should be pushing forward with Open Data. What data needs to be released? What format is best? I want to hear from you.

If data is the new oil, Open Data is the oil that fuels society and we need all hands at the pump.