NHS Hackday and the ePortfolio Data Liberation Front

I spent this weekend in Liverpool at NHS Hackday. I had no idea what to expect. I had never met anyone there before and only knew a few names from Twitter and Google Groups conversations in the weeks running up to the Hackday. I wasn’t completely sure I knew what a Hackday was.

I was astounded. 

I spend a lot of my life getting frustrated by the slow pace of change and the massive inefficiencies in the way that we work. I want to be freed up to spend time teaching, learning, writing, thinking, talking to patients and providing care. I hate unnecessary paperwork and bureaucracy. I hate meetings that don’t achieve anything.

NHS Hackday was a breath of fresh air. A diverse group of people with totally different backgrounds, most of whom had never met, got together, discussed problems and solved them. In a weekend!

I will describe here what happened at the Hackday, what our project “The NHS ePortfolio Data Liberation Front” achieved and why it won 2nd place. There is far more info about how it is run, by whom, and why on the NHS Hackday site. You can also see an interview with Carl (from OpenHealthcare) on YouTube.

What follows are my personal impressions.


On Saturday morning, whilst people were registering and getting coffee, all those with ideas for projects wrote them on a board. Everyone gathered in the main hall and each idea had 2 minutes to pitch. After all the pitches, people gathered around signs indicating each idea and formed groups. Then the work began. Groups discussed their vision, their proposed solution, and thrashed out conceptual and technical details. Fuelled by enthusiasm, tea, coffee and Wotsits, software developers created things out of thin air (OK, out of data and code, blood, sweat and tears). Health professionals like me, who couldn’t code, were on hand to give context to the projects and point out real-world hurdles, which could then be worked around.

The NHS ePortfolio Data Liberation Front

Our group consisted of me (full of ideas, no understanding of code), Nicholas Tollervey, a developer (a genius with lots of patience and an incredible ability to work round every problem the project presented him with), and Marcus Baw (a GP who can code a bit and is an RCGP Health Informatics Group member, who was a great bridge and fount of knowledge on NHS informatics issues).

We discussed some of the problems with the current NHS ePortfolio and possible workarounds. Since the code is not open and there is no API this was no simple problem.

We discussed the urgent need for an app to make trainees’ and trainers’ lives easier, and make WPBAs educationally valid. Any app would have to be able to get data into the ePortfolio so that a WPBA showed up not just in the personal library section as any random document, but in the WPBA section. With no access to the code and no API this would be a great challenge.

We decided to focus on the fact that my data is locked in a vault in my ePortfolio. Whilst it is in there I can do nothing with it.

I want to liberate it, as I could then do anything I want with it! Ideas include:

  • visualise my achievements and progression
  • present the data in a way that my supervisor can see, understand and give feedback on
  • present the data in a way that makes it clear I have achieved all the competencies required by the JRCPTB for ARCPs and CCT
  • integrate the data into my CV, my online CV, an alternative ePortfolio (mahara, Googlios etc), use it for job applications
  • allow me to take the data with me into another role (progression or change of career path) eg Foundation Trainee –> Emergency medicine ACCS trainee –> GP trainee –> GP (all use different ePortfolio systems)

Not only is there a practical need for this, but the more we talked about it the more I realised that this is bigger than practicalities. It’s a philosophical argument. It’s my data. About me. I want it liberated. I can already download a PDF so clearly no-one disputes the fact that the data is mine and I have a right to it, but a PDF is useless.

@ntoll worked incredibly hard (with breaks for coffee, sandwiches, a trip to the pub and a curry house), came up against many problems and found ways around them all. We modified our plan as we went along, and decided that the best use of our time would be to do a ‘proof of concept’ and focus on a particular data set within the ePortfolio (there’s a lot of data in there, and it’s not organised as logically as you might imagine!). By the time we reached the submission deadline of 12.00 on Sunday we had something to show for our efforts. @ntoll made some finishing touches and we put together a brief presentation.

All 15 projects that had been selected from the pitches presented (a strict 5 minutes, plus 1 minute for questions) to a panel of judges including: @MarkPriceDavies (chair), Ian Gilmore, Dr Farath Arshad, Zeinab Abdi, Francis Irving @frabcus, Dan Lynch @MethodDan, and Lilian Wiles. They deliberated and at 17.00 announced the winners.

The Other Projects

You can see more details of the projects on the NHS Hackday site, and get all the code through the wiki and on github, since all projects are open and shared. There were lots of fantastic projects but those that particularly caught my attention were:

  • AskIt (a general purpose question-asking Android app for any questionnaire you need – Waterlow, MUST score, falls assessment etc. Simple, effective, important!)
  • Making sense of patient comments (data visualisation from sources such as NHS Choices – massive potential applications)
  • CoIncidence Gate: a Conflict of Interest tool (scraped data from conflict of interest statements on Pubmed – something like 480,000 papers analysed!! Again, follow the link for more discussion on the massive potential applications of this project)
  • BleepBleep (making in-hospital communication better. An end to having to call switchboard. An end to the bleep! I trialled this, and am keen to help get it into hospitals now! Stop wasting time on hold)
  • GAAG: Guidelines at a Glance (there are well-studied barriers to doctors using guidelines, meaning patients don’t get best care. GAAG provides quick access to personalised most-used bits of guidelines on an app. Lots of potential for social add-ons, highlighting when guidelines change, seeing what peers use, rating bits of guidelines. See presentation for more info. Can’t wait to use it!)
  • Bloodcount (haematologists sit at very advanced microscopes counting different normal/abnormal cell types using very un-advanced technology = clicker and pen and paper. Bloodcount is a desktop system of a counter with keyboard shortcuts, reference normal and abnormal cells, report generation and learning function. Hard to describe to do it justice. A worthy winner!)
  • wtfdoc (an NHS jargon buster for patients and relatives as an app. Has a database, and if a term is unknown it will crowdsource an answer through Twitter and other sources. V clever!)

Why I think We Won a Prize

Our project won the First Scraperwiki prize for scraping, and came joint second overall on the day. I think the reasons we won are multiple:

  • @ntoll achieved amazing things writing novel code to scrape data out of a closed system and generate a .json file of hierarchical data that could then be used. In just a day and a half this was some achievement!
  • our pitch was powerful as this is an issue for all doctors of all specialities at all levels, especially with revalidation now a reality. Facilitating learning for healthcare professionals is in all our interests as a society
  • the concept of data liberation goes beyond this project. Who owns the data in public databases? Who owns the data in the NHS? What right does an individual have to their own data? What right does an institution have to keep it from them?
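To give a flavour of why a .json file of hierarchical data is so much more useful than a PDF, here is a minimal sketch in Python. The layout and field names (`posts`, `assessments`, `type`) and the sample entries are invented for illustration; they are assumptions, not the structure the scraper actually produced. The point is only that once the data is in this form, a few lines of code can answer a question like “how many of each assessment have I done?”.

```python
import json
from collections import Counter

# Hypothetical export: the field names and entries below are made up for
# illustration and do not reflect the real scraper's output.
EXPORT = """
{
  "trainee": "Dr Example",
  "posts": [
    {"title": "Post A", "assessments": [
        {"type": "mini-CEX", "date": "2011-10-03"},
        {"type": "CbD", "date": "2011-11-21"}]},
    {"title": "Post B", "assessments": [
        {"type": "mini-CEX", "date": "2012-02-14"},
        {"type": "DOPS", "date": "2012-05-30"},
        {"type": "mini-CEX", "date": "2012-06-12"}]}
  ]
}
"""


def count_assessments(portfolio):
    """Walk the hierarchy (posts -> assessments) and tally each assessment type."""
    counts = Counter()
    for post in portfolio["posts"]:
        for assessment in post["assessments"]:
            counts[assessment["type"]] += 1
    return counts


# json.loads turns the text into ordinary Python dictionaries and lists.
portfolio = json.loads(EXPORT)
counts = count_assessments(portfolio)
for kind, number in sorted(counts.items()):
    print(f"{kind}: {number}")
```

The same tally could feed a chart for a supervisor or a checklist of required competencies, which is the kind of reuse a PDF makes impractical.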

What next

I owe a huge thank you to the organisers, supporters, volunteers and participants at NHS Hackday Liverpool 2012. And a special thanks to Scraperwiki for providing prizes including my beautiful new Google Nexus 7! This weekend I saw innovation in action, providing real, practical solutions to the day-to-day problems facing those who work in and use the NHS. Some of these solutions are now in use – today! Others will be worked on outside the Hackdays or at the next one. I have had my mind opened to new ways of working and have returned to work today full of enthusiasm and inspiration.

There’s no going back now. I’m a doctor who loves geeks who love the NHS, and I have the T-shirt to prove it. 


17 responses to “NHS Hackday and the ePortfolio Data Liberation Front”

  1. Great stuff LJ! For someone with a self professed lack of IT/ technical knowledge you are already talking python, JSON 🙂 . Thank you for capturing the spirit of the event so well. @wai2k

  2. Pingback: Thinking about hackdays « e-LiME

  3. JSON is not much use because it’s unstructured data. The NHS ePortfolio does have an API in Leap2A format; it is not public yet but it may be in the future. The Leap2A format is a nice idea but I’m not convinced it will be very useful in real world scenarios because it is only an Atom feed. It provides hyperlinks to documents, and these are not accessible unless you are logged into the eportfolio site as the user that owns those documents. The examples are pretty basic http://www.leapspecs.org/2a/examples

    • Are the portfolio team convinced Leap2A won’t be useful? If so, why would they use it? If they do think it is useful, what is the justification for not building something simpler and vaguely RESTful? A REST-ish API and some schema would be much more useful.

  4. Hi Ben,

    I beg to differ. JSON *is* structured data. As http://json.org (the website for the specification) states,

    “JSON is built on two structures: 1) A collection of name/value pairs, 2) An ordered list of values. These are universal data structures. Virtually all modern programming languages support them in one form or another. It makes sense that a data format that is interchangeable with programming languages also be based on these structures.”

    In what way is this unstructured..?

    Also, practically *every* programmer worth their salt knows and understands JSON. To my naive eyes, “leap” looks like a domain specific XML dialect that’s about as intuitive as a 747 flight deck but with all the buttons labelled in Sumerian cuneiform. As a result, using this as the basis of an API is, perhaps, not a wise idea. I realise it might be the “industry standard” but it certainly isn’t the most widespread nor easy to use structured data format (I believe that crown belongs to JSON).

    Please, please, please, please, please, please, please, please, please implement a JSON based RESTful API first. JSON is a ubiquitous standard and REST provides a clear and simple route to implementation especially if it’s exposed via HTTP.

    In the future, I look forward to helping Doctors liberate their data with your cooperation via a well documented and simple API.

  5. I thought Ben meant that the portfolio was unstructured data… not JSON.

    But I’m not a techie!

  6. @ntoll JSON and Leap2A are different things. Leap2A is a specification for applying semantic markup to ePortfolios. JSON is just a way of representing objects. JSON has no semantics.

    JSON is a bad choice for interoperability. If you want to export data from NHS ePortfolio and import it somewhere else then you need a data specification that both systems understand, Leap2A is the most obvious choice with support from a wide range of ePortfolio systems including PebblePad, Moodle, Mahara etc.

    If NHS ePortfolio users just want to export data for their CV then surely Microsoft Word would be more useful (pretty easy to do).

    If they want to export their data into another ePortfolio system then Leap2A is definitely the way to go.


  7. The motivation for this was described as:

    * visualise my achievements and progression
    * present the data in a way that my supervisor can see, understand and give feedback on
    * present the data in a way that makes it clear I have achieved all the competencies required by the JRCPTB for ARCPs and CCT
    * integrate the data into my CV, my online CV, an alternative ePortfolio (mahara, Googlios etc), use it for job applications
    * allow me to take the data with me into another role (progression or change of career path) eg Foundation Trainee –> Emergency medicine ACCS trainee –> GP trainee –> GP (all use different ePortfolio systems)

    So if we had two new download options:

    Export to Word
    Export to Leap2A

    Would these solve these problems?

  8. Directly comparing Leap2A to JSON is like comparing the English language to the Greek alphabet. As an approximation, Leap2A is to XML as the English language is to the Roman alphabet; JSON is like another “alphabet” and not a “language”.

    (For anyone unfamiliar, JSON data does not have an intrinsic schema the way an XML document does; however if you return JSON with arbitrary undeclared structure, you’re probably doing something sick and wrong. It basically comes down to differing ways and locations of writing schema. Now if you want to talk _Leap2A_ vs “however Nicholas chose to structure his JSON data”, then sure…)

    Fundamentally when data is freely and readily accessible, people can do awesome things with it.

    When data is needlessly locked away, it sucks.

    It sounds like there is no official supported API for getting data out of the ePortfolio easily. This sucks.

    Lots of people seem to think it would be great if there was one. I agree. People could do awesome things with it.

    Now, we want this API to have a decent easy to use format. Ben, you seemed to say at the start that you aren’t convinced that Leap2A was the right thing for this (incidentally I’d be very interested to see a reply to Ross’s comment at https://nhseportfoliorevolution.wordpress.com/2012/09/24/nhs-hackday-and-the-eportfolio-data-liberation-front/comment-page-1/#comment-163)

    If there were a decent official Leap2A API and several people wanted a JSON API, then hey, it doesn’t look terribly hard to write a Leap2A -> JSON converter. This is all moot if no official API exists, which sounds like the current situation.
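As a rough illustration of how small such a converter could be, here is a minimal Python sketch. It relies only on the point made earlier in this thread that Leap2A is an Atom feed, and it handles only plain Atom elements (`id`, `title`, `updated`, `link`). The sample feed is invented, and the Leap2A-specific elements and types defined by the real specification are not handled here.

```python
import json
import xml.etree.ElementTree as ET

# The standard Atom namespace, written in ElementTree's {uri}tag form.
ATOM = "{http://www.w3.org/2005/Atom}"

# Invented sample feed: plain Atom only, not a real Leap2A export.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example portfolio</title>
  <entry>
    <id>urn:example:1</id>
    <title>Case-based discussion</title>
    <updated>2012-09-01T10:00:00Z</updated>
    <link rel="related" href="https://example.org/docs/1"/>
  </entry>
  <entry>
    <id>urn:example:2</id>
    <title>Reflective note</title>
    <updated>2012-09-15T09:30:00Z</updated>
  </entry>
</feed>
"""


def atom_to_dict(xml_text):
    """Convert an Atom feed into plain dictionaries and lists."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall(ATOM + "entry"):
        entries.append({
            "id": entry.findtext(ATOM + "id"),
            "title": entry.findtext(ATOM + "title"),
            "updated": entry.findtext(ATOM + "updated"),
            # Collect every hyperlink attached to the entry (may be empty).
            "links": [link.get("href") for link in entry.findall(ATOM + "link")],
        })
    return {"title": root.findtext(ATOM + "title"), "entries": entries}


converted = atom_to_dict(SAMPLE_FEED)
print(json.dumps(converted, indent=2))
```

Note that the sketch only copies the hyperlinks across; if the linked documents still need a logged-in session, the converter does nothing to free them.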

    Now if this Leap2A API just “provides hyperlinks to documents, and these are not accessible unless you are logged into the eportfolio site as the user that owns those documents” (from Ben’s comment), where, by the sounds of it, most of the information of value is in these documents, well, that doesn’t sound like much of an API, whatever format it uses. (I’m assuming that statement means “documents must be downloaded manually and not via API”)

    Nicholas seems to have done an excellent job building what I like to call a “scrAPI” (scraper-API, geddit?) but obviously that’s not ideal. An official “proper” API would seem best.

    Freeing the data seems the bottleneck. Can we fix that please?

    (Yes personally I’d choose a lightweight JSON API, obviously with clear documentation and structure, as all good JSON APIs have. I’ll take a decent well-structured XML-y API over nothing at all)

  9. Hi Ben,

    There seems to be some confusion. JSON is an *excellent* choice for interoperability and, yes, JSON does have semantics by virtue of the naming of fields and structure of the data. It does not, however, have a top-down specified semantic “ontology” in the way that LEAP may have. Think about this for a second, JSON allows *you*, the developer, to organise and represent the data in a schema that is most appropriate for you. Furthermore, JSON is so widespread a format (whereas LEAP is a specialist XML dialect) that you’d be foolish not to take advantage of the ubiquity of the format.

    You talk about interoperability but appear to assume that your users will only want to “interoperate” with other ePortfolio systems such as yours. Please don’t assume anything about how your users want to use their data. As LJ mentioned in the article above, she’d like to incorporate aspects of her ePortfolio into, say, a CV hosted on Google. JSON is far more friendly for humans to comprehend than the verbose, often incomprehensible and complicated tag soup that is XML. Please reconsider and choose an open, easy to read, simple to adapt and ubiquitous data format that gives your users the freedom to work with their data. While I can see why you think LEAP is the way to go from your perspective within the ePortfolio industry I think you’re missing the point of “data liberation” and the reasons for using open, easy to use formats like JSON.

    Finally, are you trying to make a joke by suggesting users should be able to download their data in Word format..? I suppose LEAP is machine parseable.

    In conclusion, given a choice of an easy to read, open and ubiquitous data format or a complicated, hard to read and “limited to the ePortfolio world” format which one do you think your users would choose..?

  10. There are two different things here, an API (for programmers) and a “Download my ePortfolio” option for Doctors.

    Doctors will not download their eportfolio as JSON because they have no idea what JSON is. If a Doctor downloads a JSON file, their computer will not know how to open that file. JSON is also not compatible with word processors like Microsoft Word or Google Docs and it’s not compatible with other eportfolio systems. So, JSON is useless to users.

    If people wanted to use their eportfolio data in their CV they would probably export it to Word. If they wanted to export their data to another eportfolio system they would be instructed to download it as Leap2A.

    For programmers, there is no such thing as a ‘JSON RESTful API’. A RESTful API should use a variety of formats to represent resource states as explained in (1), (2) & (3). I intend to make the NHS ePortfolio API fully RESTful so it will incorporate lots of different state representations. Most ‘RESTful’ APIs are in fact RPC because very few programmers understand REST.

    (1) http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
    (2) “Rest In Practice” by Jim Webber (http://www.amazon.co.uk/REST-Practice-Hypermedia-Systems-Architecture/dp/0596805829).
    (3) “REST APIs must be hypertext-driven” (http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven)

  11. These discussions are really helpful so thanks for all your comments. Ben, I think you need to be careful about making assumptions about what doctors would do with their ePortfolio data. We need to look beyond the way we use the ePortfolio now, and see future potential. Right now the only uses may be to put it into a CV or use it in another ePortfolio, and of course doctors wouldn’t know what to do with a .json file, but this misses the point. I am hoping we can look at ways in which we could use the data more imaginatively so that it could be more educationally valuable for the trainee, and more useful for the trainers.

    I envisage getting the data out in the most flexible form, which would open up choice for trainees. We could build platforms that could use the data (in whatever form that may be) to visualise it, and interface with other software to enhance learning. These platforms would mean that understanding code would not be necessary.

    This is not just about what we do today, it’s about imagining what we can do in the future. Trainees want to be encouraged and inspired to excellence. They need us to be imaginative.

    PS. I may not always use the correct terminology (ie platform?) so feel free to correct me!

  12. Ben,

    Three points:

    1) You assume a lot with regard to what your users want and how they want their data delivered. It appears that a significant and vocal (perhaps because they’re fed up?) minority of your users would like access to *their* data stored in *your* system in a way that allows them to extract any or all of their complete assessments and history.

    2) What’s so wrong with allowing users to HTTP GET a specific item in their ePortfolio as an open, simple and ubiquitous data format (like JSON)? It’s the most open and flexible solution since there are no semantic assumptions – I can just parse the resource into a native list or dictionary.

    3) I’m confused by your assertion that, “there is no such thing as a ‘JSON RESTful API’”. Perhaps you’re referring to this blog post..?


    I personally think that the author of this post is barking up the wrong tree. JSON is simply the format of a representation for a resource. What Fielding makes clear is that “Hypermedia” is an essential part of REST in that it allows the user to, “obtains choice[s] and select[s] actions” (to quote the man himself). As the first response to the blog post states, “JSON and XML are just protocols for wrapping data. The XML specification makes no mention of hyperlinks. Fielding has been and continues to be vague about the protocol to use to represent hypertext.”

    You may split hairs and we may not agree on the precise categorisation of an API because it uses JSON as the data protocol – is it REST-ful, REST-like, REST-inspired..? But, this is all pointless bickering.

    The bottom line is:

    * You don’t provide an API. We all agree that this sucks.
    * You appear to make assumptions about your users. This is hard to avoid and I’m as guilty of doing this as any other developer, but evidence from this blog and doctors and consultants on the hackday suggest that they want an open and easy to parse API so developers can collaborate with them to produce wonderful things.
    * You appear to be missing a trick by favouring the rather complicated LEAP dialect when you and I both know that returning JSON from a .NET application is a relatively easy thing to do.

    So the question remains, why don’t you listen to LJ and her colleagues, listen to us developers who have struggled (and mainly failed) to scrape your site and give us what we want? This certainly isn’t a technical problem although I concede it’s likely to be a business/political one (and your hands are unfortunately tied).

    Look, we’re trying to engage with you and help you out. Accepting Carl’s FOI request for the source code would allow us *as a community* to improve the service. Hell, given some source code, a bunch of willing hackers and yourself in attendance I suspect we could probably pull off something hugely worthwhile by tackling just problems that are low hanging fruit. Why not come along to the next hackday in the spring in Oxford..?

    Honestly, they’re a lot of fun and I’ll buy you a beer! Ultimately, we’re all trying to get to an ePortfolio that meets its users’ expectations. Aren’t we..?

  13. A number of points:

    Technically the API would be REST. This means that it will not use a single data format so there is little point discussing that further. 99% of ‘REST’ APIs are actually RPC because they do not use hypermedia anyway.

    I can’t just ‘create an API’. All work has to be paid for by the colleges so until they commission me to do it, it won’t get done.

    Regarding the use of an API (i.e. public access). The colleges will hate that idea because the eportfolio is used for assessment of trainee progress, so having an open source app made by some unknown entity accessing assessments and trainee data will not impress them. They are super-conservative about that kind of thing.

    You all talk about what users want to do with ‘their’ data. You will find that the colleges that pay for eportfolio actually regard that data as their own, not the trainees’. Yes, the trainees pay the college, and the colleges pay for eportfolio, but it’s the colleges that control the eportfolios. This is why the colleges are not sympathetic about trainees ‘freeing their data’. The data is used for assessing trainees so again the colleges are not too interested in seeing that data being taken elsewhere.

    If the eportfolio was not used as an assessment tool then a public API and open source client apps would be the obvious choice.

    Regarding an in-house app, this is a minefield again because the cost has to be spread between 20+ colleges and getting them all to agree up front is a nightmare.

    An Agile approach (develop app & API piece by piece) cannot be used because all the old-school finance departments need costs to be agreed and contracts to be signed (these are people that still use fax machines).

    There is no cash for ‘product development work’ inside NES. It all comes from the colleges.

    There is no point just ‘asking’ for an API. There is a very slow moving bureaucratic cloud of red tape and politics to get through first.

    Welcome to my world. Basically there is no point lobbying me, you need to lobby the colleges and educate them about what they should be asking (us) to do, i.e. public API & open source offline HTML5 app.

  14. Pingback: The perfect ePortfolio | The NHS ePortfolio Revolution starts here

  15. Pingback: Open standards and open platforms: ‘play nicely children’ | content revolution blog
