OpenUp Blog

Who has suffered the most in the credit crunch? Visualisations based on London Gazette official notices tell the real story for businesses

Inspired by the idea of Linked Open Data, the latest demo of the Gazettes faceted search, released by the TSO Semantic Team, fully exploits the Gazettes semantic data and uses semantic technology to answer very real business questions. The proof-of-concept integrates three data sources, published by different organisations, all as five-star, uniformly formatted Linked Data.

TSO has published the London Gazette notice metadata since October 2010. Hosted on TSO’s OpenUp® Platform, the Gazettes Resource Description Framework (RDF) repository provides access to over 120,000 notices as Linked Data – translating into an impressive 10 million RDF triples, a number that continues to grow every day.

The TSO team mashes up the Gazettes notice Linked Data with Ordnance Survey (OS) geographical information and Companies House data. OS Linked Data, such as UK administrative boundary and postcode data, is stored in TSO’s OpenUp® Platform, and its SPARQL endpoint is made available for public access. Companies House makes all UK company registrations available online via Linked Data Application Programming Interfaces (APIs). This data joins the mix and integrates with company profiles, such as ‘nature of business’, to provide an impressive depth of information to trawl.
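To make the mashup concrete, here is a toy, in-memory illustration of the three-way join described above: Gazette notices, Companies House profiles and OS geography. All identifiers and data are invented for the sketch; the real systems expose this via SPARQL endpoints and Linked Data APIs rather than Python dictionaries.

```python
# Invented sample data standing in for the three real sources.
notices = [
    {"notice": "L-61234", "company_no": "0123456", "postcode": "NR3 1GN"},
]
companies_house = {
    "0123456": {"name": "Example Ltd", "nature_of_business": "Publishing"},
}
os_districts = {"NR3": "Norwich"}  # outward postcode -> district

def enrich(notice):
    """Join one notice with its company profile and OS geography."""
    profile = companies_house.get(notice["company_no"], {})
    outward = notice["postcode"].split()[0]
    district = os_districts.get(outward, "unknown")
    return {**notice, **profile, "district": district}
```

The point of the sketch is the join keys: a company number links a notice to its Companies House profile, and a postcode links it to OS geography.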

Powered by Apache Solr, the Gazette faceted search supports both full-text search of the notice content and real-time facet views to cover:
  • Geographic area
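Conceptually, a facet view filters on the full-text query and then counts matching notices per facet value. A toy sketch of that idea in Python (the live system delegates this work to Apache Solr; the sample notices here are invented):

```python
from collections import Counter

# Invented notice records: a text field plus one facet field.
notices = [
    {"text": "winding-up order against Acme Ltd", "area": "London"},
    {"text": "insolvency notice for Widget Co",   "area": "Leeds"},
    {"text": "winding-up petition, Beta Ltd",     "area": "London"},
]

def facet(notices, query, field):
    """Full-text filter, then count matches per value of one facet."""
    matches = [n for n in notices if query in n["text"]]
    return Counter(n[field] for n in matches)
```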

November 28, 2012, 12:12 pm

The Organograms Linked Data project

The Vision

To provide a customised organisation chart (organogram) visualisation for human users, with the capability to download machine-readable data via a Linked Data API and via SPARQL query from an RDF store.

To meet new requirements for transparency, financial information such as reporting structures, salary bands and the combined salaries of direct reports would need to be provided.

The Linked Data output will gain value over time, enabling users to see the changes to the machinery of government as departments and their responsibilities evolve.

The Solution

The new system was devised by John Sheridan of The National Archives and Jeni Tennison, working as a consultant to TSO with a superb visualisation by Dan Smith. It went live in June 2011.

The system sits upon TSO’s OpenUp® platform services, including:

  • Harvesting – setting up a pipeline to process good quality department data
  • Enriching – extracting RDF
  • Storage – ensuring RDF is stored and available
  • Publishing – making the data available on the web through both an API (to support data re-use) and a visualisation
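The four stages can be sketched schematically, with plain functions standing in for the real OpenUp components (a simplified illustration under invented names, not TSO's implementation):

```python
def harvest(source):
    """Harvesting: fetch raw department data (here, pre-parsed rows)."""
    return list(source)

def enrich(rows):
    """Enriching: turn each row into subject-predicate-object triples."""
    return [(r["uri"], "grade", r["grade"]) for r in rows]

def store(triples, triple_store):
    """Storage: persist triples (here, just a list acting as the store)."""
    triple_store.extend(triples)
    return triple_store

def publish(triple_store, subject):
    """Publishing: API-style lookup of all statements about a subject."""
    return [(p, o) for s, p, o in triple_store if s == subject]
```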

The Challenges

It was a complex project: the first occasion on which every government department published RDF.
There were technical challenges to:

November 21, 2012, 4:21 pm

A new version of Flint, the free, feature-rich SPARQL editor, is now available

Our work to improve the experience of accessing semantic data through SPARQL has recently taken another step forwards with a major upgrade to Flint – our web-based open source SPARQL editor. 

Since we released Flint version 0.5 back in June 2011, the most requested feature has been support for SPARQL 1.1. Flint version 1.0 now supports the current SPARQL 1.1 specification, making it a good choice for accessing triplestores whether they use version 1.0 or 1.1 of the language.

Flint’s context-sensitive help provides great support for dealing with the increased sophistication of SPARQL 1.1 syntax. If you’ve got an update endpoint allowing requests to insert and delete triples, Flint 1.0 can now help there too, with a new separate mode for SPARQL 1.1 update syntax.
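For readers new to SPARQL 1.1 Update, the DELETE/INSERT form that the new mode targets looks like this, built here as a Python string (the subject and labels are invented; the DELETE/INSERT/WHERE shape is standard SPARQL 1.1):

```python
def rename_label(subject, old, new):
    """Build a SPARQL 1.1 update that swaps one rdfs:label for another."""
    return f"""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {{ <{subject}> rdfs:label "{old}" }}
INSERT {{ <{subject}> rdfs:label "{new}" }}
WHERE  {{ <{subject}> rdfs:label "{old}" }}
"""
```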

Going beyond many existing web-based SPARQL query-building tools, which lack abilities such as context-dependent autocomplete and syntax checking, Flint brings many of the features developers would expect of traditional Integrated Development Environments (IDEs) and code parsers to the web.

As in previous versions, Flint 1.0 uses both the syntactic context (what’s expected at the cursor position) and what’s in the actual dataset to provide relevant help. In addition to autocomplete on properties, the new version also provides autocomplete where a class is expected.
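A toy model of that behaviour: suggestions are drawn from the dataset's properties or classes depending on what the grammar expects at the cursor. The vocabulary lists below are invented examples, not Flint's internals:

```python
# Invented dataset vocabulary, keyed by the kind of term the
# SPARQL grammar expects at the cursor position.
DATASET = {
    "property": ["foaf:name", "foaf:knows", "dct:title"],
    "class":    ["foaf:Person", "foaf:Document"],
}

def autocomplete(expected, prefix):
    """Return dataset terms of the expected kind that match the prefix."""
    return [t for t in DATASET.get(expected, []) if t.startswith(prefix)]
```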

November 5, 2012, 4:03 pm

Toward automating complex legislation updates

One of the projects we’re working on at the moment is for The National Archives (TNA) and relates to updating the system for consolidation of legislation – the Expert Participation Model (known as Participation). It’s a big, complex project using many cutting edge technologies and techniques. We thought we would start to share some of the workings.

One aspect of the editorial process has always been very manual: identifying the changes that new legislation makes to existing legislation. It is very common for new UK legislation to change existing legislation, often in complex and subtle ways, and processing these changes is very time-consuming. To improve this presently manual work, one of the two main strands of activity within Participation is to automate the extraction of the changes that new legislation makes to existing legislation. New legislation is published every working day, so our objective is to process new documents as they are loaded.

Amendments to legislation break down into various types. The project has so far tackled several areas, with the most complete and complex area being so-called ‘textual amendments’. These are amendments that actually modify the text of other legislation. For instance, it might be something like ‘In section 3 for “vehicle” substitute “automobile”’.
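As a flavour of what automation involves, here is a deliberately simplified parser for that one quoted pattern. Real amendments are far more varied, and this is a hypothetical sketch rather than the project's actual implementation:

```python
import re

# Matches only the single illustrative form:
#   In section N for "X" substitute "Y"
# (accepting curly or straight quotes).
AMENDMENT = re.compile(
    r'In section (\d+) for [“"](.+?)[”"] substitute [“"](.+?)[”"]'
)

def apply_amendment(instruction, sections):
    """Apply a textual amendment to a dict of section texts, if it parses."""
    m = AMENDMENT.search(instruction)
    if not m:
        return sections
    num, old, new = m.group(1), m.group(2), m.group(3)
    sections = dict(sections)  # leave the caller's copy untouched
    sections[num] = sections[num].replace(old, new)
    return sections
```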

November 4, 2011, 5:21 pm

Consultations on open data

The government is currently running consultations on two important aspects regarding the move towards a ‘data-driven’ economy. One is on open data (http://www.cabinetoffice.gov.uk/resource-library/making-open-data-real-public-consultation) and the other on the proposed Public Data Corporation (http://discuss.bis.gov.uk/pdc/).

The closing date for both of these consultations is now only 9 days away.

TSO has been at the centre of the drive to open up government data, and our Senior Content Analyst, Paul Appleby, took part in a recent workshop at the Cabinet Office to discuss the Open Data consultation. It was a very interesting and enlightening few hours, and many different viewpoints were put forward, ranging from straightforward transparency demands (‘simply publish the data’) to the need for a more sustainable approach that supports re-publishing opportunities (where things such as return on investment and SLAs become important factors). Events such as this surface the incredible passion that certain people have about opening up government, whilst reinforcing the benefits, in terms of both cost saving and economic growth, that may also be unlocked.

Whatever your view, the workshop did highlight the many challenges that the government faces, and it is great that the opportunity to respond to the consultations is available.

October 18, 2011, 4:51 pm

Semantic Tech and Business Conference 2011 - highlighting the benefits of semantic technology

TSO was a gold sponsor at the first London Semantic Tech and Business Conference on Monday and Tuesday this week (26 and 27 September 2011). TSO's Richard Goodwin spoke on sustainable semantic publishing. Here Richard gives his thoughts on the conference.

What has become clear to us is that the semantic web community is still at a very early stage in getting its message across to those within large organisations who have the influence and budget to have a major impact on how business is done. Exhibitors and visitors to SemTech fell roughly into three groups: major publishing organisations such as Pearson, the Press Association, the BBC and TSO, who are adopting semantic principles and seeing benefits; defence-related organisations (non-UK) making initial investigations into what the semantic web can do for them (“other than anti-terrorism”, as one representative told me); and academics and startups targeting the provision of very niche semantic services with potentially disruptive technologies.

September 29, 2011, 4:39 pm

TSO at SemTechBiz 2011

We are pleased to announce that we will be attending the first Semantic Tech and Business Conference at the Hotel Russell in London on Monday 26 and Tuesday 27 September 2011.

TSO is a gold sponsor at the event and our semantic web team will be available throughout the two days to talk about TSO’s solutions. If you are attending the event come along to the TSO stand to find out how we can help you to unlock the value in your data.

On the first day of the event, TSO will present a session on Sustainable Semantic Publishing. We will discuss the benefits of creating a process that automates the capture, transformation and publishing of data in an efficient, seamless and reliable way and talk about the tools TSO has developed to help with this.

Visit the SemTech website to view the conference programme and register for the event.

You can follow the event on Twitter using the hashtag #SemTechBiz.

September 21, 2011, 3:39 pm

"The most ambitious open data agenda of any government in the world"

Today’s announcement from the Government that it will publish more open data has been widely welcomed. http://data.gov.uk/blog/prime-minister-commits-to-new-open-data-on-health-schools-courts-and-transport

But for government departments the requirement to publish more data and more frequently brings about more challenges.

In particular, the PM today, in his letter to cabinet ministers (http://www.number10.gov.uk/news/statements-and-articles/2011/07/letter-to-cabinet-ministers-on-transparency-and-open-data-65383), highlighted again the need for data to be published in an open, standardised format to make it freely re-usable. He also set out that it needs to be updated on a regular basis, and the new data publishing commitments have been given tight deadlines.

Creating a sustainable open data publishing process – a process that enables regular updates of fine-grained data in a range of data formats – is not easy to achieve. Capturing data in a structured way, or harvesting data from existing sources so that it can be transformed into re-usable formats, is just the start. But getting it right from the beginning will make it so much easier to publish that data in the range of formats that developers want.

July 8, 2011, 10:46 am

Government Organograms

Over the last week or so organograms of UK government departments and agencies have been published online in a consistent format for the first time.

You can get more information and a list of organograms at data.gov.uk.

TSO is hosting the organograms data and web applications, and we wanted to provide a little more information about how we do it, using our OpenUp® platform.

The organogram data is stored as RDF. This RDF is harvested from various locations on the web (or locally) using our harvester component. This component is capable of retrieving and, if necessary, transforming data. The data is loaded into our RDF store.

The RDF is then used to drive the organograms visualisation, which is a browser-based application. It gets its data by calling the Linked Data API, a tool that makes it easier for web developers to use RDF data in web applications: a simple request to a URI returns results in a variety of formats. The API and the organograms configuration are also hosted on our platform servers. Calls to the API are converted into queries against the RDF store, and the results are returned in the requested format.
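As a rough sketch of that translation step, an API request path might map to a SPARQL query plus a result format like this (the path shape and base URI are invented for illustration; the real Linked Data API is considerably richer):

```python
def api_to_sparql(path):
    """Map e.g. '/doc/department/hmrc.json' to (SPARQL query, format).

    The path selects the resource; the extension selects the format.
    """
    resource, _, fmt = path.rpartition(".")
    uri = "http://example.org" + resource  # invented base URI
    query = f"SELECT ?p ?o WHERE {{ <{uri}> ?p ?o }}"
    return query, fmt
```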

A SPARQL endpoint is also available for those developers who want to query the RDF directly.

June 24, 2011, 3:19 pm

Flint improvements and source code release

Here at TSO, we’ve been doing more work on tool support to make it easier to write SPARQL queries. We have now released version 0.5 of Flint, our browser-based SPARQL editor. The syntax checking in this new release is powered by a new parser which we have re-written from the ground up to be more closely integrated into the editor (it’s also a good deal smaller than the old one). This has enabled us to make the ctrl-space code auto-completion and the panel of SPARQL keyword buttons sensitive to the context at the cursor position. The SPARQL keyword buttons will only be enabled for keywords which are valid in the context. Pressing ctrl-space when a property is expected will allow properties to be selected from those used in the dataset (first select a namespace prefix or type ‘<’ to start a URI).

 
For those who wish to embed a Flint editor on their own web pages, Flint 0.5 is being bundled as a download with instructions on how to configure and deploy it. This is released under an open licence in the hope that people will find it useful.
 
Flint is still under development. Please check back for future releases. If you have any comments, suggestions or bug reports, please send them to opendata@tso.co.uk.

June 24, 2011, 9:27 am