Planet RDF

A quick Radiodan: Exclusively Archers

Sun, 01/25/2015 - 14:29

Categories:

RDF

I made one of these a few months ago – they’re super simple – but Chris Lynas asked me about it, so I thought I should write it up quickly.

It’s an internet radio that turns itself on for

Big Data Industry News Watch

Fri, 01/23/2015 - 15:00

Categories:

RDF

A round up of recent industry news on the topics of Big Data and Enterprise Data Management

A quick analysis of wifi cards for using a Raspberry Pi as an access point

Fri, 01/23/2015 - 09:37

Categories:

RDF

When Radiodan can’t access the web, it throws up an access point (AP) created by the Pi: you connect directly to that and it displays the available wifi points in a webpage as a captive portal, and asks you to add the password for the one you want. It’s not easy to get credentials for wifi to objects with no user interface, and this is the best one we’ve found so far (

Putting the Smarts in Data Integration

Tue, 01/20/2015 - 20:47

Categories:

RDF

Driving business value from your data often requires integration across many sources. These integration projects can be time-consuming, expensive and difficult to manage. Any shortcuts can compromise quality and reuse. In many industries, non-compliance with data governance rules can put your firm’s reputation at risk and expose you to large fines.

Traditional data integration methods require point-to-point mapping of source and target systems. This effort typically requires a team of both business SMEs and technology professionals. These mappings are time-consuming to create and code, and errors in the ETL (Extract, Transform, and Load) process require iterative cycles through the process.

Two AKSW Papers at #WWW2015 in Florence, Italy

Tue, 01/20/2015 - 15:09

Categories:

RDF

Hello Community!

We are very pleased to announce that two of our papers were accepted for presentation at WWW 2015. The papers cover novel approaches for Key Discovery while Linking Ontologies and a benchmark framework for entity annotation systems. In more detail, we will present the following papers:

Visit us from the 18th to the 22nd of May in Florence, Italy and enjoy the talks. More information on these publications at http://aksw.org/Publications.

Cheers, Ricardo on behalf of AKSW

R (and SPARQL), part 2

Tue, 01/20/2015 - 13:32

Categories:

RDF
Retrieve data from a SPARQL endpoint, graph it and more, then automate it.

2015 Ontology Summit: Internet of Things: Toward Smart Networked Systems and Societies

Wed, 01/14/2015 - 18:17

Categories:

RDF

The theme of the 2015 Ontology Summit is Internet of Things: Toward Smart Networked Systems and Societies. The Ontology Summit is an annual series of events (first started by Ontolog and NIST in 2006) that involve the ontology community and communities related to each year’s theme.

The 2015 Summit will hold a virtual discourse over the next three months via mailing lists and online panel sessions held as augmented conference calls. The Summit will culminate in a two-day face-to-face workshop on 13-14 April 2015 in Arlington, VA. The Summit’s goal is to explore how ontologies can play a significant role in the realization of smart networked systems and societies in the Internet of Things.

The Summit’s initial launch session will take place from 12:30pm to 2:00pm EDT on Thursday, January 15th and will include overview presentations from each of the four technical tracks. See the 2015 Ontology Summit for more information, the schedule and details on how to participate in these free and open events.

DC-2015 website and Call for Participation open

Mon, 01/12/2015 - 23:59

Categories:

RDF

2015-01-12, DCMI and the host of DC-2015, Universidade Estadual Paulista--São Paulo State University (UNESP), are pleased to announce the publication of the Call for Participation at http://purl.org/dcevents/dc-2015/cfp and the opening of the DC-2015 website at http://purl.org/dcevents/dc-2015. DC-2015 will take place in São Paulo, Brazil on 1-5 September 2015. Just as DCMI celebrates its 20th anniversary this year, it also celebrates the first time its Annual Meeting and International Conference have been held in South America. Watch the conference website at http://purl.org/dcevents/dc-2015 for updates as DCMI and UNESP develop an exciting program around the conference theme of "Metadata and Ubiquitous Access to Culture, Science and Digital Humanities".

Shanghai Library joins DCMI as an Institutional Member

Mon, 01/12/2015 - 23:59

Categories:

RDF
2015-01-12, DCMI is very pleased to announce that the Shanghai Library (http://www.library.sh.cn) has joined DCMI as an Institutional Member. The Shanghai Library is the second largest library in China--second only to the National Library. The Shanghai Library was founded in 1952. In October 1995, the Shanghai Library and the Institute of Scientific and Technical Information of Shanghai merged to become a comprehensive research public library and center for industrial information. It is also the branch of the National Cultural Information Resource Sharing Project in Shanghai, the main library of the Shanghai Central Library System, Shanghai Ancient Books Protection Center and the "Pioneer Technology Development Research Center" of the Shanghai soft science research base. The DCMI Institutional Member Program is open to all public sector organizations interested in supporting DCMI while participating actively in DCMI governance. Please see the DCMI membership page at http://dublincore.org/support/ for more details about DCMI's membership programs.

Open Semantic Framework 3.2 Released

Mon, 01/12/2015 - 18:15

Categories:

RDF

Structured Dynamics is happy to announce the immediate availability of the Open Semantic Framework version 3.2. This is the second important OSF release in a month and a half.

This new major release of OSF changes the way the web services communicate with the triple store. Originally, OSF web services used an ODBC channel to communicate with the triple store (Virtuoso). This new release uses the SPARQL HTTP endpoints of the triple store to send queries to it. This is the only change in this new version, but as you will see below, it is a major one.
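To make the switch from ODBC to HTTP more concrete, here is a minimal Python sketch of what a query sent to a triple store's SPARQL HTTP endpoint can look like. It is not OSF's own code (OSF is written in PHP). The endpoint URL and the `build_query_request` helper are assumptions made for illustration, and the request is only built, not sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Assumption: a triple store listening locally; adjust to your own setup.
ENDPOINT = "http://localhost:8890/sparql"


def build_query_request(endpoint, query):
    """Build (but do not send) a SPARQL 1.1 Protocol query via HTTP GET."""
    params = urlencode({"query": query})
    return Request(
        f"{endpoint}?{params}",
        # Ask for the SPARQL 1.1 Query Results JSON Format.
        headers={"Accept": "application/sparql-results+json"},
        method="GET",
    )


request = build_query_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
# Decode the query string again to show that the query survives URL encoding.
print(parse_qs(urlsplit(request.full_url).query)["query"][0])
```

Passing the request to `urllib.request.urlopen` would actually send it. No ODBC driver is involved at any point, only plain HTTP.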

Why Switch to HTTP?

The problem with using ODBC as the primary communication channel between the OSF web services and the triple store is that it added a lot of complexity to OSF. Because the UnixODBC drivers that are shipped with Ubuntu had issues with Virtuoso, we had to use the iODBC drivers to make sure that everything worked properly. This forced us to recompile PHP5 so that it used iODBC instead of UnixODBC as its ODBC driver.

This greatly complicated the deployment of OSF, since we couldn’t use the default PHP5 packages that shipped with Ubuntu and had to maintain our own packages that worked with iODBC.

The side effect was that system administrators couldn’t upgrade their Ubuntu instances normally, since PHP5 needed to be upgraded using particular packages created for that purpose.

Now that OSF doesn’t use ODBC to communicate with the triple store, all this complexity goes away since no special handling is required. All of the default Ubuntu packages can be used as system administrators normally would.

With this new version, the installation and deployment of an OSF instance has been greatly simplified.

Supports New Triple Stores

Another problem with using ODBC is that it limited the number of different triple stores that could be used for operating OSF. In fact, people could only use Virtuoso with their OSF instance.

This new release opens new opportunities. OSF still ships with Virtuoso Open Source as its default triple store; however, any triple store that has the following characteristics could replace Virtuoso in OSF:

  1. It has a SPARQL HTTP endpoint
  2. It supports SPARQL 1.1 and SPARQL Update 1.1
  3. It supports SPARQL Update queries that can be sent to the SPARQL HTTP endpoint
  4. It supports the SPARQL 1.1 Query Results JSON Format
  5. It supports the SPARQL 1.1 Graph Store HTTP Protocol via an HTTP endpoint (optional, only required by the Datasets Management Tool)
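
The requirements above on SPARQL Update over HTTP and on JSON results can be sketched as follows. This is an illustrative Python example under stated assumptions, not part of OSF: the endpoint URL, the helper names and the sample result document are all hypothetical, and some triple stores expose updates on a different URL than queries.

```python
import json
from urllib.request import Request


def build_update_request(endpoint, update):
    """Build (but do not send) a SPARQL 1.1 Update posted directly over HTTP."""
    return Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


def rows_from_json_results(text):
    """Flatten a SPARQL 1.1 Query Results JSON document into plain dicts."""
    document = json.loads(text)
    variables = document["head"]["vars"]
    return [
        # Unbound variables are simply absent from a binding.
        {name: binding[name]["value"] for name in variables if name in binding}
        for binding in document["results"]["bindings"]
    ]


# Hypothetical endpoint and data, for illustration only.
update_request = build_update_request(
    "http://localhost:8890/sparql",
    'INSERT DATA { <http://example.org/a> <http://example.org/label> "A" }',
)

sample = """{
  "head": {"vars": ["s", "label"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/a"},
     "label": {"type": "literal", "value": "A"}},
    {"s": {"type": "uri", "value": "http://example.org/b"}}
  ]}
}"""
rows = rows_from_json_results(sample)
print(rows)
```

A store that accepts the first request and returns documents shaped like the sample would meet requirements 3 and 4. The others still need to be checked separately.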

Deploying a new OSF 3.2 Server Using the OSF Installer

OSF 3.2 can easily be deployed on an Ubuntu 14.04 LTS server using the osf-installer application by executing the following commands in your terminal:

mkdir -p /usr/share/osf-installer/

cd /usr/share/osf-installer/

wget https://raw.github.com/structureddynamics/Open-Semantic-Framework-Installer/3.2/install.sh

chmod 755 install.sh

./install.sh

./osf-installer --install-osf -v

Using an Amazon AMI

If you are an Amazon AWS user, you also have access to a free AMI that you can use to create your own OSF instance. The full documentation for using the OSF AMI is available here.

Upgrading Existing Installations

It is not possible to automatically upgrade previous versions of OSF to OSF 3.2. An older instance of OSF can be upgraded to version 3.2, but only manually. If you have this requirement, just let me know and I will write about the required upgrade steps.

Security

Now that the triple store’s SPARQL HTTP endpoint must be enabled with SPARQL Update rights, it is more important than ever to make sure that this endpoint is only available to the OSF web services.

This can be done by properly configuring your firewall or proxy such that only local traffic, or traffic coming from the OSF web service processes, can reach the endpoint.
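
As a rough illustration of the rule such a firewall or proxy would enforce, the following Python sketch checks a client address against an allow-list. The networks listed and the `is_allowed` helper are assumptions for the example. A real deployment would express the same rule in its firewall or proxy configuration rather than in application code.

```python
import ipaddress

# Assumption: the OSF web services run on this host (loopback) or on one
# hypothetical internal machine, 10.0.0.5.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("::1/128"),
    ipaddress.ip_network("10.0.0.5/32"),
]


def is_allowed(client_ip):
    """Return True if the client may reach the triple store's endpoint."""
    address = ipaddress.ip_address(client_ip)
    # An IPv4 address is never contained in an IPv6 network, and vice versa.
    return any(address in network for network in ALLOWED_NETWORKS)


for candidate in ("127.0.0.1", "10.0.0.5", "203.0.113.7"):
    print(candidate, is_allowed(candidate))
```

Anything not explicitly listed is refused, which matches the intent of exposing the triple store only to local traffic and to the OSF web services.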

The SPARQL endpoint that should be exposed to the outside world is OSF’s SPARQL endpoint, which adds an authentication layer above the triple store’s endpoint and restricts potentially harmful SPARQL queries.

Conclusion

This new version of the Open Semantic Framework greatly simplifies its deployment and its maintenance. It also enables other triple stores that exist on the market to be used for OSF instead of Virtuoso Open Source.

Another AGU and we all get wet from the rain in San Fran…

Sun, 01/11/2015 - 00:04

Categories:

RDF

The 2014 Meeting of the American Geophysical Union in the wet city of San Francisco has not yet faded from memory. Unfortunately, it may be remembered as the “year of the RFID mess” rather than for the great science progress. However, let’s start with the positive.

Rensselaer’s Tetherless World was well represented by Patrick, Stephan, Marshall, Evan and Paulo (representing others including Linyun and Han) in talks, posters covering both research and project progress, and the academic booth (go RPI!). See what we did at http://tw.rpi.edu/web/event/AGU/FM/2014/Participation. This year, we presented in Informatics (IN) and Education (ED) sessions with talks and many posters. Just on a logistics note, I was very pleased to have the exhibit hall adjoined to one of the poster halls this year. This made moving between them, without missing one or the other, much easier. Hope that continues.

It was another excellent year for Informatics; I’ve misplaced the stats, but suffice it to say there were increasing numbers of abstracts, great student contributions and a sea of new faces. A continuing treat is the Leptoukh Lecture (honouring Greg L, whom I still miss very much). This year, Dr. Bryan Lawrence (working in the UK, but actually a Kiwi) gave a tour de force lecture on computation and data aspects of climate science. The attendance was excellent, clearly pulling in a wide cross-section of attendees from well beyond the IN folks. Thanks Bryan.

This year was the changeover for Informatics leadership, with Kerstin Lehnert taking over from Michael Piasecki as President. Thanks Michael for your leadership and efforts over the last two years. Ruth Duerr (NSIDC) came in as President-Elect and Anne Wilson (CU/LASP) as secretary. Diversity rules in Informatics!!!

In regard to IN poster sessions, we saw an increase in the flash mob approach. What is that, you ask? It is where, at an appointed time during the poster session, the session convener arranges for all poster presenters to be present. Having also advertised by Twitter, email and general coercion, they gather poster attendees around each poster (in order, down the row). The presenter has 5 minutes to present their poster and then the mob moves on. It has proven to be a very effective way of engaging attendees and presenters. If the session organiser has pre-planned it, the sequencing can also be very effective. After each poster has been presented, many attendees stay to quiz the presenters of the specific posters they were interested in. The one aspect that makes this style hard is the general noise level in the poster hall. Poster presenters need to “speak up” and project their voice: not all are prepared for that, but it is very good practice!

I am author / co-author on quite a few presentations each year. This year I had two posters (both invited) as lead. You can see them via the link above. “Sixth generation of data and information architectures” and “Anatomy and Physiology of Data Science” drew quite a lot of interest. But I must say, I did enjoy getting to stand with Mark Parsons at our poster “Why Data Citation Misses the Point” (I will add that to the website) and elaborate on our premise. Interestingly, we had a lot of agreement with the work; we’d hoped to provoke arguments (!! as usual !!). Now to find time to write that up.

I want to acknowledge the excellent presentation of other works I was co-author on. The TWCers noted above are indeed skilled and knowledgeable researchers and practitioners. I know that but it is always excellent to have peers approach me to tell me that and how impressed they are with both the work and the people!

And the RFID issue – just go here and see for yourselves: http://petitions.moveon.org/sign/say-no-to-rfid-tracking.fb47

See all of you next December.

DBpedia Usage Report

Wed, 01/07/2015 - 20:12

Categories:

RDF

We've just published the latest DBpedia Usage Report, covering v3.3 (released July, 2009) to v3.9 (released September, 2013); v3.10 (sometimes called "DBpedia 2014"; released September, 2014) will be included in the next report.

We think you'll find some interesting details in the statistics. There are also some important notes about Virtuoso configuration options and other sneaky technical issues that can surprise you (as they did us!) when exposing an ad-hoc query server to the world.

Spatial Data on the Web WG launched

Tue, 01/06/2015 - 15:01

Categories:

RDF

It was 10 months ago today, 6th March 2014, that the Linking Geospatial Data workshop in London came to an end with Bart De Lathouwer of the OGC and I standing side by side announcing that our two organizations would …

Raspberry Pi podcast-player-in-a-box – step by step

Sat, 01/03/2015 - 16:54

Categories:

RDF
Introduction

Podcast-player-in-a-box is a way to associate a physical object (a plastic card) with a possibly-changing list of audio files. When you put the card in the box it plays the audio.

It’s inspired by

Read Write Web — Q4 Summary — 2014

Thu, 01/01/2015 - 10:47

Categories:

RDF
Summary

The web ponders moving further towards SSL, with the W3C TAG publishing a draft finding on how this could be more easily achieved. There was a great review by the EFF on progress, as well as some interesting suggestions by timbl.

Linked data continues its inexorable march towards the mainstream, with steady progress throughout the quarter and the whole year. Some good reviews are available here, here, and here, with a look forward to what we may see in 2015. A cool ontology viewer called VOWL also caught the eye.

There was some more discussion regarding the HTTP PATCH verb and how it applies to data, with specs and implementations reaching readiness. A comprehensive wishlist covering much of the future of LDP and RWW was posted by Sandro, as well as a new authentication system called SPOT (Simple Page-Owner Token).

Communications and Outreach

Henry Story delivered an outstanding presentation at the Scala eXchange conference in London, where he outlined the current state of play of the read write web and decentralized social web. An overview of the project is available on github, as well as source code.

Some conversations took place in the identity credentials group and in open badges, which aims to allow writing of achievements, via badges, on servers, in images and data structures, and using digital signatures.

Community Group

The LDP Patch specification is now reaching readiness and I believe the integration into GOLD is happening as we speak. GOLD has also now integrated JSON-LD.

The community group has added a Slack instance, which allows slightly more realtime chat, an API and many other features.

A stub wiki page has been added on the concept of “nanotations”, linking to Kingsley’s blog explanation. Feel free to add your own examples!

Applications

Some initial work has started on intelligent personal assistants. Juergen has written a sioc bot which is able to take real time conversations and convert them to linked data. Leveraging adapters in hubot, the code is available on git and was up and running in just a couple of days.

I’ve also been working on a linked data robot that allows simple transfer of credit (aka marking) from one URI to another. The hope is to build out a linked data based transfer and reputation system. A slightly related side project I’ve started is virtual wallet, which will allow holding of web currencies and transfer between WebIDs. There is much more standards work to be done here…

Last but not Least…

An interesting system called webhose has been launched. “The Webhose.io API – Easily integrate data from hundreds of thousands of global online sources: message boards & forums, blogs, comments, reviews, news and more”. Seems like a neat bridge to pull news from many web2.0 data sources into your apps!

PhD defense: Varish Mulwad — Inferring the Semantics of Tables

Tue, 12/30/2014 - 00:07

Categories:

RDF

Dissertation Defense

TABEL: A Domain Independent and Extensible Framework for Inferring the Semantics of Tables

Varish Vyankatesh Mulwad

8:00am Thursday, 8 January 2015, ITE325b

Tables are an integral part of documents, reports and Web pages in many scientific and technical domains, compactly encoding important information that can be difficult to express in text. Table-like structures outside documents, such as spreadsheets, CSV files, log files and databases, are widely used to represent and share information. However, tables remain beyond the scope of regular text processing systems which often treat them like free text.

This dissertation presents TABEL, a domain independent and extensible framework to infer the semantics of tables and represent them as RDF Linked Data. TABEL captures the intended meaning of a table by mapping header cells to classes, data cell values to existing entities and pairs of columns to relations from a given ontology and knowledge base. The core of the framework consists of a module that represents a table as a graphical model to jointly infer the semantics of headers, data cells and relations between headers. We also introduce a novel Semantic Message Passing scheme, which incorporates semantics into message passing, to perform joint inference over the probabilistic graphical model. We also develop and explore a “human-in-the-loop” paradigm, presenting plausible models of user interaction with our framework and its impact on the quality of inferred semantics.
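
To give a feel for the kind of output described here, the following Python sketch turns an already-interpreted table into RDF triples. It does not implement TABEL's inference or its message passing scheme. It assumes the mappings (header to class, cell to entity, column pair to relation) have already been produced, and every IRI and function name in it is hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def table_to_triples(column_classes, column_relations, rows):
    """Emit triples for a table whose cells are already linked to entities.

    column_classes:   {column index: class IRI}
    column_relations: {(subject column, object column): property IRI}
    rows:             list of rows, each a list of entity IRIs
    """
    triples = []
    for row in rows:
        for column, class_iri in column_classes.items():
            triples.append((row[column], RDF_TYPE, class_iri))
        for (subject_col, object_col), property_iri in column_relations.items():
            triples.append((row[subject_col], property_iri, row[object_col]))
    return triples


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical interpretation of a one-row, two-column table (city, state).
triples = table_to_triples(
    column_classes={0: "http://example.org/City", 1: "http://example.org/State"},
    column_relations={(0, 1): "http://example.org/locatedIn"},
    rows=[["http://example.org/Baltimore", "http://example.org/Maryland"]],
)
print(to_ntriples(triples))
```

The hard part of the dissertation is choosing these mappings jointly. The sketch only shows what the result looks like once they are known.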

We present techniques that are both extensible and domain agnostic. Our framework supports the addition of preprocessing modules without affecting existing ones, making TABEL extensible. It also allows background knowledge bases to be adapted and changed based on the domains of the tables, thus making it domain independent. We demonstrate the extensibility and domain independence of our techniques by developing an application of TABEL in the healthcare domain. We develop a proof of concept for an application to generate meta-analysis reports automatically, which is built on top of the semantics inferred from tables found in medical literature.

A thorough evaluation with experiments over datasets of tables from the Web and medical research reports shows promising results.

Committee: Drs. Tim Finin (chair), Tim Oates, Anupam Joshi, Yun Peng, Indrajit Bhattacharya (IBM Research) and L. V. Subramaniam (IBM Research)