W3C Graph Data Workshop Trip Report

This week, March 4-6, 2019, was the W3C Graph Data Workshop – Creating Bridges: RDF, Property Graph and SQL.

When I come to meetings/workshops like this, I always ask myself what success looks like: “IF X THEN this will have been a successful meeting”. So, I told myself:

 

IF there is a consensus within the community that we need to standardize mappings between Property Graphs and RDF Graphs THEN this will have been a successful meeting.  

 

I can report that, per my definition, this was a successful meeting! It actually surpassed my expectations.

 

To keep track of the main outcomes of each talk/session I attend, I follow a simple technique: immediately summarize the takeaways in a crisp and succinct manner (and if I can’t, that means I didn’t understand). What better way of doing that than in a tweet (or two or three)? Therefore the majority of this trip report is pointers to my tweets 🙂. In a nutshell, the tl;dr:
– There is a unified and vibrant graph community.
– A W3C Business Group will be formed and serve as a liaison between different interested parties.
– There is a push for RDF*/SPARQL* to be a W3C Member submission.
– There is interest to standardize a Property Graph data model with a schema.
– There is interest to standardize mappings between Property Graphs and RDF.

 

Kudos to Dave Raggett and Alastair Green for chairing this event. The organization was fantastic. Additionally, the website has all the position papers, lightning talk slides and minutes in Google Docs for every single session. Please go there to get all the detailed information directly from the source.

Brad Bebee’s Keynote

The workshop started with a keynote by Brad Bebee from Amazon Neptune. The main takeaway of his talk was:

 

We all know the common use cases for graphs: social networks, recommendations, fraud detection, life science, and network & IT operations. Beyond those, Brad said something that resonated strongly with me, specifically w.r.t. Knowledge Graphs (paraphrasing):

 

“Use graphs to link information together to transform the business. Link things that were never connected before. This is really exciting.”

 

Some other important takeaways

Coexistence or Competition

After discussions about how standardization works within W3C and ISO, there was a mini panel session on “Coexistence or Competition” with Olaf Hartig, Alastair Green and Peter Eisentraut. The takeaways:

Lightning talks

The day ended with over 25 lightning talks. The moderators were excellent time keepers. The two main themes that caught my attention were the following:

Many independent bridges are already being formed: many approaches were presented that build bridges between Property Graphs, RDF Graphs and SQL. A few of the lightning talks:

 

However, as Olaf Hartig alluded to, we should not focus on creating ad-hoc implementations of bridges. We need to clearly understand what such a bridge means (i.e., what the semantics are!). Olaf’s RDF*/SPARQL* proposal to annotate statements in RDF, which can serve as a bridge between Property Graphs and RDF, has been very well received in the community. As a matter of fact, this approach has already been implemented in commercial systems such as Cambridge Semantics and Blazegraph.
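To make the idea concrete, here is one way I picture the core of statement annotation (my own sketch in plain Python tuples, not RDF* syntax): an RDF triple can itself be the subject of another triple, which is exactly where property-graph-style edge properties can live.

```python
# A sketch of RDF*'s core idea using plain Python tuples (not RDF* syntax):
# a triple can itself be the subject of another triple, which lets
# property-graph edge properties be carried over into RDF.

# An ordinary RDF triple: alice knows bob
base_triple = ("alice", "knows", "bob")

# The triple itself is annotated, e.g. with an edge property "since":
annotation = (base_triple, "since", 2010)

# A tiny "graph" mixing an ordinary and an annotated statement:
graph = [base_triple, annotation]

def subjects(g):
    """Collect all subjects, whether plain resources or embedded triples."""
    return [s for (s, _, _) in g]

# The subject of the annotation is the embedded triple itself:
assert subjects(graph)[1] == ("alice", "knows", "bob")
```

The point of the exercise is the semantics question Olaf raises: once a statement can be a subject, you have to decide what that nesting *means* before arguing about surface syntax.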

 

Personally, I avoid (and actually stop) discussions on syntax. In my opinion, that should not be the first topic of discussion. We first need to agree on the meaning.
Note: I think there may be interesting science in here.

 

GraphQL is popular: I was surprised to see GraphQL being a constant topic of discussion. It was presented as the global layer over heterogeneous data sources (i.e., OBDA), as an interface to RDF graphs, and also as a schema language for Property Graphs. You could hear a lot of GraphQL discussions in the hallway.
Note: I think this is engineering. Not clear if there is science here.

 

The second day consisted of three simultaneous tracks: Interoperation, Problems & Opportunities, and Standards Evolution, for a total of 12 sessions. By coincidence (?), all the sessions I was interested in were in the Interoperation track.

Graph Data Interchange

Graph Query Interoperation

Specifying a Standard

I find it very cool that Filip Murlak and colleagues defined a formal, readable, and executable semantics of Cypher in Prolog, based on the formal semantics defined by the folks from the University of Edinburgh. This reminds me of when I took a course with J.C. Browne on Verification and Validation of Software Systems and learned about Tony Hoare’s and Jay Misra’s Verification Grand Challenge.

Finally, Andy Seaborne made a very important point:

Graph Schema

I was glad to have the opportunity to moderate this session because this is a topic very dear to me (hello Gra.fo!): I am chairing an informal Property Graph Schema Working Group (PGSWG).

 

Inspired by our work in G-CORE, which was a very nice mix of industry and academia members and which influenced the GQL manifesto that led to the GQL standardization effort, I was asked to chair this informal working group. I was able to share what we have accomplished up to now.

George Fletcher provided a quick overview of the lessons learned in the academic survey. He condensed them into: 1) start small, 2) start from foundations, and 3) start with flexibility in mind. Oskar van Rest presented an overview of what existing industry graph databases support; this is still work in progress. I presented the use case and requirements document, which is the starting point to drive the discussions towards features that address concrete use cases. Olaf presented how GraphQL could be a schema language for Property Graphs; in other words, there is a syntax that could be reused. This sparked the discussion of syntax, syntax, syntax. As I previously mentioned, I avoid discussions that jump immediately into syntax because we should first focus on the understanding/semantics.

 

The top desirable feature was … KEEP IT SIMPLE! Other top features were: enable future extensibility, allow for permissive vs. restrictive, allow for open world vs. closed world, have a simple clean formalization, and again… keep it simple (don’t make the mistakes of XML Schema). Josh Shinavier remotely mentioned: “Historically, property graphs were somewhat of a reaction to the complexity of RDF. A complex standard will not be accepted by the developer community.”

 

To summarize our 1.5-hour discussion:

Finally, 1.5 hours is not enough to discuss graph schemas, so a group of us stayed the next day and kept working on it.

Up to now, the PGSWG has been informal. There was a consensus that it should gain some sort of formality by becoming a task force within the Linked Data Benchmark Council (LDBC). More info soon!

What are the next steps?

The goal of the third and final day was to first offer a summary of each session and then to discuss the concrete next steps.
My concrete proposal for next steps:

I also proposed standardizing mappings from relational databases to Property Graphs, and I was happy to learn that this work is already underway within ISO.

 

Following the building-bridges analogy, we need aligned piers in order to know how to build the bridge. RDF is standardized and formalized; Property Graphs are not. Therefore the first task is to lift the Property Graph pier so it can be aligned with RDF. Subsequently, we will be in a position to start addressing interoperability needs between Property Graphs and RDF Graphs by means of establishing direct and customizable mappings.
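As a thought experiment (my own sketch, not any proposed standard), a "direct" mapping from a tiny property graph to triples could look like the following. Node and edge labels/properties become triples, and edge properties annotate the edge triple itself, in the spirit of RDF*:

```python
# A hypothetical sketch (not a proposed standard) of a direct mapping
# from a tiny property graph to RDF-like triples, represented as tuples.
# Node properties become plain triples; edge properties annotate the
# edge triple itself, RDF*-style.

nodes = {
    "n1": {"label": "Person", "name": "Alice"},
    "n2": {"label": "Person", "name": "Bob"},
}
edges = [
    ("n1", "knows", "n2", {"since": 2010}),  # (src, edge label, dst, props)
]

def pg_to_triples(nodes, edges):
    triples = []
    for node_id, props in nodes.items():
        for key, value in props.items():
            triples.append((node_id, key, value))
    for src, label, dst, props in edges:
        edge = (src, label, dst)
        triples.append(edge)
        # The edge triple is the subject of its own properties:
        for key, value in props.items():
            triples.append((edge, key, value))
    return triples

triples = pg_to_triples(nodes, edges)
assert ("n1", "name", "Alice") in triples
assert (("n1", "knows", "n2"), "since", 2010) in triples
```

Even this toy version surfaces the semantic questions a standard mapping has to answer: how node IDs become IRIs, where labels go, and what an annotated edge statement means.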

 

Furthermore, given the commercial uptake of and interest in RDF*/SPARQL*, this will drive discussions towards a new version of RDF in the very near future.

 

The official outcome (I believe) is that a W3C Business Group will be created in order to coordinate with all the interested parties, existing W3C community groups and be a liaison with ISO (where the GQL and SQL/PG work is going on). An official report will come soon.

Lack of Diversity

We have a vibrant graph community. However, this community lacks diversity, as was observed on Twitter by Margaret, who wasn’t even at the event:

There were definitely over 100 people attending this meeting, and 96+ were men. I believe there were only 4 women attending (thanks to Christophe for the clarification). I had the chance to meet with them.

 

– Dörthe Arndt: A moderator for the Rules and Reasoning session and a researcher at Ghent University who believes that rules should be part of data. Unfortunately I did not have the opportunity to speak more with Dörthe.

– Marlène Hildebrand: This was the first time I met Marlène. She is at EPFL working on data integration using RDF, so we discussed a lot about converting different sources to RDF, mappings, and methodologies for creating ontologies and mappings.

– Petra Selmer: A member of the Query Languages Standards and Research Group at Neo4j with vast experience in graph databases.

– Monika Solanki: Well known in the Semantic Web community and always a pleasure to interact with at conferences.

– Natasa Varytimou: It was great to finally meet Natasa in real life after interacting a lot via email. She is an Information Architect at Refinitiv (the finance company of Thomson Reuters) and is one of the brains behind the large-scale Refinitiv Knowledge Graph.
The lack of diversity worries me and I strongly urge that we, as a community, take action on this matter.

Final quick notes

– We seem to be converging into a unified graph community! Not individual RDF and PG communities. I didn’t hear any RDF vs PG conversations.
– However, Gremlin was underrepresented. If it weren’t for Josh Shinavier, who was constantly providing his input remotely, we would have missed valuable input.
– Thank you Josh and Uber for offering a virtual connection. I believe everything has been recorded and you can find the details in the minutes.
– BMW is starting to get on board the Knowledge Graph bandwagon. After chatting with Daniel Alvarez, it seems that they are still in an early innovator phase. Nevertheless, very exciting.
– It was a great idea to have a two-day event spread across three days. That way you could technically arrive on the first day of the event and leave on the last day.
– The W3C RDB2RDF Standard editors meet again! I was one of the editors of the Direct Mapping, while Richard Cyganiak was one of the editors of R2RML.

– Adrian Gschwend has his summary in a twitter thread:

– Gautier Poupeau has his summary in a twitter thread in French:

– Find a lot more tweets by searching for the #W3CGraphWorkshop hashtag.

International Semantic Web Conference (ISWC) 2018 Trip Report

ISWC has been my go-to conference every year. This time it was very special for two reasons. First of all, it was my 10-year anniversary of attending ISWC (the first one was ISWC 2008 in Karlsruhe, where I presented a poster that ultimately became the basis of my PhD research and also the foundational software of Capsenta). Too bad my dear friend and partner in crime, Olaf Hartig, missed out (but for good reasons!). I only missed ISWC 2010 in Shanghai; other than that, I’ve attended each one and I plan to continue attending them (New Zealand next year!).

The other reason why this was a special ISWC is because we officially launched Gra.fo, a visual, collaborative, real-time ontology and knowledge graph schema editor, which we have been working on for over 2 years in stealth mode.


THE Workshop: This year at ISWC, I co-organized THE Workshop on Open Problems and Emerging New Topics in Semantic Web Technology. The goal was to organize a true workshop where attendees would actually discuss and get work done.

Let’s say we may have been a bit ambitious, but in the end it turned out very well. In the first part of the morning, everybody was encouraged to stand up and talk for a minute, on the spot, about their problem. We gathered 19 topics. For the rest of the morning, we self-organized into clusters and each group continued the discussion, finishing with a wrap-up.

The goal was to submit the problems to THE Workshop website. Looks like the attendees have not done their homework (you know who you are!). We had great feedback about this format and we will consider submitting it again for next year and improve the format.


VOILA: I’ve been attending the Visualization and Interaction for Ontologies and Linked Data (VOILA) Workshop for the past couple of years (guess why 🙂 ) and luckily I was able to catch the last part of it. My takeaway is that there are a lot of cool things going on in this area, but the research problems being addressed are not always clear. Furthermore, prototypes are engineered and evaluated, but it’s not clear who each tool is for. Who is your user? I brought this up in my trip report from last year. This community MUST partner with researchers in HCI and Social Science in order to harden the scientific rigor. Additionally, there are cool ideas where it would be interesting to see if there is commercial viability.


SHACL: I attended the Validating RDF Data tutorial by Jose Emilio Labra Gayo. I came in trying to find an answer to the following question: is SHACL ready for industry prime time? The answer is complicated but, unfortunately, I have to say: not yet. First of all, even though SHACL is the W3C recommendation, there is another proposal called ShEx from Jose Emilio’s group. He acknowledges his bias, but if you look at ShEx and SHACL side by side, you can argue for one or the other objectively. For example, ShEx supports recursive constraints but SHACL doesn’t (there was a research paper on this topic, Semantics and Validation of Recursive SHACL, … but it’s research!). Nevertheless, the current SHACL specification is stable and technically ready to be used in prime time. The problem is the lack of commercial tools for enterprise data. Jose Emilio is keeping a list of SHACL/ShEx implementations, but all, except for TopQuadrant’s, are (academic) prototypes. It seems Stardog is planning to officially support it in their 6.0 release. At this stage, I was expecting to see a standalone SHACL validator that can take as input RDF data or a SPARQL endpoint and run the validations. With all due respect, these kinds of situations are embarrassing for this community and industry: apparently a standard is needed, a recommendation is made, but at the end there is no industry implementation and uptake (one or two is not enough). We live in a “build it and they will come” world and this does not make us look good. </rant>. On a positive note, I think we are very close to the following: create a SHACL-to-SPARQL translator that starts out by supporting a simple profile of SHACL (e.g., cardinality constraints). This way anybody could use it on any RDF graph database. Somebody should build this, and we should support it as a community, not just academics but also with industry users behind it.
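To give a feel for why the cardinality profile is a tractable starting point, here is a toy sketch (my own simplification, with hypothetical example IRIs, not conformant SHACL semantics) of turning a minimum-cardinality constraint into a SPARQL query that finds the violating focus nodes:

```python
# A toy sketch of the SHACL-to-SPARQL idea for cardinality constraints
# (my own simplification, not conformant SHACL semantics): translate a
# minimum-count constraint on a property into a SPARQL 1.1 query that
# returns the focus nodes violating it.

def min_count_violations_query(target_class, path, min_count):
    """Build a SPARQL query returning instances of target_class that
    have fewer than min_count values for the property path."""
    return f"""
SELECT ?node (COUNT(?value) AS ?n)
WHERE {{
  ?node a <{target_class}> .
  OPTIONAL {{ ?node <{path}> ?value . }}
}}
GROUP BY ?node
HAVING (COUNT(?value) < {min_count})
""".strip()

# Hypothetical IRIs for illustration: every Person must have >= 1 name.
query = min_count_violations_query(
    "http://example.org/Person",
    "http://example.org/name",
    1,
)
assert "HAVING (COUNT(?value) < 1)" in query
```

Since the output is plain SPARQL 1.1, a translator built this way would run against any standard endpoint, which is exactly the appeal of the approach.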

Hat tip to Jose Emilio for the nice SHACL/ShEx Playground and to Eric, Iovka and Dimitris for making their book, Validating RDF, available for free (HTML version).


SOLID: I missed out on the Decentralizing the Semantic Web workshop. I heard it was packed, and I guess it did help that Tim Berners-Lee was there presenting on Solid. Later on, I had the chance to talk to TimBL about Solid and his new startup Inrupt. The way I understood Solid and what Inrupt is doing is through the following analogy: they have designed a brand new phone and the app store infrastructure around it (i.e., Solid). However, people already have phones (web apps that store your data), so they need to convince others to use their phone. Who would they convince, and how? Ideally, they want to convince everybody on earth… literally, but they can start out with people who are concerned about data ownership and privacy. My skepticism is that the majority of the people in the world don’t care about it. Jennifer Golbeck’s keynote touched on this topic and stated that young people don’t care about privacy, but the older you get, the more you start caring. Solid is definitely solving a problem, but I question the size of the market (i.e., who cares about this problem). Good luck, Inrupt team!

Enterprise Knowledge Graphs: One of the highlights of ISWC was the Enterprise Knowledge Graph panel. This was actually a great panel (commonly I find that panels are very boring). The participants were from Microsoft, Facebook, Ebay, Google and IBM. I had two main takeaways.
1) For all of these large companies, the biggest challenge is identity resolution. Decades of Record Linkage/Entity Resolution/etc. research and we are still far away from solving this problem… at scale. Context is the main issue.
2) The most important takeaway from the entire conference was: NONE OF THESE COMPANIES USE RDF/OWL/SPARQL… AND IT DOESN’T MATTER! I was actually very happy to hear them say this in front of the entire semantic web academic community. At the end, the ideas of linking data, using triples, having tight/loose schemas, reasoning, all at scale have come out of the semantic web research community and started to permeate into the industry. It’s fine if they are not using the exact W3C Semantic Web Standards. The important thing is that the ideas are being passed on to the real world. It’s time to listen to the real world and see what problems they have and bring it back for research. This is part of the scientific method!
Notes from each panelist:

Another possible answer to Yolanda Gil’s question is the recently launched dataCommons.org.
The final question to the panel: what are the challenges that the scientific community should be working on. Their answers:


Not everybody is a Google: The challenges stated by the Enterprise Knowledge Graph panelists are for the Googles of the world. Not everybody is a Google. For a while now, I have felt that a large research focus is on tackling problems for the Googles of the world. But what about the other end of the spectrum? My company Capsenta is building knowledge graphs for very large companies, and I can tell you that building a beautiful, clean knowledge graph from even a single structured data source, let alone a dozen, is not easy. I believe that the semantic web, and even the database community, have forgotten about this problem and dismissed it as a day-to-day engineering challenge. The talk “Integrating Semantic Web in the Real World: A Journey between Two Cities” that I have been giving this year details all the open engineering, scientific and social challenges we are encountering. One of those problems is defining mappings from source to target schemas. Even though the Ontology Matching workshop and the Ontology Alignment Evaluation Initiative have been going on for over a decade… the research results and systems do not address the real-world problems that we see at Capsenta in our day to day. We need to research the real-world socio-technical phenomena of data integration. One example is dealing with complex mappings. I was very excited to see the work of Wright State University, “A Complex Alignment Benchmark: GeoLink Dataset”, which was nominated for best resource paper. This is off to a good start, but there is still a lot of work to be done. Definitely a couple of PhDs can come out of this.


Natasha Noy’s keynote:  I really enjoyed her keynote, which I summarized: 

She also provided some insight on Google Dataset search:


Vanessa Evers’ keynote was incredibly refreshing because it successfully brought to the attention of the semantic web community the problems encountered in creating socially intelligent robots. Guess what’s missing? Semantics and reasoning!


Industry:  I was happily surprised to see a lot of industry folks this year. The session I chaired had about 100 people.

Throughout the week I saw and met with startups like Diffbot and Kobai; folks from Finance: FINRA, Moodys, Federal Reserve, Intuit, Bloomberg, Thomson Reuters/Refinitiv, Credit Suisse; Graph Databases companies: Amazon Neptune, Allegrograph, Marklogic, Ontotext’s GraphDB, Stardog; Healthcare: Montefiore Health Systems, Babylon Health, Numedii; the big companies: Google, Microsoft, IBM, Facebook, Ebay; and many others such as Pinterest, Springer, Elsevier, Expert Systems, Electronic Arts. Great to see so much industry attending ISWC! All the Industry papers are available online.

Best Papers: The best papers highlighted the theme of the conference: Knowledge Graphs and Real World relevance. The best paper went to an approach to provide explanations of facts in a Knowledge Graph.

The best student research paper was a theoretical paper on canonicalisation of monotone SPARQL queries, which has a clear real world usage: improve caching for SPARQL endpoints.

The best resource paper addressed the problem of creating a gold-standard dataset for data linking, a crucial task for creating Knowledge Graphs at scale. They presented an open-source software framework to build Games with a Purpose in order to help create gold-standard data by motivating users through fun incentives.

The best in-use paper went to the paper that describes the usage of semantic technology underpinning Wikidata, the Wikipedia Knowledge Graph.

Finally, the best poster went to VoCaLS: Describing Streams on the Web and the Best demo award went to WebVOWL Editor.


DL: Seems like DL this year meant Deep Learning and not Description Logic. I don’t think there was any paper on Description Logic, a big switch from past years.


Students and Mentoring:  I enjoyed hanging out with PhD students and offering advice at the career panel during the Doctoral Consortium and at the mentoring lunch.

During the lunch on Wednesday we talked about science being a social process and it was very nice that this also came up on Thursday during Natasha’s keynote


Striving for Gender Equality: I am extremely proud of the semantic web research community because it is an example of always striving for gender equality. This year there was a powerful statement: the conference was organized entirely by women (plus Denny and Rafael), and there were three amazing women keynote speakers. Additionally, the local organizers did a tremendous job!

Furthermore, Ada Lovelace Day, which is held every year on the second Tuesday of October, occurred during ISWC. So what did the organizers do? They held the Ada Lovelace celebration where we had a fantastic panel discussing efforts on striving for gender equality in the sciences (check out sciencestories.io!)

The event ended with a Wikipedia Edit-a-thon where we created and edited Wikipedia pages of female scientists. In particular, we created Wikipedia pages for female scientists in our community: Natasha Noy, Yolanda Gil, Lora Aroyo. It was a true honor to have the opportunity to create the English Wikipedia page of Asunción Gómez Pérez, who has been incredibly influential in my life.

More trip reports: Check out Helena Deus’ and Paul Groth’s ISWC trip reports (which I haven’t read so they wouldn’t bias mine).

What an awesome research community: I am very lucky to consider the Semantic Web community my research home. It’s a work hard, play hard community.

We were at a very beautiful venue:

We like to sing

We like to have great dinners and dance:

We even throw jam sessions and parties:

And just like last year, I recorded the Jam session:

https://www.facebook.com/juansequeda/videos/10112285265566200/
https://www.facebook.com/juansequeda/videos/10112285339752530/
https://www.facebook.com/juansequeda/videos/10112285428858960/

See you next year in New Zealand

… and then in 2020 … Athens, Greece!

Trip Report on Knowledge Graph Dagstuhl Seminar

It’s always an honor to be invited to a Dagstuhl Seminar (this was my third!). I was extremely lucky to have been invited to a seminar on Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web. The goal of Dagstuhl seminars is to get a group of really smart people together, drink beer and wine, and discuss the future of a particular research area.

I went to this seminar with the following agenda: 1) share my new research interest on studying social-technical phenomena of data integration and see what others think, 2) learn what others think are the hard scientific challenges with Knowledge Graphs and most importantly 3) understand what is the big AHA moment with Knowledge Graphs. After spending the last few days processing the entire week, I feel that I achieved all the goals of my agenda.

Given that I wear two hats, industry and academia, let me summarize my takeaways w.r.t. these two hats.

Industry Hat

What is a Knowledge Graph?: There was no consensus from the beginning. Claudio Gutierrez presented historical definitions of knowledge graphs. I had no idea that the term “Knowledge Graph” shows up in the PhD dissertation of René Ronald Bakker… in 1987!

Knowledge Graph Prehistory
Knowledge Graphs 20 years ago

Throughout the week, there was a philosophical and political discussion about the definition. Some academics wanted to come up with a precise definition; others wanted it to be loose. Academics will inevitably come up with different (conflicting) definitions (they already have), but given that industry uptake of Knowledge Graphs is already taking place, we shouldn’t do anything that hinders that uptake. For example, we don’t want people searching for “Knowledge Graphs” and finding a bunch of papers, problems, etc. instead of real-world solutions (this is what happened with the Semantic Web). A definition should be open and inclusive. A definition I liked, from Aidan Hogan and Antoine Zimmermann, was “a graph of data with the intention to encode knowledge”.

FAIR Data and Best Practices: I believe that the notion of Findable, Accessible, Interoperable, Reusable (FAIR) data is brilliant! It doesn’t apply just to life science or research data; every enterprise should strive to have FAIR data. Within life science, Barend Mons is pushing for the best practice for implementing FAIR data to be creating Knowledge Graphs in RDF. I agree, because you want to share and reuse vocabularies and have unique identifiers, and this is the core of RDF. However, we are missing a set of best practices, not just on how to create FAIR data but on how to create, manage, and maintain Knowledge Graphs in general.

Wikidata: I enjoyed chatting with Lydia Pintscher and learning how Wikidata really works. It is an interesting mix of tech and community. For example, users can create classes and instances but cannot create properties. If they believe they need to do it, it has to go through a community process.

Knowledge Graphs in the real world: We have all been hearing about the popular ones gaining press recently (Google, Amazon, Uber, Airbnb, Thomson Reuters), but I started to hear about others that I wasn’t aware of, such as Siemens, Elsevier, Zalando, ING, etc. We need to start compiling a list of them. It was also great to hear from Dezhao Song how the Thomson Reuters Knowledge Graph was created (spoiler alert: it’s a typical NLP pipeline). For more info, check out their paper “Building and Querying an Enterprise Knowledge Graph”.

Research Hat

The first two days were spent discussing the interests we had in common and the challenges. I believe there were a total of 15 discussions going on. The topics included NLP, Graph Analytics, ML, the Decentralized Web, Reasoning and Semantics, Constrained Access, DBpedia/Wikidata, Human and Social Factors, Data Integration, Evolution, What is a KG, Best Practices and more. There will be an official report on all these discussions, so stay tuned!

My AHA Moment: Tuesday evening at the wine cellar we had a late (late) night discussion, and some of us believed that the discussions up to that point were definitely interesting but could be considered the natural, incremental next steps. There was a lack of boldness and thinking outside of the box. What are the true grand challenges? Our late-night discussion helped drive the atmosphere on Wednesday towards focusing on being bold.

I was extremely lucky to participate in a group discussion on Wednesday with Piero Bonatti, Frank van Harmelen, Valentina Presutti, Heiko Paulheim, Maria Esther Vidal, Sarven Capadisli, Roberto Navigli and Gerald de Melo. We started by asking ourselves a philosophical question: what is the object of our studies? This question sparked fascinating discussions. I felt like a philosopher because we were questioning what we, as scientists, were actually researching. Which natural phenomena are we making observations about? What object are we devising theories about? I’m happy to say that we did come to a conclusion.

In my own words: Knowledge Representation & Reasoning and Knowledge Engineering are fields that study the object of knowledge. Data management is a field that studies the object of data. Each of these fields has independently made advances in understanding its object and how it scales (where scale means the typical Vs: volume, variety, etc.). Furthermore, efforts to study the relationship between these objects can be traced back to the 1960s. However, what we observe now is a new object: knowledge and data at scale. I would like to be bold and state that studying the phenomena of knowledge and data at scale is its own field in Computer Science. Knowledge Graphs are a manifestation of this phenomenon.

This was my AHA moment. After we shared this with the rest of the group, it sparked a lot of discussion, including the political and philosophical aspects of defining a Knowledge Graph.

Human and Social Factors: Succinctly, my current research interest is the socio-technical phenomena that occur in data integration. With my industry hat on, I get to work with large enterprises on data integration. Throughout this work I observe what we do through my research lenses and see situations (phenomena) that I do not know how to deal with. For example, creating mappings is not a problem that can be addressed easily with a technical/software solution. Schemas are extremely large and complex, and you need specific knowledge (i.e., legal, business) that only specific users have, and they may not even agree. I had many interesting discussions with Valentina Presutti, Paul Groth, Marta Sabou and Elena Simperl, and I’m glad to realize that my research interest has merit and is shared with others. There is a lot of work to be done, especially because we, as computer scientists, need to interact with other communities such as cognitive science, HCI, etc. I’m very excited about this topic because it gets me out of my comfort zone.

Multilinguality: I have never followed this topic, so it was great to learn about it. We were lucky to have Roberto Navigli as part of the crew. I understood the true challenge of multilinguality in knowledge graphs when Roberto talked about the cultural aspects. For example, how do you define Ikigai in a Knowledge Graph and how do you link it to other entities?

Evolution: How do you represent the evolution of a Knowledge Graph? How do you reason over these evolutions?

I’m very grateful to be part of this research community and I look forward to all the outcomes. Exciting times!

Some random stuff

I wrote this report without looking at Eva Blomqvist ’s and Paul Groth’s trip reports. I didn’t want to bias mine. Now I can read them.

A comment from Frank: It shouldn’t be called Computer Science; it should be called Computing Science, because it is a science that studies the phenomena of computing (not computers, because those are the tools), the same way astronomers don’t study telescopes but use them as tools.

I spent an extra day at Dagstuhl. It was nice to relax and reflect on the week. Additionally, I spent almost 4 hours in the library and found a lot of gems: first volumes of JACM and a lot of cool books signed by their authors, including Edsger W. Dijkstra. I also found the book of my PhD dissertation in the library.

I believe there was a consensus that the best wine at Dagstuhl was Château de Caraguilhes Prestige Corbières 2015.

This is a community with amazing scientists but also amazing musicians! We were lucky that Dagstuhl has a music room with a grand piano and guitars. Can’t wait for the ISWC Jam Session!

Trip Report: 1st U.S. Semantic Technologies Symposium (#US2TS)

I attended the 1st U.S. Semantic Technologies Symposium (#US2TS), hosted by Wright State University in Dayton, Ohio on March 1-2, 2018. The goal of this meeting was to bring together the U.S. community interested in Semantic Technologies. I was extremely happy to see 120 people get together in Dayton, Ohio to discuss semantics for 2 days. I’m glad to see such a vibrant community in the U.S. … and not just academics. Actually, I would say that academics were the minority. I saw a lot of familiar faces and met a lot of people from different areas.

The program was organized around the following topics: Cross Cutting Technologies, Publishing and Retrieving, Space and Time, and Life Sciences. Each topic had a set of panelists, and each panelist gave a 10 minute talk. There was plenty of time for discussion and a breakout session. It was very lively. The program can be found here: http://us2ts.org/posts/program/

I gave a 10 min version of my talk “Integrating Relational Databases with the Semantic Web: a journey between two cities“. The takeaway message: in order to use semantic technologies to address the data challenges of business intelligence and data integration, we need to fulfill the role of the Knowledge Engineer and empower them with new tools and methodologies. Looks like I did a good job and it was well received 😃

Two main topics: Ontologies and Tools

Complexity and usability of ontologies were topics throughout the two days. The hallway talk is that light semantics is enough (I was happily surprised to hear this). However, the Life Science and Spatial domains need heavyweight semantics (more below). CIDOC-CRM is the ontology used in the museum domain. Apparently it is very complicated: a lot of people don’t like it, but they have to use it.

Linked Open USABLE Data (LOUD): We need to find a balance between usability and complexity.

I was part of a breakout session on ontologies and reuse. I really appreciated Peter Fox’s comment on ontologies (paraphrasing): there are three aspects that we need to take into account: 1) expressivity, 2) maintainability and 3) evolvability.

I shared our pay-as-you-go methodology to create ontologies and mappings in a poster and in hallway discussions. It was well received.

Tools Tools TOOLS: we need better tools. That was another theme of the meeting. There seemed to be agreement with my claim that the existing tools are made for the semantic city.

JSON-LD came up a lot. People love it.

Application Areas of Semantics

As expected, Life science was present at this meeting. Melissa Haendel from Oregon Health & Science University showed some really cool results that were possible thanks to semantics. Chris Mungall from Lawrence Berkeley National Laboratory gave an overview of the Gene Ontology.

Really interesting to learn that data in the geography domain (spatial data) is complex and requires heavyweight semantics, just like in life science.

Interesting observations about humanities data. I see the need for semantics.

I need to check out perio.do: “A gazetteer of period definitions for linking and visualizing data“. One of the project leads is a fellow longhorn, Prof. Adam Rabinowitz. I want to meet him!

Meeting people

Great chatting with Varish Mulwad from GE Research and learning about all the semantic work that is going on there. Need to check out SemTK (Semantics Toolkit) and these papers:

SemTK: An Ontology-first, Open Source Semantic Toolkit for Managing and Querying Knowledge Graphs

Integrated access to big data polystores through a knowledge-driven framework

I enjoyed meeting Alessandro Oltramari and learning about the semantic work going on at Bosch.

Great to finally meet Vinh Nguyen. Her PhD was on Contextualized Knowledge Graphs (I should take a look at her PhD dissertation) and she is now organizing an ISWC 2018 workshop on this topic.

Happy Birthday Craig Knoblock!! He gave a fantastic keynote on his birthday!

Glad to have bumped into Ora Lassila. It’s been a long time!!

Future research directions

Takeaways from the Meeting

This is an event that was missing in the U.S. I’m glad that it was organized (Fantastic job Pascal and Krzysztof!). Looking forward to this event next year!

My Most Memorable Event of 2017

I traveled a lot in 2017, the most I have ever traveled. I flew 163,195 miles, which is equivalent to 6.6x around the Earth. I was on 114 flights. I spent almost 400 hours (~16 days) on a plane. I visited 17 countries: Austria, Argentina, Canada, Chile, Colombia, France, Germany, Greece, Guatemala, India, Mexico, Netherlands, Portugal, Spain, Switzerland, UK, Uruguay. I was in Austin (my home) for 164 days, in Europe for 61 days, in Colombia for 28 days and in India for 16 days. I slept 30 nights at a Marriott, 27 nights at an Airbnb and 13 nights on a plane.

Given all this travel, I asked myself: what was my most memorable event of 2017?

The answer was simple: dinner at the Royal Society of London with Bob Kowalski and Keith Clark.

In July, I gave a lecture at the 2017 Reasoning Web Summer School and attended the RuleML+RR 2017 Conference. The conference dinner was at the Royal Society of London. Bob Kowalski gave the dinner speech, titled “Logic and AI – The Last 50 Years”. It was the 50th anniversary of the start of his PhD, which gave rise to logic programming. Additionally, by pure coincidence, I sat next to Keith Clark. The combination of sitting next to Keith Clark and listening to Bob Kowalski’s speech is what made this my most memorable event of 2017.

Why?

Early during my PhD, my advisor, Dan Miranker, encouraged me to read about the Japanese 5th Generation Computer Systems project (if you don’t know what this is, go look it up NOW!). During my research, in order to trace back the relationship between Logic and Data, I encountered the landmark 1977 Workshop on Logic and Data Bases, organized by Herve Gallaire, Jack Minker and Jean-Marie Nicolas. That workshop is where Ray Reiter presented his paper on the Closed World Assumption, Bob Kowalski presented his paper on Logic for Data Description and Keith Clark presented his paper on Negation as Failure. I even have a copy of the proceedings:

 

Every time I give a talk on this topic, I reference that 1977 workshop to provide historical context of where we are today. See slide 4:

[slideshare id=83502645&doc=integratingsemanticwebintherealworld-ajourneybetweentwocities-kcap2017keynote-171206180601&slide=4]

Bob concluded with two open questions:

1) What is the relationship between declarative and imperative representation of knowledge?
2) What is the relationship between different types of rules?

As you can imagine, sitting next to Keith Clark, listening to Bob Kowalski’s talk and having the opportunity to chat with them is what made this a truly amazing evening.

With Bob Kowalski
With Keith Clark

What an evening! An evening I will never forget! Thank you Bob and Keith!

Oh, I even saw Alan Turing’s Certificate of a Candidate for Election into the Royal Society.

 

Mind blown!

A Weekend in Antigua, Guatemala

In May 2017, Escape ATX shared a deal for Austin to Guatemala for $300! I immediately jumped on it. Last weekend I visited Guatemala, specifically Antigua. This small town used to be the capital of the Kingdom of Guatemala (which included most of Central America) in the 1700s and is now a UNESCO World Heritage Site. After the peace accords were signed in the mid 90s, Antigua started to boom with a lot of tourists but continued to maintain its small-town appeal.

For me, the best way to summarize Antigua is the following: imagine a typical pueblo in Latin America (in Colombia, think Villa de Leyva or Salento) mixed with the cosmopolitan vibe of Austin. Cobblestone roads, colonial-style housing, a park in the middle of town with the cathedral in front, alongside high-end restaurants, bars with pub food, local craft beer and hole-in-the-wall bars.

I observed three types of foreigners:
1) tourists
2) short term: foreigners who come for months to volunteer or to “figure out what I want to do with life”, and who may end up staying for a year or two
3) resident immigrants: foreigners who have been living in Antigua for many years and are owners of a bar or restaurant

Antigua is a bubble within Guatemala. It is not cheap (same prices as in Austin). But it has a charm, a “no sé qué”, that makes me want to come back. I can see myself going back and working from Antigua for a week or two (who would be interested?)

These are some of the places that I visited which I recommend:

Chermol: Argentinean restaurant. Wide variety of local craft beers

https://www.instagram.com/p/BZ65dlOnc8M/

The Snug: Small Irish pub with live music

https://www.instagram.com/p/BaKC9-Bnkub/

Panza Verde: high end restaurant, romantic ambiance. All the food was delicious.

https://www.instagram.com/p/BZ9GAy6njBP/

Cafe No Se: The famous Cafe No Se. It’s been featured in the NY Times’ “36 Hours in Antigua, Guatemala”. It’s a hole in the wall, mostly full of foreigners. In the back they have a mezcal bar where they only serve Ilegal mezcal and beer. The music is blues/soul, which reminds me of Thursday nights at Barberella in Austin.

https://www.instagram.com/p/BZ8c3YSnRgN/

Caoba Farms: an organic farm with a farmers market every Saturday featuring local cuisine. During my visit they had an Oktoberfest.

https://www.instagram.com/p/BZ9dOiKHxh9/

https://www.instagram.com/p/BZ9dbOlH0VM/

Cantina Royal: cool bar

https://www.instagram.com/p/BZ-qxVBHLCd/

Saberico: Eat in a beautiful garden. Breakfast was delicious.

https://www.instagram.com/p/BaAyxHXHfhg/

Whiskey Den: Whisky, why not? There are a bunch of other bars next to this one.

https://www.instagram.com/p/BaAy1MvHLLQ/

Angie Angie: awesome pizza! Live music. The outside patio is relaxing. On Sundays pizza is 2×1.

https://www.instagram.com/p/BaKDxkmnFzp/