ISWC 2017 Trip Report

The 16th International Semantic Web Conference took place from October 21-25 in Vienna, Austria. These are my random thoughts.

First of all, I’m honored to be part of the Organizing Committee as a chair of the In-Use Track, together with Philippe Cudré-Mauroux. Jeff Heflin was the General Chair and a fantastic leader. The conference was impeccable thanks to the AMAZING local organization. Axel Polleres and Elmar Kiesling did an incredible job. I truly enjoyed every step of the process to help organize ISWC 2017. I am really looking forward to ISWC 2018 in Monterey, CA and ISWC 2019 in Auckland, New Zealand!

I was part of a pre-ISWC meeting, which was a follow-up with the group that attended the Dagstuhl workshop on Federated Semantic Data Management. We continued defining a list of prioritized research topics.

Frank van Harmelen gave a fantastic keynote at the Semantic Science workshop about the end of the scientific paper.

Btw, if you have the chance to see Frank give a talk… it’s a must! He is one of the best speakers in academia that I have ever seen. I wish I could present like him!

I attended most of the VOILA!2017 Workshop. The highlight of the event was the demos, around 20 of them.

* Next version of VOWL is addressing a lot of needs.
* Check out ViziQuer. It looks cool but I’m skeptical about how usable it is.
* Great to see interest in UIs for generating R2RML mappings, but they haven’t been tested yet with real world scenarios. A long way to go here.
* I need to check out the Linked Data Reactor
* Treevis.net, interesting resource. Need to check it out.
* The extensible Semantic Web Browser
* user, user, user: everybody mentions users but usually the “user” is not defined. Who exactly is your user?

Welcome ceremony was in the Vienna Rathaus. Beautiful place. We were so lucky.

During the welcome ceremony, we had a reunion of 5 alumni from the 2008 Summer School on Ontological Engineering and Semantic Web: Laura Drăgan, Tara Raafat, Maria Maleshkova, Anna Lisa Gentile and myself, together with John Domingue and Enrico Motta, who were the organizers. We have come a long way!

Great discussion about the history of Project Halo, funded by Vulcan, with Michael Witbrock, Oscar Corcho and Steffen Staab. Learned a lot of historic details.

Congrats to Mayank Kejriwal for winning the 2017 SWSA Distinguished Dissertation Award! Mayank and I are academic brothers: we both did our PhD at the University of Texas at Austin under the supervision of Prof Daniel Miranker.

Congrats to DBpedia for winning the SWSA Ten-Year Award. Definitely well deserved!

Industry and In use: If I’m not wrong, approximately 25% of attendees of ISWC were from industry and government (more specifically, not from academia). All the industry talks were on Monday. Great to see the room full all of the time. We are definitely seeing more use of semantic technologies. However, my observation is that this is mainly government and research/innovation folks at very large companies. It is not yet replacing the status quo. Additionally, there were a lot of complaints about the lack of maturity of tools, especially open source tools. I’m not surprised.

Ontology engineering seems to be popular (or it never stopped?). Deborah McGuinness’s keynote showed real world projects in health care where ontologies play a central and vital role. Takeaway message: it takes a village.

It seems to me that we have had the following evolution in the past decade: first focus on hard core theoretical ontologies (DL and the like), second focus has been more on the data side (Linked Data), third focus (now) is about “little semantics goes a long way”. Jim Hendler has always been right (see my comments below on Jamie Taylor’s keynote).

Is this the year of the Knowledge graph? Are Knowledge Graphs becoming Mainstream? Thomson Reuters announced (by coincidence?) their Knowledge Graph while ISWC was going on. There was no formal announcement during the conference.

Interesting part is that Thomson Reuters built their own RDF graph database (triplestore). Why? See this tweet:

I presented a poster on the Ontology and Mapping Engineering Methodology that we have been using at Capsenta in order to apply semantic web technologies to address data integration and business intelligence pain points. THANK YOU THANK YOU THANK YOU for all the feedback that I received during the poster session and hallway conversations. This is the reason why you go to a conference! Special shoutout to Enrico Franconi and Frank van Harmelen. Conversations with you were extremely insightful.

Jamie Taylor, the former Minister of Information at Freebase (and the person who has had one of the coolest titles) and who now manages the Schema Team for Google’s Knowledge Graph gave the third keynote, which btw, was exactly what you expect for a keynote. Thanks Jamie for such an awesome talk!

His message was very clear: we need actionable and executable Ontologies/Knowledge Graphs. What does this actually mean? The example he gave was the following: in the Google KG, they have assertions about Polio and about the Vaccine for Polio, but nowhere is it asserted that the Vaccine for Polio prevents Polio. This goes into adding common sense knowledge (think about Cyc). I think it would be fair to say that the lessons learned reported by Jamie were a bit “duh”/“told you so” to this community. My observation is that the giant Google, in the end, is doing what the Semantic Web community has been working on for over a decade. This is good! It was very nice to see the evolution of the Knowledge Graph at Google and insightful to see the practical role that semantics plays. Pascal Hitzler quickly wrote up his takeaway from Jamie’s keynote.
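
To make the Polio example concrete, here is a minimal sketch in Python. The triples, the `prevents` predicate and the `what_prevents` helper are my own illustrative assumptions, not Google's actual schema or API.

```python
# Toy knowledge graph as a set of (subject, predicate, object) triples.
# All names are illustrative assumptions, not Google's real schema.
kg = {
    ("Polio", "type", "Disease"),
    ("PolioVaccine", "type", "Vaccine"),
}

def what_prevents(graph, disease):
    """Return every subject asserted to prevent the given disease."""
    return sorted(s for (s, p, o) in graph if p == "prevents" and o == disease)

# Both entities exist, yet the graph cannot answer the question.
before = what_prevents(kg, "Polio")

# Adding the missing common-sense assertion makes the question answerable.
kg.add(("PolioVaccine", "prevents", "Polio"))
after = what_prevents(kg, "Polio")
```

The point is only that knowing both entities is not the same as knowing how they relate; the relationship has to be asserted (or inferred) before an application can act on it.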

Congrats to Olaf Hartig, Ian Letter and Jorge Perez for winning the Best Paper Award.

This paper presents foundational work towards understanding what Linked Data Fragments (LDF) are and the relationship between different types of LDF. From a more general point of view, this work helps to formalize the relationship between clients and servers in a Client-Server architecture. Hence its applicability is not just within the Semantic Web. This is a beautiful example of how theory and practice can be bridged. Additionally, the presentation was simply brilliant. Jorge Perez has the capability of taking the most complicated concepts and presenting them in a way which is understandable and enjoyable to the audience. I can’t wait to see this presentation on video lectures. When it is published, this is a must see on how to present a paper at a conference. I wish I could present like Jorge!

Daniel Garijo presented his WIDOCO tool. If you work with ontologies, you really need to use this tool, which basically lets you outsource the documentation of your ontology. He also received the Best Paper Award for the Resource track. Well deserved!

You can find all the papers of the conference for download on the ISWC 2017 website. No paywall!

The Job Fair was a great addition. Looking forward to seeing its evolution in the upcoming ISWC.

I really enjoyed being part of the mentoring session. It’s great to hear from students about what worries them and provide some advice. We discussed the paper publishing process, academia vs industry, US vs Europe, dealing with loneliness, and many more topics. Please reach out if you have any questions!

Great to have more Austin presence at ISWC with data.world

who also sponsored the … JAM SESSION! All I can say is:

and without further ado, here is a 1+ hour video of the Jam Session: a large group of Semantic Web Computer Scientist PhDs jamming, after just 3 hours of practice. I think this is the definition of epic! Enjoy!

See you next October in Monterey, California for ISWC 2018!

A Weekend in Antigua, Guatemala

In May 2017, Escape ATX shared a deal for Austin to Guatemala for $300! I immediately jumped on it. Last weekend I visited Guatemala, specifically Antigua. This small town used to be the capital of the Kingdom of Guatemala (which included most of Central America) in the 1700s and is now a UNESCO World Heritage Site. After the peace was signed in the mid 90s, Antigua started to boom with a lot of tourists but continued to maintain its small town appeal.

For me, the best way to summarize Antigua is the following: imagine a typical pueblo in Latin America (in Colombia think Villa de Leyva or Salento) mixed with the cosmopolitan vibe of Austin. Cobblestone roads, colonial style housing and a park in the middle of the town with the cathedral in front, together with high end luxury restaurants, bars with pub food, local craft beer and hole in the wall bars.

I observed three types of foreigners:
1) tourists
2) short term: foreigners coming for volunteering or “figuring out what I want to do with life” who come for months and may end up staying for a year or two
3) resident immigrants: foreigners who have been living in Antigua for many years and are owners of a bar or restaurant

Antigua is a bubble within Guatemala. It is not cheap (same prices as in Austin). But it has a charm, a “no sé qué” that makes me want to come back. I can see myself going back and working from Antigua for a week or two (who would be interested?)

These are some of the places that I visited which I recommend:

Chermol: Argentinean restaurant. Wide variety of local craft beers

The Snug: Small Irish pub with live music

Panza Verde: high end restaurant, romantic ambiance. All the food was delicious.

Cafe No Se: The famous Cafe No Se. It’s been featured in the NY Times’ “What to do in 36 hours in Antigua Guatemala”. It’s a hole in the wall, mostly full of foreigners. In the back they have a mezcal bar where they only serve Ilegal mezcal and beer. The music is blues/soul, which reminds me of Thursday nights at Barbarella in Austin.

Caoba Farms: an organic farm where they have a farmers market every Saturday with local cuisine. During my visit they had an Oktoberfest.

Cantina Royal: cool bar

Saberico: Eat in a beautiful garden. Breakfast was delicious.

Whiskey Den: Whisky, why not? There are a bunch of other bars next to this one.

Angie Angie: awesome pizza! Live music. Outside patio is relaxing. On Sunday Pizza is 2×1.

A Summer of Computer Science, Research, Semantic Web, Databases, Graphs and Travel!

This has been a summer of Computer Science, Research, Semantic Web, Databases, Graphs and a lot of travel! In these past 4 months, I visited 10 countries and traveled over 72,000 miles; equivalent to going around the world 3 times. Whew! This is the summary of my summer travel. 

Montevideo
I attended the 11th Alberto Mendelzon Workshop on Foundations of Data Management. AMW is a scientific event with heavy attendance from database theory researchers. The hallway discussions are very insightful. I was the organizer of the Summer School and presented a short paper titled “Ontology Based Data Access: Where do the Ontologies and Mappings come from?” I had a lot of enlightening conversations with Luna Dong from Amazon (working on creating the Product Knowledge Graph; semantic web is involved), Julia Stoyanovich, who gave a really thought-provoking tutorial on Data Responsibility (we should all pay attention to this), and Leonid Libkin (nulls in databases are still an issue). I was thrilled to finally meet Dan Suciu, James Shanahan and Jan Van den Bussche, among many other database gurus. It’s always a pleasure to hang out with the Chilean database “mafia”: Marcelo Arenas, Pablo Barcelo, Leo Bertossi, Claudio Gutierrez, Aidan Hogan et al. Congrats to the local team for organizing a wonderful event, especially to Mariano Consens!
Buenos Aires
I flew into Montevideo and I was flying out from Buenos Aires. I got to spend a day and a half in this great city. I truly enjoyed it. I will have to come back! Blog post about my 36 hour visit to Buenos Aires will come soon.
San Francisco
I attended Graph Day, where I gave two talks: “Do I need a Graph Database? If so, what kind?” and “Graph Query Language Task Force Update from LDBC”. My takeaways:
– AWS is figuring out what to do with Graphs
– Uber is creating a Knowledge Graph
– Stardog was the only RDF graph database company there. They are growing and very direct with their material: if you are doing data integration, you should be using RDF.
– Multi-model databases are growing: DataStax, ArangoDB, OrientDB, and Microsoft’s latest release of CosmosDB
– New Graph databases: JanusGraph, Dgraph, AgensGraph
– openCypher is really pushing hard to be THE property graph query language standard

Germany
I attended the Dagstuhl Seminar “Federated Semantic Data Management”, organized by Johann-Christoph Freytag, Olaf Hartig and Maria Esther Vidal. On my way to Dagstuhl, I had the opportunity to stop in Koblenz to hang out with Steffen Staab.
We had extensive discussions on the state of the art in Federated Query Processing from the traditional Relational Databases and Semantic Web perspectives. The goal was to understand the limitations of current approaches in considering ontological knowledge during federated query processing. Federated Semantic Data Management (FSDM) can be summarized in one sentence: being able to do 1) reasoning/inferencing over 2) unbounded/unknown sources. A few interesting open challenges to highlight are the following:
1) Unbounded sources: In traditional federated data management, the number of sources is fixed. In FSDM, the number of sources may not be known. Therefore the source selection problem is harder.
2) Correctness: A relaxed version of correctness may need to be considered, a tradeoff between soundness/completeness and precision/recall.
3) Access control: This is still an open challenge even in traditional federated data management.
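
As a rough illustration of the correctness tradeoff in point 2, the sketch below scores a partial answer set with precision and recall instead of demanding full soundness and completeness. The answer sets and the `precision_recall` helper are made-up examples of mine, not something produced at the seminar.

```python
def precision_recall(returned, expected):
    """Precision and recall of a returned answer set against the full expected one."""
    returned, expected = set(returned), set(expected)
    correct = returned & expected
    precision = len(correct) / len(returned) if returned else 1.0
    recall = len(correct) / len(expected) if expected else 1.0
    return precision, recall

# Hypothetical data: the complete answer set versus what a federation
# engine returned after contacting only some of the sources.
expected = {"a", "b", "c", "d"}
returned = {"a", "b", "x"}

p, r = precision_recall(returned, expected)
```

A sound engine would score a precision of 1.0 and a complete one a recall of 1.0; the relaxed notion accepts values below that in exchange for coping with unknown sources.
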
Switzerland
This is my third home. I try to swing by Zurich once a year. I spent a weekend at Bodensee and visited Säntis for the first time. I had the opportunity to visit Philippe Cudré-Mauroux at the University of Fribourg. We are the ISWC 2017 In-Use PC Chairs, so we had a face-to-face PC meeting. I also gave my talk “Integrating Relational Databases with the Semantic Web: past, present and future” for the first time. This talk is a 1 hour version of my upcoming lecture at the Reasoning Web Summer School in London.
Lisbon
What’s the best way to get from Zurich to London? Stopping for an entire day in Lisbon of course! Especially when you pay for the ticket with miles and $10 USD. This was my first time in Lisbon. I arrived early in the morning and spent 6 hours walking around this amazing city. I also had the chance to have lunch with Sofia Pinto overlooking Lisbon and discuss ontology engineering! One of the best day layovers I have ever had. I have to come back. Blog post on the visit soon.
London
I was invited to be a lecturer of the 13th Reasoning Web Summer School (RW 2017). I delivered a half day lecture on Integrating Relational Databases with the Semantic Web. My lecture notes appear as a book chapter in the book Reasoning Web. Semantic Interoperability on the Web. It was great hanging out with good friends Axel Polleres, Andrea Cali. I finally got to meet for the first time Giorgos Stamou. Great conversations with Domenico Lembo on Ontology Based Data Access and Leo Bertossi on Inconsistent Databases and Data Quality. The highlight of this visit, and of my summer, was the conference dinner at the Royal Society where I sat next to Keith Clark and enjoyed a marvelous dinner speech by Bob Kowalski. Blog post on this event soon.
Toronto
Client work took me all the way to Toronto. First time in Canada! So if it’s hot in Texas, might as well try to spend time in a cooler place. This is a great weekend getaway destination (in the summer): fantastic views, food and beer. I also had the chance to meet up with Mariano Consens and get a tour of the University of Toronto.
Chile
The Graph Query Language task force from the Linked Data Benchmark Council (LDBC) organized a week-long face-to-face meeting in Santiago, Chile to work on the proposal for a closed graph query language where paths are first class citizens. A full week of hard work (we also had fun). I took advantage of this trip to visit my UT friends Lindsey Carte, Alvaro Quezada-Hofflinger and Marcelo Somos, professors at the Universidad de La Frontera in Temuco. I gave a talk in Spanish, “Integrating Data using the Semantic Web: The Constitute Use Case”. It is an enjoyable challenge to give talks to non-computer scientists.
Miami
Back in February I found an Austin-Miami roundtrip ticket for $110. So why not! We discovered the Barrel of the Monks brewery in Boca Raton. This is a must if you are in that area and you like Belgian beers!
Greece
I was invited to attend the STI Summit in Crete. My first time in Crete, and in Greece (I have never attended ESWC, which is usually in Crete). A very intense couple of days talking about the future of Semantic Web research. Afterwards I visited Irini Fundulaki at FORTH and Giorgos Stamou at the National Technical University of Athens, where I gave my talk on Integrating Relational Databases with the Semantic Web. I was very impressed with all the work on mappings that has been done in both of these groups. In both cases, the one hour talk turned into hours and hours of fruitful discussions. On my flight to Athens I met a fellow travel geek: 72hrJetsettergirl. The next day, we randomly bumped into each other at the Acropolis. The sweet coincidences of life!
Atlanta
I attended the ACM Richard Tapia Celebration of Diversity in Computing. I have been attending this conference for 10 years: since I was a senior in college, all throughout my graduate studies and now as a PhD. This year, I was the Workshop and Panel Chair. I had the chance to moderate a panel “From Research to Startup” with Rachel Miller from Asana (from theory/crypto research to startup), Kunle Olukotun (Stanford professor and founder of multiple startups) and Andy Konwinski (PhD from UC Berkeley and co-founder of Databricks). I also was on another entrepreneur panel with Ayanna Howard (Professor at Georgia Tech and founder of Zyrobotics) and Jamika Burge. Both panels had a mix of undergrad and grad students and even faculty interested in learning about entrepreneurial experiences. We definitely had an amazing group of panelists. Kemafor Anyanwu Ogan invited me to be on her panel on Data Management for IoT. One of the highlights of the conference is meeting with former and new members of Hispanics in Computing, including Manuel Pérez Quiñones (congrats on the Richard A. Tapia Achievement Award for Scientific Scholarship, Civic Science and Diversifying Computing!) and Dan Garcia. We missed you Jose Morales and Patti Ordonez!
Netherlands
I’m writing this post on my way back from Amsterdam. I had the opportunity to meet up with Peter Boncz and talk about Graph Query Language use cases. I also gave my talk “Integrating Relational Databases with the Semantic Web” at the VU Weekly Artificial Intelligence meeting. Great crowd and a lot of great questions. Nice seeing Frank van Harmelen and Javier Fernandez.

The summer is well over. Fall is already in full force in Europe. But it still feels like summer in Texas.

Is RDF a graph?

A graph consists of a set of vertices (nodes, points) and a set of edges (arcs, lines) between nodes. The common definition is G = (V, E), where V represents the set of vertices and E represents the set of edges, each connecting two vertices.

Commercially, there are two specific types of graph data models: Property Graph and RDF Graph. A property graph is a graph where key-value pairs can be associated with vertices and edges. An RDF graph is represented as a set of triples (subject, predicate, object), where the subject and object are vertices and the predicate is an edge.
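
Here is a small Python sketch of that reading of RDF: take a handful of triples and build G = (V, E) from them. The triples are made up for illustration, and this is a simplified view, since in RDF a predicate can itself appear as the subject or object of another triple.

```python
# A few made-up triples (subject, predicate, object).
triples = [
    ("Juan", "livesIn", "Austin"),
    ("Juan", "worksAt", "Capsenta"),
    ("Capsenta", "locatedIn", "Austin"),
]

# V: every subject and every object is a vertex.
V = {s for s, _, _ in triples} | {o for _, _, o in triples}

# E: every triple is a directed edge from subject to object,
# labeled with its predicate.
E = {(s, o, p) for s, p, o in triples}
```

With three triples we get three vertices and three labeled, directed edges, which is exactly a (labeled, directed) graph in the G = (V, E) sense.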

However, it seems that Jim Webber, Neo4j’s Chief Scientist, does not acknowledge that RDF graphs are graphs:

My response and Jim’s follow up response:

and my response:

It is still unclear to me why Jim Webber believes RDF graphs are not graphs.

Jim, I’m in London this week. I would love to meet up, have a pint and chat about graphs!

Why don’t the Database and Semantic Web research communities interact more?

I was in Chicago to meet with colleagues from the Graph Query Language task force at the Linked Data Benchmark Council (LDBC) so we could have an impromptu face-to-face meeting (great progress towards our graph query language proposal!). They were in Chicago attending one of the main academic database conferences: SIGMOD/PODS. I was able to take a quick look at papers, demos and tutorials.

I left with the following question: Why don’t the Database and Semantic Web research communities interact more? The cross pollination, in my opinion, is minimal. It should be much bigger. A couple of examples:

If you go to two conferences of your field in a year, consider swapping one conference to attend another conference in a different field. For example, for the Semantic Web community, if you attend ISWC and ESWC, consider swapping one of those to attend SIGMOD or VLDB. Same for the database community.  VLDB 2017 will be in Munich from August 28th to September 1, 2017.

I made a list of papers from SIGMOD/PODS (research papers, demos and tutorials) that I believe are relevant to the Semantic Web community. The SIGMOD and PODS papers are available online

PODS Papers

SIGMOD Papers

SIGMOD Demos

SIGMOD Tutorials

P.S. For the travel and points geeks: last minute travel to Chicago was really expensive, over $500 USD. I was able to use 25,000 miles and pay just $10 USD. And I even got upgraded to first class!

A Refreshing, No-Fluff, No-Buzzword Perspective on Artificial Intelligence

I encountered this refreshing and excellent summary of Artificial Intelligence by John Launchbury, the Director of DARPA’s Information Innovation Office (I2O). Thanks Frank van Harmelen for posting this!

No fluff. No buzzwords. It is a crisp and succinct explanation of the state of AI today and where it is going. Deep learning wasn’t even mentioned!

The quick summary is that AI up to now can be summarized in two waves:

First Wave: Handcrafted Knowledge, which is very good at reasoning but not very good at perceiving the outside world. It is not good at learning or abstracting.

Handcrafted Knowledge: Enables reasoning over narrowly defined problems. No learning capability and poor handling of uncertainty

Second Wave: Statistical Learning, which is good at perceiving and learning but not so good at reasoning and abstracting.

Statistical Learning: Nuanced classification and prediction capabilities. No contextual capability and minimal reasoning ability.

The next wave, called Contextual Adaptation, is where systems can construct explanatory models that explain real world phenomena.

My takeaway from this is that GOFAI (Good Old-Fashioned AI) is still active and relevant, and by combining it with Machine Learning, we will enter the next wave of AI, which can provide answers to the why (context).

The conclusion of this video is aligned with the takeaway message from Jim Hendler’s presentation at the 4th Heidelberg Laureate Forum (HLF): we need Human and AI together.

Hope you enjoy watching these videos as much as I did.

Smart Data and Graphorum Conference Trip Report

I attended the Smart Data-Graphorum Conference (January 30 – February 1) in the Bay Area (actually Redwood City). This conference series originally was called Semantic Technology (SemTech) Conference and I have been presenting at it since 2010.

This year, the conference had a cozy feeling with ~250 attendees. I gave two talks:

  • Graph Query Languages: Similar to my Graph Day Texas talk, I gave an update from the Graph Query Language task force at the LDBC. The latest discussions were incorporated in this talk. We have been discussing the idea of having paths as a datatype, stored in their own table (a table each for Nodes, Edges and Paths). Additionally, there are two notions of projection: relational vs graph. The slides provide some examples. This is still ongoing work.

  • Virtualizing Relational Databases as Graphs: a multi-model approach: In this talk I discussed how relational databases can be virtualized as RDF Graphs by using the W3C RDB2RDF standards: Direct Mapping and R2RML. I argue that graphs are cool, and ask whether relational databases are cool too. If you are deciding to move from a relational database to a graph database, you should understand the tipping point. I believe virtualization is a viable option to keep your data in a relational database while still taking advantage of graph features. However, that may not always be the case.
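
To give a flavor of what “a relational row seen as RDF” means, here is a simplified Python sketch in the spirit of the W3C Direct Mapping. It is not the full standard (no foreign keys, datatypes or IRI percent-encoding), and the base IRI, the `direct_map_row` helper and the `People` table are hypothetical names I picked for illustration.

```python
BASE = "http://example.com/"  # hypothetical base IRI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def direct_map_row(table, pk, row):
    """Turn one relational row (a dict) into a list of triples."""
    # The row becomes a subject IRI built from the table and primary key.
    subject = f"{BASE}{table}/{pk}={row[pk]}"
    # The table becomes the class of the row.
    triples = [(subject, RDF_TYPE, f"{BASE}{table}")]
    # Each non-NULL column value becomes one triple.
    for column, value in row.items():
        if value is not None:
            triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples

row = {"ID": 7, "fname": "Bob", "addr": None}
triples = direct_map_row("People", "ID", row)
```

A virtualization layer would not materialize these triples; it would use the same correspondence to rewrite graph queries into SQL over the original tables.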

Additional highlights of the conference

  • I was glad to see a lot of friendly faces. I feel very lucky that I can always have a chat with Deborah McGuinness and Michael Uschold, two legends in ontologies. It’s always great to see Souri Das from Oracle (and all the Oracle folks from the semantic technology group) and discuss how the W3C RDB2RDF standards are doing. We both agree that we did a good job with that standard and gave ourselves a pat on the back 🙂 Also great to see Peter Haase, Dean Allemang, Atanas Kiryakov, Bart van Leeuwen, Jans Aasman, Dave McComb and many more.
  • Michael Uschold and I discussed the pragmatics of part-of and has-label semantics. For some situations you want to be generic. For example, it’s easier for a user to just use “has label” for anything, instead of having to know the exact type of “has label” for a specific thing. Now I understand many of the modeling decisions made in gist. I argue that from a database point of view, query performance is better if you have more specific properties, unless you have some sort of semantic query optimization.
  • Cambridge Semantics gave a presentation on their in-memory analytics graph database. They presented results using the LUBM benchmark where they claim to have blown Oracle away. Important to note that they used 4x the hardware. Atanas Kiryakov, Ontotext’s CEO was in the audience and rightfully asked why they didn’t use a more up to date benchmark given that LUBM is from 2007. It seems that everybody has been using LUBM (since 2007) so in order to compare to others, they continue to use LUBM. Hopefully they will start using the LDBC benchmarks!
  • I have been aware that MarkLogic markets itself as a document and graph database. I now understand how they represent things under the hood. Each entity, with its corresponding attributes and values, is represented in a document (key-values). The relationships between the entities are represented as RDF triples. This makes a lot of sense to me and I can imagine how this can improve query performance to a certain degree.
  • Brian Sletten gave a great talk on JSON-LD. I wish all web developers could see this presentation in order to understand the value of Linked Data. Even though Brian was not able to give his talk on the new W3C upcoming standard SHACL, the Shapes Constraint Language, his slides left a lasting impression. This is the best definition I have ever seen for the Open World Assumption!

  • It was great to see Emil Eifrem, Neo Technology’s CEO, again. We discussed the history of RDF and the Semantic Web (I didn’t know he was a very early user of Jena!). We seem to be in agreement that RDF is great technology for data integration. For anything else graph related, he argues that you should use Neo4j. Not surprising 😛 I was also glad to see that Neo4j is starting to work on formalizing the semantics of Cypher, including making it a closed query language.

This was a great couple of days and hopefully next year we will have more people!

A Data Weekend in Austin

On the weekend of January 14-15, I attended Data Day Texas, Graph Day Texas and Data Day Health in Austin and gave three talks.

Do I need a Graph Database: This talk came out of a Q/A during a happy hour after a talk I gave at a meetup in Seattle. We were discussing when to use a graph database and what type of graph to use: RDF or Property Graph.

Graph Query Languages: This talk gave an update on the work we have been doing in the Graph Query Language (GQL) task force at the Linked Data Benchmark Council (LDBC). The purpose of the GQL task force is to study query languages specifically for the Property Graph data model, because there is a need for a standard syntax and semantics of a query language. One of the main points I was arguing in this talk is the need for a closed language: graphs in, graphs out. One can argue that a reason for the success of relational databases is that the query language is closed (tables in, tables out). With this principle, queries can be composed (i.e. views!). This talk was well received and generated a lot of interesting discussion, especially since Emil Eifrem, Neo Technology’s CEO, was in the room. An interesting part of the discussion was whether we are too early for standardization. Emil stated that we need standardization now because their clients are asking for it. I stated that graph databases today are where relational databases were in the mid 1980s, so the time is about right to start the discussion. Andrew Donoho said I was too optimistic. He thinks we are in the late 70s and we are too early. I will be giving this talk next week at the Smart Data-Graphorum conference, with some updated material. Special thanks to Marcelo Arenas, Renzo Angles and especially Hannes Voigt for helping me organize these slides.
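
The “graphs in, graphs out” idea can be sketched in a few lines of Python. This is only a toy illustration of closure and composition; the graph representation and the two operators (`filter_edges`, `neighborhood`) are my own assumptions and have nothing to do with the actual task force proposal.

```python
# A graph is a pair (nodes, edges); each edge is (source, label, target).

def filter_edges(graph, label):
    """Graph in, graph out: keep only edges with the given label."""
    _, edges = graph
    kept = {e for e in edges if e[1] == label}
    kept_nodes = {n for (s, _, t) in kept for n in (s, t)}
    return kept_nodes, kept

def neighborhood(graph, start):
    """Graph in, graph out: keep only edges that leave `start`."""
    _, edges = graph
    kept = {e for e in edges if e[0] == start}
    kept_nodes = {start} | {t for (_, _, t) in kept}
    return kept_nodes, kept

g = (
    {"ann", "bob", "cat", "acme"},
    {("ann", "knows", "bob"), ("ann", "worksAt", "acme"), ("bob", "knows", "cat")},
)

# Because every operator returns a graph, the output of one can feed the
# next, much like a SQL view can be queried as if it were a table.
knows_view = filter_edges(g, "knows")
result = neighborhood(knows_view, "ann")
```

If `filter_edges` returned a table of rows instead of a graph, `neighborhood` could not be applied to its output, and that is exactly the composability a non-closed language gives up.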

Semantic Search Applied to Healthcare: In this talk, I introduced how we are identifying patients who are in need of Left Ventricular Assist Devices (LVADs) using Ultrawrap, the semantic data virtualization technology developed at Capsenta. This talk presented a use case with the Ohio State University Wexner Medical Center. Patients are being missed through traditional chart pull methods. Our approach has resulted in ~20% increase in detection over previously known population at OSU, which is a mature institution. This talk will also be given at the Smart Data conference.

Main highlights of the conference:

  • Emil Eifrem, CEO of Neo Technology, gave the keynote. It was nice to learn the use cases where Neo4j is being used: Real-time recommendation, Fraud detection, Network and IT operations, Master Data Management, Graph-Based Search and Identity & Access Management. It was not clear why graphs specifically were used, because these are use cases that have been around for a long time and have been addressed using traditional technologies. Emil ended by talking about a “connected enterprise”, meaning integrating data across silos using graphs. If you take a look at my Do I need a graph database talk, you will see that I argue for using RDF for data integration, not Property Graphs.
  • Luca Garulli, the founder and CEO of OrientDB gave a talk focusing on the need of a multi model database like OrientDB. In his talk, he argued for many features which Neo4J apparently didn’t support. Not long after, there was a good back-and-forth twitter discussion between Emil and Luca. Emil was correcting Luca. Seems like this talk may need to be updated. An interesting take away for me: how do you benchmark a multi model database?
  • Many talks about “I’m in relational, how do I get to property graphs”. All of them at an introductory level. Given that we have studied the problem of relational to RDF very well, this should be a problem that can be addressed quickly and efficiently.
  • Standards was a big topic, one of the reasons my Graph Query Language talk was well received. Neo4j is pushing for OpenCypher to become the standard, while in fact, one could argue that Gremlin is already the de facto standard. Before this weekend, I wasn’t aware of anybody implementing OpenCypher. Apparently there are now 10 OpenCypher implementations, including Bitnine, Oracle and SAP HANA.
  • Bitnine: they are implementing a PropertyGraph DB on top of Postgres and using OpenCypher as the query language. They are NOT translating OpenCypher to SQL. Instead, they are doing the translation to relational algebra internally. I enjoyed the brief discussion with Kisung Kim, Bitnine’s CTO. Apparently they have already benchmarked with LDBC and did very well. Looking forward to seeing public results. Bitnine is open source.
  • Take a look at sql2gremlin.com
  • grakn.ai looks interesting. Need to take a closer look.
  • Cray extended the LUBM benchmark and added a social network for the students.
  • Property Graphs are what comes to mind when people think about graph databases. However, an interesting observation is that the senior folks in the room prefer RDF to Property Graphs. We all agreed that RDF is more mature than Property Graph databases.
  • “Those who do not learn history are doomed to repeat it.” It is crucial to understand what has been done in the past in order to not re-invent the wheel. I feel lucky that early on in grad school, my advisor pushed me to read pre-pdf papers. It was great to meet this weekend with folks like Darrel Woelk and Misty Nodine, who used to be part of MCC. A lot of the technologies we are seeing today have roots back in MCC. For example, we discussed how similar graph databases are to object oriented databases. On twitter, Emil seemed to disagree with me. Nevertheless we had an interesting twitter discussion.
  • Check out JanusGraph, a graph database which, if I understood correctly, is a fork of Titan. Titan hasn’t been updated in over a year because the folks behind it are now at DataStax.

Thanks to Lynn Bender and co. for organizing such an awesome event! Can’t wait for it to happen in Austin next year. Recordings of the talks will start to show up on the Global Data Geek youtube channel.

Starting a blog!

One of my 2017 resolutions is to start writing again. I’m hoping a blog will help me achieve this goal.

I want to share thoughts about my geeky interests: computer science, research, semantic web, databases, etc; and my non-geeky interests such as travel, miles and points, cheap flights, beer and wine 🙂 .

This should be an interesting smorgasbord of content!