Graph and Cypher for a typical PLM Question
Posted by Yoann Maingon
At Ganister, one of our customers reached out to us last week with a question about its data. This is one of our launch customers, and they know the power of the graph. They don’t yet have anyone who knows the Cypher language to query their own graph database, but they know it is very efficient when it comes to turning a question into valuable data.
The company builds systems that contain a lot of electronics and have many potential integrations in vehicles, buildings, etc. Their products are managed by Families and Systems.
They have about 7300 parts, 124 systems and 35 families. The average depth of a bill of materials is 7 levels, and each system contains 300 to 600 part occurrences.
“Hey Ganister (I believe it could become something real soon with all these personal assistant technologies!), can you tell me which parts used by family XXX are not used by any other family?”
Let’s represent a subset of the data
The basic process is:
We end up with the 5 parts on the left side of the graph.
The first idea was to find all the parts connected to the family and add a WHERE clause to filter out the ones that have relationships with families other than XXX.
MATCH (n1:family{_ref:'XXX'})-[:programSystem|systemPart|consumes*]->(p1:part),(n2:family)
WHERE NOT (n2)-[:programSystem|systemPart|consumes*]->(p1:part) AND n2._ref='XXX'
RETURN p1
It looked good and quite simple but it was not very efficient. Efficiency is usually related to the number of database hits required to find the correct answers.
We then figured it would be easier to list the parts from family XXX, list the parts from all other families, and diff the two sets of data.
MATCH (n1:family{_ref:'XXX'})-[:programSystem|systemPart|consumes*]->(p1:part)
WITH collect(DISTINCT p1._ref) as P1
MATCH (n2:family)-[:programSystem|systemPart|consumes*]->(p2:part)
WHERE n2._ref <>'XXX'
WITH P1,collect(DISTINCT p2._ref) as P2
UNWIND apoc.coll.subtract(P1, P2) as res
RETURN res
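To make the difference between the two strategies concrete, here is a minimal Python sketch on a made-up toy graph. Everything in it is an assumption for illustration: the node names (FAM_XXX, SYS_1, P1, ...), the dictionary layout, and the edge counter, which is only a rough stand-in for database hits and not how Neo4j actually counts them.

```python
from collections import deque

# Hypothetical toy data: each key points to the nodes it uses directly.
# All names here are invented for illustration only.
GRAPH = {
    "FAM_XXX": ["SYS_1", "SYS_2"],
    "FAM_YYY": ["SYS_3"],
    "SYS_1": ["P1", "P2"],
    "SYS_2": ["P3"],
    "SYS_3": ["P3", "P4"],
    "P1": ["P5"],  # P1 consumes P5
    "P4": ["P5"],  # P4 also consumes P5
}
FAMILIES = ["FAM_XXX", "FAM_YYY"]
PARTS = {"P1", "P2", "P3", "P4", "P5"}


def reachable_parts(start, counter):
    """Breadth-first traversal returning every part found below `start`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in GRAPH.get(node, []):
            counter[0] += 1  # one edge followed (rough proxy for a db hit)
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen & PARTS


def exclusive_by_filter(family):
    """First idea: for each part of `family`, re-traverse every other family."""
    counter = [0]
    others = [f for f in FAMILIES if f != family]
    result = set()
    for part in reachable_parts(family, counter):
        if not any(part in reachable_parts(f, counter) for f in others):
            result.add(part)
    return result, counter[0]


def exclusive_by_diff(family):
    """Second idea: collect both sets once, then subtract one from the other."""
    counter = [0]
    mine = reachable_parts(family, counter)
    theirs = set()
    for f in FAMILIES:
        if f != family:
            theirs |= reachable_parts(f, counter)
    return mine - theirs, counter[0]


if __name__ == "__main__":
    print(exclusive_by_filter("FAM_XXX"))
    print(exclusive_by_diff("FAM_XXX"))
```

Both functions return the same set of parts, but the filter version repeats the traversal of the other families for every candidate part, while the diff version walks each family only once. That is the same intuition behind collecting two lists and subtracting them in the Cypher query above.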
As mentioned, our first approach wasn’t very efficient. The first sign that a query is inefficient is plain human intuition: “Hmm, I don’t think it should take this long!” Beyond that, Neo4j provides a nice way to understand the efficiency of a query: prefix it with the keyword PROFILE.
Here is the result of the first Cypher query, which looked simpler but resulted in almost 48 million db hits.
The second query, which diffs two sets of results, generates only 700k db hits.
This is not a query we have to run many times, therefore we did not spend time improving the performance.
The first great result was the customer’s satisfaction at getting such a precise result in very little time. The query takes about 300 ms on the first run and about 200 ms on subsequent runs thanks to Neo4j’s caching mechanisms. But the main success for us is proving that these kinds of questions about a customer’s data can be answered much better by graph database technologies than by other types of databases.