Lecture 6: CQL 1#

Gittu George, January 25 2024

Attribution: This notebook is developed using some materials provided by neo4j.

Todays Agenda#

  • Creating nodes & relationships.

  • Creating constraints.

  • Querying nodes and relationships.

  • Working with patterns in queries.

  • Class activity:

Learning objectives#

  • Loading data and creating nodes and relationships

  • Using CQL to query the graph database

Check these cheat sheets to see topics that we are going to cover as part of our CQL learning.

Creating nodes & relationships#

We learned how to define

  • Nodes

  • Labels

  • Relationships

  • Relationship types

  • Properties

Here is the simplified syntax for creating a node:

CREATE (node)

e.g.:

// We are creating nodes
CREATE (git:PERSON {name: 'Gittu', age: 21, profession:'Postdoctoral Fellow'})
CREATE (mik:PERSON {name: 'Mike', age: 21,profession:'Associate Professor'})
CREATE (tom:PERSON {name: 'Tom', age: 24,profession:'Postdoctoral Fellow'})
CREATE (pan:Animal {name: 'Panda',method_of_movement: 'Climb',height:20,mainly_found: 'China'})
CREATE (pol:Animal {name: 'Polar Bear',method_of_movement: 'Walk',height:20,mainly_found: 'Canada'})
CREATE (kan:Animal {name: 'Kangaroo',method_of_movement: 'Hop',height:30,mainly_found: 'Australia'})

// We are creating a relationship that connects these nodes

CREATE (git)-[:LIKE {percent: 100}]->(pan),
(mik)-[:LIKE {percent: 90}]->(pol),
(tom)-[:HATE {percent: 100}]->(pol),
(tom)-[:LIKE {percent: 50}]->(kan),
(pol)-[:LIKE {percent: 100}]->(git)

With this, we finished creating our dummy graph data in the cloud. To see what we have created, we need to query the database.

Let’s start with how to examine the data model,

CALL db.schema.visualization()

For more information:

Call apoc.meta.data()

Creating constraints#

Use CONSTRAINT ON & ASSERT to create constraints.

  • Uniqueness constraint - Ensures that a value for a property is unique for all nodes of that type.

CREATE CONSTRAINT ON (n:Person) ASSERT (n.name) IS UNIQUE
CREATE CONSTRAINT ON (n:Movie) ASSERT (n.title) IS UNIQUE
  • Existence constraint - Ensures that when a node or relationship is created or modified, it must have certain properties set.

CREATE CONSTRAINT ExistsMovieTagline ON (m:Movie) ASSERT exists(m.tagline)
  • Node key - Ensures that a set of values for properties of a node of a given type is unique.

CREATE CONSTRAINT UniqueNameBornConstraint
ON (p:Person) ASSERT (p.name, p.born) IS NODE KEY

For retrieving constraints defined for the graph

CALL db.constraints()

For dropping constraints

DROP CONSTRAINT constraintname

Inspecting schema#

Let’s inspect the schema for the movie data.

CALL db.schema.visualization()

Includes nodes (entities,objects,things) such as:

  • Movie

  • Person

People are related to Movies using these relationships:

  • DIRECTED

  • ACTED_IN

  • WROTE

  • REVIEWED

Querying nodes and relationships#

MATCH & RETURN#

Use the MATCH to match the nodes you ask for, and RETURN to retrieve the nodes.

Here we retrieve all the nodes.

MATCH (n) // returns all nodes in the graph
RETURN n

As you can remember, (n) here stands for all nodes. Now, what if we want to retrieve only the nodes with Person labels?

MATCH (p:Person) // returns all Person nodes in the graph
RETURN p

You can add whatever nodes you are interested in based on properties. Here we are returning nodes filtered on a property value.

MATCH (m:Movie {released: 2003, tagline: 'Free your mind'})
RETURN m

Until now, we have been returning nodes. If we want to return the properties in those nodes

MATCH (p:Person {born: 1965})
RETURN p.name, p.born

You can specify aliases or column headings using AS

MATCH (p:Person {born: 1965})
RETURN p.name, p.born AS `birth year`

Now let’s add in some relationships as well to return relationships.

MATCH (p:Person)-[rel:ACTED_IN]->(m:Movie {title: 'The Matrix'})
RETURN p, rel, m

We can also have nodes with no variables or anonymous nodes. In the above example, we are interested in only the persons who acted in ‘The Matrix’.

MATCH (p:Person)-[:ACTED_IN]->(:Movie {title: 'The Matrix'})
RETURN p.name

You can also query on multiple relationships

MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN|DIRECTED]->(m:Movie)
RETURN p.name, m.title

We can also have anonymous relationships.

MATCH (p:Person)--(m:Movie {title: 'The Matrix'})
RETURN p, m
MATCH (m:Movie)<--(p:Person {name: 'Tom Hanks'})
RETURN p, m

type() returns the type of a relationship.

MATCH (p:Person)-[rel]->(:Movie {title:'The Matrix'})
RETURN p.name, type(rel)

Let’s try adding some properties for relationships;

MATCH (p:Person)-[:REVIEWED {rating: 65}]->(:Movie {title: 'The Da Vinci Code'})
RETURN p.name

Traversing multiple relationships.

Since we have a graph, we can traverse through nodes to obtain relationships further into the traversal. In addition, you can assign a variable to the path and return the path as follows:

MATCH path = (p:Person)-[:FOLLOWS]->(:Person)-[:FOLLOWS]->(:Person {name:'Jessica Thompson'})
RETURN path

WHERE to Filter Queries#

Just like in SQL, we use WHERE to filter queries.

We already know how to write this query.

MATCH (p:Person)-[:ACTED_IN]->(m:Movie {released: 2008})
RETURN p, m

Can we write this using WHERE?

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released = 2008
RETURN p, m

Important

You can only refer to named nodes or relationships in a WHERE clause, so you must specify a variable for any node or relationship you are testing in the WHERE clause.

If we can write the same thing without using WHERE, then why do we need WHERE?

With the WHERE clause, you can specify potentially complex conditions for the query.

Here are some examples;

Using AND & OR in WHERE clause.

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released = 2008 OR m.released = 2009
RETURN p, m
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released >= 2003 AND m.released <= 2004
RETURN p.name, m.title, m.released

Important

IS NOT NULL, we can use in testing the existence of a property or a pattern. For the opposite of IS NOT NULL we use IS NULL.

Let’s test existance of a property using IS NOT NULL.

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name='Jack Nicholson' AND m.tagline IS NOT NULL
RETURN m.title, m.tagline

Using STARTS WITH in WHERE clause.

MATCH (p:Person)-[:ACTED_IN]->()
WHERE p.name STARTS WITH 'Michael'
RETURN p.name

Let’s use toLower() to convert a property to lower case.

MATCH (p:Person)-[:ACTED_IN]->()
WHERE toLower(p.name) STARTS WITH 'michael'
RETURN p.name

Note

Converting a property to lower case may not be performing well, as if an index has been created for this property, it will not be used at runtime.

See also

There are other string matching operators like ENDS WITH, CONTAINS

Here is an example where we want to return all Person nodes of people who wrote movies:

MATCH (p:Person)-[:WROTE]->(m:Movie)
RETURN p.name, m.title

Next, we modify this query to exclude people who directed that particular movie: (Let’s test non existance of a pattern using NOT exists())

MATCH (p:Person)-[:WROTE]->(m:Movie)
WHERE NOT exists( (p)-[:DIRECTED]->(m) )
RETURN p.name, m.title

Here is another example where we want to find Gene Hackman and the movies he acted in with another person who also directed the movie: ((Let’s test existance of a pattern using exists()))

MATCH (gene:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(other:Person)
WHERE gene.name= 'Gene Hackman'
AND exists( (other)-[:DIRECTED]->(m) )
RETURN gene, other, m

In this example, we only want to retrieve Person nodes of people born in 1965 or 1970: (Using IN)

MATCH (p:Person)
WHERE p.born IN [1965, 1970]
RETURN p.name as name, p.born as yearBorn

In the following example you are using IN to see if there is ‘Neo’ in all the roles in ‘ACTED_IN’ relationship. (WHERE ‘Neo’ IN r.roles)

MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE 'Neo' IN r.roles AND m.title='The Matrix'
RETURN p.name

iclicker 1#

Working with Patterns in Queries#

  • Traversal in a MATCH clause

We already understand this query from the last class, but understanding how the graph engine performed this traversal can help write an efficient CQL query.

MATCH (follower:Person)-[:FOLLOWS]->(reviewer:Person)-[:REVIEWED]->(m:Movie)
WHERE m.title = 'The Replacements'
RETURN follower.name, reviewer.name

Specifying multiple MATCH patterns#

MATCH (a:Person)-[:ACTED_IN]->(m:Movie),
(m)<-[:DIRECTED]-(d:Person)
WHERE m.released = 2000
RETURN a.name, m.title, d.name

Note

When multiple patterns are specified in a MATCH clause, no relationship is traversed more than once.

We can also specify the above query using a single pattern.

MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
WHERE m.released = 2000
RETURN a.name, m.title, d.name
  • Why multiple match patterns?

Even though both queries mentioned above mean the same, there are situations when you can’t represent it in a single match pattern, especially when looking at complex queries. For e.g.:

Suppose we want to retrieve the movies that Meg Ryan acted in and their respective directors and the other actors that acted in these movies. Here is the query to do this:

MATCH (meg:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person),
(other:Person)-[:ACTED_IN]->(m)
WHERE meg.name = 'Meg Ryan'
RETURN m.title as movie, d.name AS director , other.name AS `co-actors`

Note

You can specify aliases or column headings using AS; you will be seeing it a lot in all upcoming queries.

Tip

It’s a good practice to use AS wherever you can, as it can guide your thoughts and print results in a meaningful way.

Multiple MATCH clauses,#

MATCH (valKilmer:Person)-[:ACTED_IN]->(m:Movie)
MATCH (actor:Person)-[:ACTED_IN]->(m)
WHERE valKilmer.name = 'Val Kilmer'
RETURN m.title as movie , actor.name

But we can also write the same using multiple patterns,

MATCH (valKilmer:Person)-[:ACTED_IN]->(m:Movie),
(actor:Person)-[:ACTED_IN]->(m)
WHERE valKilmer.name = 'Val Kilmer'
RETURN m.title as movie , actor.name as `Actor name`

A best practice is to traverse as few nodes as possible so in this example, using multiple MATCH patterns is best.

Important

Usually, the multiple MATCH clauses are used when dealing with Subqueries using WITH. We will discuss it soon.

Specifying varying length paths#

Syntax:

//Retrieve all paths of any length with the relationship, :RELTYPE from nodeA to nodeB and beyond
(nodeA)-[:RELTYPE*]->(nodeB) 
//Retrieve the paths of length 2 with the relationship, :RELTYPE from nodeA to nodeB:
(nodeA)-[:RELTYPE*2]->(nodeB)
// Retrieve the paths of length 3 with the relationship, :RELTYPE from nodeA to nodeB:
(nodeA)-[:RELTYPE*3]->(nodeB)

Experiment with the below query by varying depths, and see how it behaves.

MATCH (follower:Person)-[:FOLLOWS*3]-(p:Person)
WHERE follower.name = 'Paul Blythe'
RETURN p.name

Now what if we want to retrieve the paths of lengths 1, 2, or 3 with the relationship, :RELTYPE from nodeA to nodeB, nodeB to nodeC, as well as, nodeC to _nodeD) (up to three hops):

Syntax:

(nodeA)-[:RELTYPE*1..3]->(nodeB)
MATCH (follower:Person)-[:FOLLOWS*1..3]-(p:Person)
WHERE follower.name = 'Paul Blythe'
RETURN p.name

Finding the shortest path#

This shortestPath() function is very useful in many cases where there are many ways to reach from node A to node B.

MATCH p = shortestPath((m1:Movie)-[*]-(m2:Movie))
WHERE m1.title = 'A Few Good Men' AND
m2.title = 'The Matrix'
RETURN p

Note

Here we specify * for the relationship. This means to use any relationship for the traversal.

Can you?#

Write Cypher statements to:

  • Retrieve nodes from the graph.

  • Filter nodes retrieved using property values of nodes.

  • Retrieve property values from nodes in the graph.

  • Filter nodes retrieved using relationships.

  • Specify multiple MATCH patterns.

  • Specify multiple MATCH clauses.

Write Cypher WHERE clauses to test:

  • Equality.

  • Multiple values.

  • Ranges.

  • Labels.

  • Existence of a property.

  • String values.

  • Regular expressions.

  • Patterns in the graph.

  • Inclusion in a list.

Let’s check out the reference card again to see the covered topics.

Class activity#

Note

Check worksheet 6 in canvas for detailed instructions on activity and submission instructions.

  • Practice CQL.

Deleting Nodes and Relationships (OPTIONAL)#

MATCH (p:Person)
WHERE p.name = 'Jane Doe'
DELETE p

Tip

The most efficient way to delete a node and its corresponding relationships is to specify DETACH DELETE. When you specify DETACH DELETE for a node, the relationships to and from the node are deleted, then the node is deleted.

Can you try deleting this node? Can you delete it? Why can’t you?

MATCH (p:Person)
WHERE p.name = 'Liam Neeson'
DELETE p
MATCH (p:Person)
WHERE p.name = 'Liam Neeson'
DETACH DELETE p

More on CREATE (OPTIONAL- Additional information)#

We already saw about creating nodes and relationships. This section shows you that you can create relationships with existing nodes.

To demonstrate, let’s create some nodes.

CREATE (m:Movie:Action {title: ' Batman Begins'})
RETURN m.title

CREATE
(:Person {name: 'Michael Caine', born: 1933}),
(:Person {name: 'Liam Neeson', born: 1952}),
(:Person {name: 'Katie Holmes', born: 1978}),
(:Person {name: 'Benjamin Melniker', born: 1913})

If you want to add more labels to a node, you can query for the node you want to add, and after that, use SET to set the label.

Syntax for SET

SET x:Label // adding one label to node referenced by the variable x
SET x:Label1:Label2 // adding two labels to node referenced by the variable x
MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m:Fantasy
RETURN labels(m)

Syntax: Removing labels from a node

// remove the label from the node referenced by the variable x
REMOVE x:Label
// remove the two labels from the node referenced by the variable x
REMOVE x:Label1, x:Label2
MATCH (m:Action)
REMOVE m:Action, m:Fantasy
RETURN labels(m)

Syntax: Adding properties to a node

SET x.propertyName = value
SET x.propertyName1 = value1 , x.propertyName2 = value2
SET x = {propertyName1: value1, propertyName2: value2}
SET x += {propertyName1: value1, propertyName2: value2}
MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m.released = 2005, m.lengthInMinutes = 140
RETURN m
MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m = {title: 'Batman Begins',
released: 2005,
lengthInMinutes: 140,
videoFormat: 'DVD',
grossMillions: 206.5}
RETURN m

Retrieving properties of a node

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
RETURN properties(m)

Updating properties - JSON-style

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m += { grossMillions: 300,
awards: 66}
RETURN m

Syntax: Removing properties from a node

REMOVE x.propertyName
SET x.propertyName = null
MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m.grossMillions = null
REMOVE m.videoFormat
RETURN m

Creating Relationships

MATCH (a:Person), (m:Movie)
WHERE a.name = 'Katie Holmes' AND m.title = 'Batman Begins'
CREATE (a)-[rel:ACTED_IN {roles: ['Rachel','Rachel Dawes']}->(m)
RETURN a.name, rel, m.title

The SET and REMOVE syntax is the same as what we discussed with nodes.

MATCH (a:Person), (m:Movie)
WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
CREATE (a)-[rel:ACTED_IN]->(m)
SET rel.roles = ['Bruce Wayne','Batman']
RETURN a, rel, m
MATCH (a:Person)-[rel:ACTED_IN]->(m:Movie)
WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
REMOVE rel.roles
RETURN a, rel, m