Lecture 6: CQL 1

Lecture 6: CQL 1#

Gittu George, January 23 2025

Attribution: This notebook is developed using some materials provided by neo4j.

Announcements#

Please start working on assignments early. As I always say, everything you need for the assignment is available on our course website. Daniel (TA) has posted a helper/sample exercise on Piazza to assist you with the last question in Assignment 2.
Mark your attendance in iClicker only if you are present in class. If you encounter any technical issues, please reach out to the in-class TA during worksheet time.
WE ONLY ACCEPT REGRADE REQUESTS as outlined here: https://ggeorg02.github.io/bait580_2024/additional/regrade.html
- All other requests WILL BE IGNORED, including emails to other TAs, in-person requests during class to me or other TAs who graded the question, or comments on SpeedGrader.
- Make sure you submit the request within the given time frame (outlined in the link above).
- Process: Daniel will review your request, and if he deems it valid, he will regrade your entire assignment/worksheet. He will then pass his grading to me for approval or suggestions, and finally, he will respond to your email.
If you want to reach out to me for something please use my email.https://ggeorg02.github.io/bait580_2024/lectures/lecture0.html#instructor. For course related queries, please use Piazza.
- I don’t check messages in Canvas, as I also receive messages from other courses that are not very important.
- I try my best for a 24 hour turn around time, during business days, and this turnaround time is also upheld for the questions you ask on Piazza ( TAs are aware of this, for maximum efficency).
DON’T USE CHATGPT FOR YOUR ASSIGNMENTS (say for writing/reasoning). If you are caught using it, you will be reported to the academic integrity office, and can lead to further investigations, and penalties.
Don’t be shy to raise your hands during worksheet time. We will be happy to help you with any of your queries.

Expectations for CQL and MQL#

We won’t test you on the DDL part.
We will only test you on the DML part.
For the exam, if we ask you to write a CQL/MQL query, we won’t require you to use a specific approach or logic. You can write it in any way you’re comfortable with, as long as the output meets the expected result.

Todays Agenda#

Creating nodes & relationships.
Creating constraints.
Querying nodes and relationships.
Working with patterns in queries.
Class activity:

Learning objectives#

Loading data and creating nodes and relationships
Using CQL to query the graph database

Check these cheat sheets to see topics that we are going to cover as part of our CQL learning.

Creating nodes & relationships#

We learned how to define

Nodes
Labels
Relationships
Relationship types
Properties

Here is the simplified syntax for creating a node:

CREATE (node)

e.g.:

// We are creating nodes
CREATE (git:PERSON {name: 'Gittu', age: 21, profession:'Postdoctoral Fellow'})
CREATE (mik:PERSON {name: 'Mike', age: 21,profession:'Associate Professor'})
CREATE (tom:PERSON {name: 'Tom', age: 24,profession:'Postdoctoral Fellow'})
CREATE (pan:Animal {name: 'Panda',method_of_movement: 'Climb',height:20,mainly_found: 'China'})
CREATE (pol:Animal {name: 'Polar Bear',method_of_movement: 'Walk',height:20,mainly_found: 'Canada'})
CREATE (kan:Animal {name: 'Kangaroo',method_of_movement: 'Hop',height:30,mainly_found: 'Australia'})

// We are creating a relationship that connects these nodes

CREATE (git)-[:LIKE {percent: 100}]->(pan),
(mik)-[:LIKE {percent: 90}]->(pol),
(tom)-[:HATE {percent: 100}]->(pol),
(tom)-[:LIKE {percent: 50}]->(kan),
(pol)-[:LIKE {percent: 100}]->(git)

With this, we finished creating our dummy graph data in the cloud. To see what we have created, we need to query the database.

Let’s start with how to examine the data model,

CALL db.schema.visualization()

For more information:

CALL apoc.meta.data()
YIELD label, property, type, array, other, elementType
RETURN label, property, type, array, other, elementType

Creating constraints#

Use CONSTRAINT ON & ASSERT to create constraints.

Uniqueness constraint - Ensures that a value for a property is unique for all nodes of that type.

CREATE CONSTRAINT ON (n:Person) ASSERT (n.name) IS UNIQUE
CREATE CONSTRAINT ON (n:Movie) ASSERT (n.title) IS UNIQUE

Existence constraint - Ensures that when a node or relationship is created or modified, it must have certain properties set.

CREATE CONSTRAINT ExistsMovieTagline ON (m:Movie) ASSERT exists(m.tagline)

Node key - Ensures that a set of values for properties of a node of a given type is unique.

CREATE CONSTRAINT UniqueNameBornConstraint
ON (p:Person) ASSERT (p.name, p.born) IS NODE KEY

For retrieving constraints defined for the graph

CALL db.constraints()

For dropping constraints

DROP CONSTRAINT constraintname

Inspecting schema#

Let’s inspect the schema for the movie data.

CALL db.schema.visualization()

Includes nodes (entities,objects,things) such as:

Movie
Person

People are related to Movies using these relationships:

DIRECTED
ACTED_IN
WROTE
REVIEWED

Querying nodes and relationships#

MATCH & RETURN#

Use the MATCH to match the nodes you ask for, and RETURN to retrieve the nodes.

Here we retrieve all the nodes.

MATCH (n) // returns all nodes in the graph
RETURN n

As you can remember, (n) here stands for all nodes. Now, what if we want to retrieve only the nodes with Person labels?

MATCH (p:Person) // returns all Person nodes in the graph
RETURN p

You can add whatever nodes you are interested in based on properties. Here we are returning nodes filtered on a property value.

MATCH (m:Movie {released: 2003, tagline: 'Free your mind'})
RETURN m

Until now, we have been returning nodes. If we want to return the properties in those nodes

MATCH (p:Person {born: 1965})
RETURN p.name, p.born

You can specify aliases or column headings using AS

MATCH (p:Person {born: 1965})
RETURN p.name, p.born AS `birth year`

Now let’s add in some relationships as well to return relationships.

MATCH (p:Person)-[rel:ACTED_IN]->(m:Movie {title: 'The Matrix'})
RETURN p, rel, m

We can also have nodes with no variables or anonymous nodes. In the above example, we are interested in only the persons who acted in ‘The Matrix’.

MATCH (p:Person)-[:ACTED_IN]->(:Movie {title: 'The Matrix'})
RETURN p.name

You can also query on multiple relationships

MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN|DIRECTED]->(m:Movie)
RETURN p.name, m.title

We can also have anonymous relationships.

MATCH (p:Person)--(m:Movie {title: 'The Matrix'})
RETURN p, m

MATCH (m:Movie)<--(p:Person {name: 'Tom Hanks'})
RETURN p, m

type() returns the type of a relationship.

MATCH (p:Person)-[rel]->(:Movie {title:'The Matrix'})
RETURN p.name, type(rel)

Let’s try adding some properties for relationships;

MATCH (p:Person)-[:REVIEWED {rating: 65}]->(:Movie {title: 'The Da Vinci Code'})
RETURN p.name

Traversing multiple relationships.

Since we have a graph, we can traverse through nodes to obtain relationships further into the traversal. In addition, you can assign a variable to the path and return the path as follows:

MATCH path = (p:Person)-[:FOLLOWS]->(:Person)-[:FOLLOWS]->(:Person {name:'Jessica Thompson'})
RETURN path

WHERE to Filter Queries#

Just like in SQL, we use WHERE to filter queries.

We already know how to write this query.

MATCH (p:Person)-[:ACTED_IN]->(m:Movie {released: 2008})
RETURN p, m

Can we write this using WHERE?

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released = 2008
RETURN p, m

Important

You can only refer to named nodes or relationships in a WHERE clause, so you must specify a variable for any node or relationship you are testing in the WHERE clause.

If we can write the same thing without using WHERE, then why do we need WHERE?

With the WHERE clause, you can specify potentially complex conditions for the query.

Here are some examples;

Using AND & OR in WHERE clause.

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released = 2008 OR m.released = 2009
RETURN p, m

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released >= 2003 AND m.released <= 2004
RETURN p.name, m.title, m.released

Important

IS NOT NULL, we can use in testing the existence of a property or a pattern. For the opposite of IS NOT NULL we use IS NULL.

Let’s test existance of a property using IS NOT NULL.

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name='Jack Nicholson' AND m.tagline IS NOT NULL
RETURN m.title, m.tagline

Using STARTS WITH in WHERE clause.

MATCH (p:Person)-[:ACTED_IN]->()
WHERE p.name STARTS WITH 'Michael'
RETURN p.name

Let’s use toLower() to convert a property to lower case.

MATCH (p:Person)-[:ACTED_IN]->()
WHERE toLower(p.name) STARTS WITH 'michael'
RETURN p.name

Note

Converting a property to lower case may not be performing well, as if an index has been created for this property, it will not be used at runtime.

Working with Patterns in Queries#

Traversal in a MATCH clause

Understanding how the graph engine performed this traversal can help write an efficient CQL query.

MATCH (follower:Person)-[:FOLLOWS]->(reviewer:Person)-[:REVIEWED]->(m:Movie)
WHERE m.title = 'The Replacements'
RETURN follower.name, reviewer.name

Specifying multiple MATCH patterns#

MATCH (a:Person)-[:ACTED_IN]->(m:Movie),
(m)<-[:DIRECTED]-(d:Person)
WHERE m.released = 2000
RETURN a.name, m.title, d.name

Note

When multiple patterns are specified in a MATCH clause, no relationship is traversed more than once.

We can also specify the above query using a single pattern.

MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
WHERE m.released = 2000
RETURN a.name, m.title, d.name

Why multiple match patterns?

Even though both queries mentioned above mean the same, there are situations when you can’t represent it in a single match pattern, especially when looking at complex queries. For e.g.:

Suppose we want to retrieve the movies that Meg Ryan acted in and their respective directors and the other actors that acted in these movies. Here is the query to do this:

MATCH (meg:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person),
(other:Person)-[:ACTED_IN]->(m)
WHERE meg.name = 'Meg Ryan'
RETURN m.title as movie, d.name AS director , other.name AS `co-actors`

Note

You can specify aliases or column headings using AS; you will be seeing it a lot in all upcoming queries.

Tip

It’s a good practice to use AS wherever you can, as it can guide your thoughts and print results in a meaningful way.

Multiple MATCH clauses,#

MATCH (valKilmer:Person)-[:ACTED_IN]->(m:Movie)
MATCH (actor:Person)-[:ACTED_IN]->(m)
WHERE valKilmer.name = 'Val Kilmer'
RETURN m.title as movie , actor.name

But we can also write the same using multiple patterns,

MATCH (valKilmer:Person)-[:ACTED_IN]->(m:Movie),
(actor:Person)-[:ACTED_IN]->(m)
WHERE valKilmer.name = 'Val Kilmer'
RETURN m.title as movie , actor.name as `Actor name`

A best practice is to traverse as few nodes as possible so in this example, using multiple MATCH patterns is best.

Important

Usually, the multiple MATCH clauses are used when dealing with Subqueries using WITH. We will discuss it soon.

Specifying varying length paths#

Syntax:

//Retrieve all paths of any length with the relationship, :RELTYPE from nodeA to nodeB and beyond
(nodeA)-[:RELTYPE*]->(nodeB) 
//Retrieve the paths of length 2 with the relationship, :RELTYPE from nodeA to nodeB:
(nodeA)-[:RELTYPE*2]->(nodeB)
// Retrieve the paths of length 3 with the relationship, :RELTYPE from nodeA to nodeB:
(nodeA)-[:RELTYPE*3]->(nodeB)

Experiment with the below query by varying depths, and see how it behaves.

MATCH (follower:Person)-[:FOLLOWS*3]-(p:Person)
WHERE follower.name = 'Paul Blythe'
RETURN p.name

Now what if we want to retrieve the paths of lengths 1, 2, or 3 with the relationship, :RELTYPE from nodeA to nodeB, nodeB to nodeC, as well as, nodeC to _nodeD) (up to three hops):

Syntax:

(nodeA)-[:RELTYPE*1..3]->(nodeB)

MATCH (follower:Person)-[:FOLLOWS*1..3]-(p:Person)
WHERE follower.name = 'Paul Blythe'
RETURN p.name

Finding the shortest path#

This shortestPath() function is very useful in many cases where there are many ways to reach from node A to node B.

MATCH p = shortestPath((m1:Movie)-[*]-(m2:Movie))
WHERE m1.title = 'A Few Good Men' AND
m2.title = 'The Matrix'
RETURN p

Note

Here we specify * for the relationship. This means to use any relationship for the traversal.

Can you?#

Write Cypher statements to:

Retrieve nodes from the graph.
Filter nodes retrieved using property values of nodes.
Retrieve property values from nodes in the graph.
Filter nodes retrieved using relationships.
Specify multiple MATCH patterns.
Specify multiple MATCH clauses.

Write Cypher WHERE clauses to test:

Equality.
Multiple values.
Ranges.
Labels.
Existence of a property.
String values.
Regular expressions.
Patterns in the graph.
Inclusion in a list.

Let’s check out the reference card again to see the covered topics.

Class activity#

Note

Check worksheet 6 in canvas for detailed instructions on activity and submission instructions.

Practice CQL.

Deleting Nodes and Relationships (OPTIONAL)#

MATCH (p:Person)
WHERE p.name = 'Jane Doe'
DELETE p

Tip

The most efficient way to delete a node and its corresponding relationships is to specify DETACH DELETE. When you specify DETACH DELETE for a node, the relationships to and from the node are deleted, then the node is deleted.

Can you try deleting this node? Can you delete it? Why can’t you?

MATCH (p:Person)
WHERE p.name = 'Liam Neeson'
DELETE p

MATCH (p:Person)
WHERE p.name = 'Liam Neeson'
DETACH DELETE p

More on CREATE (OPTIONAL- Additional information)#

We already saw about creating nodes and relationships. This section shows you that you can create relationships with existing nodes.

To demonstrate, let’s create some nodes.

CREATE (m:Movie:Action {title: ' Batman Begins'})
RETURN m.title

CREATE
(:Person {name: 'Michael Caine', born: 1933}),
(:Person {name: 'Liam Neeson', born: 1952}),
(:Person {name: 'Katie Holmes', born: 1978}),
(:Person {name: 'Benjamin Melniker', born: 1913})

If you want to add more labels to a node, you can query for the node you want to add, and after that, use SET to set the label.

Syntax for SET

SET x:Label // adding one label to node referenced by the variable x
SET x:Label1:Label2 // adding two labels to node referenced by the variable x

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m:Fantasy
RETURN labels(m)

Syntax: Removing labels from a node

// remove the label from the node referenced by the variable x
REMOVE x:Label
// remove the two labels from the node referenced by the variable x
REMOVE x:Label1, x:Label2

MATCH (m:Action)
REMOVE m:Action, m:Fantasy
RETURN labels(m)

Syntax: Adding properties to a node

SET x.propertyName = value
SET x.propertyName1 = value1 , x.propertyName2 = value2
SET x = {propertyName1: value1, propertyName2: value2}
SET x += {propertyName1: value1, propertyName2: value2}

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m.released = 2005, m.lengthInMinutes = 140
RETURN m

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m = {title: 'Batman Begins',
released: 2005,
lengthInMinutes: 140,
videoFormat: 'DVD',
grossMillions: 206.5}
RETURN m

Retrieving properties of a node

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
RETURN properties(m)

Updating properties - JSON-style

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m += { grossMillions: 300,
awards: 66}
RETURN m

Syntax: Removing properties from a node

REMOVE x.propertyName
SET x.propertyName = null

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m.grossMillions = null
REMOVE m.videoFormat
RETURN m

Creating Relationships

MATCH (a:Person), (m:Movie)
WHERE a.name = 'Katie Holmes' AND m.title = 'Batman Begins'
CREATE (a)-[rel:ACTED_IN {roles: ['Rachel','Rachel Dawes']}->(m)
RETURN a.name, rel, m.title

The SET and REMOVE syntax is the same as what we discussed with nodes.

MATCH (a:Person), (m:Movie)
WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
CREATE (a)-[rel:ACTED_IN]->(m)
SET rel.roles = ['Bruce Wayne','Batman']
RETURN a, rel, m

MATCH (a:Person)-[rel:ACTED_IN]->(m:Movie)
WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
REMOVE rel.roles
RETURN a, rel, m

Lecture 6: CQL 1

Contents

Lecture 6: CQL 1#

Announcements#

Expectations for CQL and MQL#

Todays Agenda#

Learning objectives#

Creating nodes & relationships#

Creating constraints#

Inspecting schema#

Querying nodes and relationships#

MATCH & RETURN#

WHERE to Filter Queries#

Working with Patterns in Queries#

Specifying multiple MATCH patterns#

Multiple MATCH clauses,#

Specifying varying length paths#

Finding the shortest path#

Can you?#

Class activity#

Deleting Nodes and Relationships (OPTIONAL)#

More on CREATE (OPTIONAL- Additional information)#