Lecture 0: Course Introduction
Gittu George, January 3 2023
Teaching squad#
Instructor#
I am Gittu George, Ph.D
I am a Postdoctoral Fellow.
Email Me: ggeorg02@cs.ubc.ca
Office Hours: Tue 2-3 pm
My research interests are at the intersection of computer science and genomics.
I primarily teach Computer Science and Master of Data Science students.
I re-developed this course based on the syllabus from Winter 2020.
Teaching Assistants#
Daniel Ramandi: I’m a PhD student working in a neuroscience lab in the Department of Psychiatry, looking at neural correlates of behavior. I use machine learning to model brain activity and behavior. This is my third year TAing this course, and I can’t wait to meet everyone. I’m originally from Iran, and I love swimming, cooking, and Vancouver! :))
Ngoc Bui
Olivia Garland: I’m a second-year Masters of Bioinformatics student. My research is in the field of drug discovery using deep learning.
Shanny Lu
Vishal Desh
Zac Warham: I am a second-year Master of Science student studying how seals navigate in the open ocean. I have a background in marine biology and web development, and I moved to Canada from Australia in 2021.
Today’s Agenda#
Course Overview
Data management in a big data environment
What is big data?
Which tool to use?
How big is big data?
Introduction to cloud computing
Course Overview#
My Goals for the Course#
To think critically about databases as part of an analytic workflow
To learn how to design and use SQL-based databases and understand their inner workings
To take you from level zero to intermediate with NoSQL databases (document and graph-based databases)
To work with the data to find the tools best suited to answering the questions you pose
What this course is not about#
SQL or Python programming
Cloud computing
Course plan#
Date | Topic | Assessments due |
---|---|---|
January 3 | Introduction to Big Data & cloud computing | |
January 5 | Introduction to RDS and interaction with AWS | |
January 10 | Faster SQL (Indexing) | |
January 12 | (de)Normalization & Data Warehousing | |
January 17 | Introduction to NoSQL and Graph Databases | |
January 19 | Querying Graph Databases (Part 1) | |
January 24 | Querying Graph Databases (Part 2) | |
January 26 | Document Databases Intro | |
January 31 | Querying Document Databases | |
February 2 | Class Conclusions / Special Topics | |
Course Model#
Individual Assignments (50 %)#
Assignment 1 (16 %)#
An introduction to AWS and to working with Postgres in a Jupyter notebook.
Think about data in the context of a research problem.
Set up your AWS account.
Launch your database in AWS.
Use database dumps to set up your database.
Apply knowledge in indexing and warehousing to efficiently answer your questions in SQL.
Assignment 2 (17 %)#
An introduction to graph databases using the initial Twitter data.
Using the graph to answer questions about networks of interaction.
Producing interactive plots to represent knowledge.
Practice writing CQL queries.
Assignment 3 (17 %)#
An introduction to Document databases.
Practice writing MQL queries.
Worksheets (10 %)#
Every lecture will have a worksheet (in addition to the assignments) that will help you to prepare for your assignments and practice what you learned in class.
Conclusion#
Ask yourself whether you are comfortable with the course logistics.
Join Piazza if you haven’t done so; the link is on Canvas and in the announcements. Questions about lectures, logistics, and assignments should be asked on Piazza. Make sure you attach the correct labels so that we can tell the questions apart.
If there is anything else, please feel free to reach out to me at ggeorg02@cs.ubc.ca
I will be releasing lecture notes before Monday morning so that you can look through them before coming to class.
Check the deadlines on Canvas.
Join iClicker Cloud.
Make sure you have completed the necessary installations.