CSE 512: Distributed Database Systems
[Fall 2020][Spring 2020][Fall 2019][Spring 2019][Fall 2018][Spring 2018][Fall 2017][Fall 2016]

The course covers the following main topics: distributed database architectures, distributed data storage and indexing, distributed query processing and optimization, concurrency control in distributed database systems, replicated data management, and reliability in distributed databases. The teaching style is not purely lecture-based: students need to do the assigned reading before class and come prepared to participate in in-class discussions. A group project makes up the largest part of the course. In it, students build a complete system prototype that extends the functionality of a distributed data management system to support a new application. Students also write a technical report that describes and experimentally evaluates the system they built.


CSE 412: Database Management
[Spring 2020][Spring 2019][Spring 2018]

Database systems are used to provide convenient access to disk-resident data through efficient query processing, indexing structures, concurrency control, and recovery. Traditionally, database systems were a place where data sat at rest, waiting for external access. This course serves as an introduction to database systems. More specifically, it covers the following main topics: database design and modeling, data storage and indexing, query processing and optimization, transaction management, database security, and data analytics. The ultimate goal is to master data modeling and to extract information stored in a database using an existing database management system (DBMS). Course participants will learn the terminology of the field and understand the DBMS landscape. By the end of this course, students should be able to: (1) Create an efficient relational data model that fits the needs of the database application. (2) Create a database, store it on a computer server, and load data into it. (3) Write SQL (a language for accessing databases) programs that issue queries over the database to read, edit, analyze, and summarize data. (4) Create a backup of a database and safely migrate the database to another computer server.


CSE 511: Data Processing at Scale
Database systems are used to provide convenient access to disk-resident data through efficient query processing, indexing structures, concurrency control, and recovery. This course delves into new frameworks for processing and generating large-scale datasets with parallel and distributed algorithms, covering the design, deployment, and use of state-of-the-art data processing systems that provide scalable access to data. Specific topics covered include scalable query processing, indexing structures, distributed database design, parallel query execution, data management in cloud computing environments, the CAP theorem, data management in MapReduce and Apache Spark, and NoSQL database systems.


CSE 591: Advanced Data Systems
The course is organized as a series of seminars presented by the instructor and students. The instructor will present state-of-the-art techniques for various advanced database topics. Each student is expected to present two to three papers on a given topic. After attending a seminar, the other students are expected to submit a half-page summary that highlights the merits and challenges of the presented papers. Each student will be asked to choose a topic and provide: (1) a survey report that summarizes the state-of-the-art techniques for the chosen topic, and (2) a term-long project, which can be done in a group of two. The project will involve either implementing some of the techniques covered in class with modifications, or performing comparative studies of alternative techniques. A good project could lead to a publishable paper.