Final Project

Due Date for Part 1: Friday, March 31, by 5:00pm CDT

Due Date for Parts 2-4: Wednesday, April 26, by 11:59pm CDT

Only in Data

PART 1: Pitch

Form groups of two or three students to work collaboratively on the final project. You may choose your own groups, or you may ask the instructors to assign you to a group. Please let us know ASAP which group you are in, or whether you would like to be assigned to one.

The first part of the final project is to identify an interesting data set that you want to work on. The data set should be Engineering focused, broadly defined. The data set should also be amenable to CRUD operations and some sort of analysis (e.g. plotting some values or generating summary statistics). Most data sets that are in a list-of-dictionaries format and contain time stamps and/or quantitative values should work. The Meteorite Landing data or ISS Positional data are good examples of this, but should not be used for the final project. There are several links at the bottom of this page to Engineering-focused data sets, but you may look elsewhere too.
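As a rough illustration of what "list-of-dictionaries format" and "summary statistics" can mean in practice, here is a minimal sketch. The records, field names (`id`, `timestamp`, `value`), and the `summarize` helper are all hypothetical and only stand in for whatever data set your group chooses.

```python
from statistics import mean

# Hypothetical records; the field names and values are made up for illustration.
records = [
    {"id": "a1", "timestamp": "2023-01-05T12:00:00", "value": "10.5"},
    {"id": "a2", "timestamp": "2023-02-11T08:30:00", "value": "7.0"},
    {"id": "a3", "timestamp": "2023-02-20T17:45:00", "value": "12.5"},
]

def summarize(data, key):
    """Return count, min, max, and mean of one numeric field across all records."""
    # Values often arrive as strings, so convert them before doing arithmetic.
    values = [float(d[key]) for d in data if key in d]
    if not values:
        return {"count": 0}
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }

summary = summarize(records, "value")
print(summary)
```

If a candidate data set can be loaded into a structure like `records` and a function like this produces something meaningful, it is probably a reasonable fit for the project.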

Once your group has identified a potential data set to work on, write up a ~1 page summary containing the proposed title of your project, the list of group members, and a description of the data. Then schedule a ~10 minute meeting with at least one of the instructors in order to “pitch” your project. We want to know the source of the data, see what the data looks like, and hear your plan for working on it.

PART 2: Code Repository

The final project will involve building a REST API for interacting with an interesting Engineering data set. The API should allow users to perform basic CRUD operations - Create, Read, Update, Delete - to load in the data, view it in a RESTful / collections style, and delete the data. The API should also allow users to submit an analysis job, which is then offloaded to workers who generate images and store them in a database. The application must be hosted on the Kubernetes cluster and accessible to the outside world at a unique, public URL.
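The CRUD operations described above can be sketched, independent of any web framework, as a handful of functions over a key-value store. This is only one possible shape, not a required design: the plain dict `store` stands in for the database, every function name is hypothetical, and the route each function might back (for example `POST /data`, `GET /data`, `GET /data/<id>`, `DELETE /data`) is an assumption for illustration.

```python
# In-memory stand-in for the database (assumption); the real project would use Redis.
store = {}

def post_data(records):
    """Create: load records into the store, keyed by each record's 'id' field."""
    for rec in records:
        store[rec["id"]] = dict(rec)
    return len(store)

def get_data(offset=0, limit=None):
    """Read: return the whole collection, or a slice of it in key order."""
    items = [store[key] for key in sorted(store)]
    end = None if limit is None else offset + limit
    return items[offset:end]

def get_item(item_id):
    """Read: return a single record, or None if it does not exist."""
    return store.get(item_id)

def update_item(item_id, changes):
    """Update: merge new field values into an existing record."""
    if item_id not in store:
        return None
    store[item_id].update(changes)
    return store[item_id]

def delete_data():
    """Delete: remove every record and report how many were removed."""
    removed = len(store)
    store.clear()
    return removed

# Example usage with made-up records.
post_data([{"id": "a", "value": 1}, {"id": "b", "value": 2}, {"id": "c", "value": 3}])
subset = get_data(offset=1, limit=1)
```

In a Flask application, each of these functions would typically sit behind a route and return JSON, with `offset` and `limit` read from query parameters.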

Here are some specific requirements we will look for:

  • The front-end REST API must have:

    • a /help endpoint describing all the routes within

    • appropriate endpoints (and methods) to post / get / delete the Engineering data set

    • sufficient REST-style endpoints for returning the data collections (and subsets therein) in an intuitive way

    • endpoints for submitting a job to plot data and retrieve the results

  • The back-end workers must:

    • use queue functionality to watch for jobs being submitted by the user

    • generate a plot of some aspect of the data set as guided by the user’s job instructions

    • contain functionality to add the resulting image back into the Redis database so the user can download it

  • The Redis database must support:

    • a ‘raw data’ database holding the Engineering data set in a logical format

    • a ‘hot queue’ for handling job instructions

    • a ‘job results’ database for holding the images generated by the workers

    • a scheme for backing up the database at regular intervals in a way that it can be restored
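The interaction between the API, the hot queue, and the workers listed above can be sketched as follows. This is a simplified model under loud assumptions: the dicts `jobs` and `results` and the standard-library `queue.Queue` stand in for Redis databases and a Redis-backed queue, the `jobs` status record is an extra bookkeeping store not named in the requirements, and `make_plot` returns placeholder bytes instead of a real image. All names are hypothetical.

```python
import queue
import uuid

jobs = {}                  # stand-in for a job status store (assumption)
results = {}               # stand-in for the 'job results' database
hot_queue = queue.Queue()  # stand-in for the 'hot queue'

def add_job(params):
    """API side: record a new job and put its id on the queue."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"id": job_id, "status": "submitted", "params": params}
    hot_queue.put(job_id)
    return jobs[job_id]

def make_plot(params):
    """Placeholder for real plotting; returns bytes standing in for an image."""
    return f"plot of {params['field']}".encode()

def work_one():
    """Worker side: take one job id off the queue, run it, store the result."""
    job_id = hot_queue.get_nowait()
    jobs[job_id]["status"] = "in progress"
    results[job_id] = make_plot(jobs[job_id]["params"])
    jobs[job_id]["status"] = "complete"
    return job_id

# Example usage: submit one job, then let a worker process it.
job = add_job({"field": "value"})
done = work_one()
```

A real worker would run in its own container, block while waiting for new jobs rather than processing a single one, and write actual image bytes to Redis so that a results route can return them for download.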

The project must also include a well-written README following all the guidelines given in previous class assignments. This README should emphasize two sections: instructions for deploying and testing the application, and instructions for using the application.

Other files including Kubernetes configuration files, Dockerfile(s), and a docker-compose.yml file will be expected (see ‘What to Turn In’ below).

Note

A previous version of the above sentence said ‘Makefile(s), and functional test file(s)’ would be expected. That requirement has been changed to ‘a docker-compose.yml file’.

PART 3: Write Up

We are looking for a written document (maybe ~10-11 pages as a PDF) describing the project. The written document should be verbose and targeted towards a non-user, but technically savvy layperson (e.g. one of your fellow engineering students who is not taking this class). Here are some things we will be looking for:

  • Title page. Contains descriptive title, students' names

  • Write up contains logical progression of sections with appropriate headers

  • High level description with introduction to the project, describes the motivation

  • Detailed but concise description of the data

  • Key technologies (e.g. Flask, Docker, Kubernetes) are defined at a high level for people who might not know what they are

  • List of route endpoints is easy to read and gives a nice overall picture of the API

  • Usage section shows representative example code snippets - not necessarily exhaustive, but just enough

  • Ethical and Professional Responsibilities section is well thought out

  • Section connecting parts of this project to key software design principles (see Unit 04)

  • Citations page at the end

PART 4: Video Demo

Prepare a < 10 minute video demo of the application. Use Zoom to screen share and record your narration of the process. At a minimum, we want to see you deploy the application to Kubernetes, curl various routes to display select data, describe the data that you are showing and why it is important, curl the appropriate routes to submit an analysis job and retrieve and display the results, and highlight anything else you think is interesting or unique about your application.

What to Turn In

This Final project should be pushed into a standalone repo with a descriptive name. It should not be part of your existing homework repo. A sample Git repository may contain something similar to the following after completing the Final (your filenames may vary):

repo-name/
├── docker
│   ├── docker-compose.yml
│   ├── Dockerfile.api
│   └── Dockerfile.wrk
├── kubernetes
│   └── prod
│       ├── app-prod-api-deployment.yml
│       ├── app-prod-api-ingress.yml
│       ├── app-prod-api-nodeport.yml
│       ├── app-prod-db-deployment.yml
│       ├── app-prod-db-pvc.yml
│       ├── app-prod-db-service.yml
│       └── app-prod-wrk-deployment.yml
├── Makefile
├── README.md
├── requirements.txt
└── src
    ├── flask_api.py
    ├── jobs.py
    └── worker.py

Send an email to wallen@tacc.utexas.edu with the PDF write-up attached, a link to your new GitHub repository, and a link to download the Zoom recording. Please include “Final Project” in the subject line. We will clone all of your repos at the due date / time for evaluation. Only one email per group is required.

Additional Resources

Here are some example sites where you can find suitable data sets. This is not an exhaustive list.