FIRE171: Capital One Machine Learning - Spring 2018

FIRE171 is the second-semester First-Year Research & Innovation Experience (FIRE) course for students enrolling in the FIRE Capital One Machine Learning Stream. The FIRE171 experience is designed to engage and immerse students in authentic research and scholarship.

FIRE171 is a collaborative research engagement focusing on the development and deployment of machine learning. Machine learning is a subdiscipline of computer science focused on the capacity of computer systems to learn from data without being explicitly programmed to perform specific tasks. This capacity grows out of work in pattern recognition and artificial intelligence (AI). The Capital One Machine Learning research stream will focus on using machine learning to develop algorithms for predictive data analytics, using both supervised and unsupervised approaches.
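As a rough illustration of what learning from data rather than from hand-written rules can look like, here is a minimal supervised-learning sketch in plain Python. It fits a straight line to a few labeled examples by ordinary least squares and then predicts a value for an unseen input. The function name `fit_line` and the toy data are assumptions made for this illustration; they are not taken from the course materials.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy labeled data (made up for this sketch): each x comes with a known y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# The relationship between x and y is never written into the program;
# it is estimated from the examples.
slope, intercept = fit_line(xs, ys)

# Use the learned parameters to predict y for an input not in the training data.
prediction = slope * 5.0 + intercept
print(slope, intercept, prediction)
```

The same pattern (estimate parameters from labeled examples, then predict on new inputs) underlies the supervised methods covered later in the course, although those methods use richer models than a single line.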

This course will focus on concepts related to the process of independent research, including collaboration with peers, communication of ideas, troubleshooting unexpected outcomes, and discipline-specific methodologies. Scheduled class meetings will 1) introduce students to the process and methods of machine learning, both supervised and unsupervised, 2) engage students in the discussion of primary literature, and 3) continually review individual and team research progress and troubleshoot research issues. Lab sessions in the research space will focus on training in current discipline-specific methods and practices, giving students relevant experiences that build resilience and critical analysis skills.

Learning Outcomes

By the end of the semester, the student should be able to understand and apply (subject to change as the semester progresses):

Data Understanding & Preprocessing

  • Dataset and data types
  • Attributes and attribute types
  • Data sampling
  • Data description
  • Data visualization
  • Data errors and correction
  • Data scaling
  • Data reduction and feature selection
  • Data importing
  • Data encoding and transformation

Unsupervised Learning

  • Motivation and types of unsupervised learning
  • Measures of similarity and dissimilarity
  • K-means clustering
  • Hierarchical clustering
  • Density-based clustering
  • Graph-based clustering
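
To give a flavor of the clustering topics listed above, here is a minimal K-means sketch in plain Python. It assumes squared Euclidean distance as the dissimilarity measure and, to keep the result reproducible, uses the first k points as initial centroids; real implementations usually choose initial centroids randomly or with a smarter seeding scheme. The function name `kmeans` and the toy points are hypothetical and not part of the course materials.

```python
def kmeans(points, k, iterations=10):
    """Cluster points into k groups (illustrative sketch, not a library API)."""
    # Assumption for this sketch: deterministic initialization from the first k points.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[idx] = distances.index(min(distances))
            clusters[labels[idx]].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # Assignments have stabilized.
        centroids = new_centroids
    return centroids, labels


# Toy unlabeled data (made up): two loose groups, near (1, 1) and near (9, 9).
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (8.0, 9.0), (2.0, 1.0), (9.0, 8.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

No labels are supplied here; the grouping emerges only from the distances between points, which is what separates this from the supervised methods in the next list.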

Supervised Learning

  • Motivation and types of supervised learning
  • Decision trees
  • Rule-based classifiers
  • Bayesian classifiers
  • Linear regression
  • Support vector machines
  • Model selection, evaluation, and cross-validation
  • Ensemble methods

Deep Learning

  • Motivation and types of deep learning
  • Artificial neural networks
  • Autoencoders
  • Convolutional neural networks
  • Recurrent neural networks

Lectures & Videos

Lesson 1 - About the Capital One Machine Learning Research Stream

Lesson 2 - Data, Attributes, and Data Quality

Lesson 3 - Data Understanding and Reduction

Lesson 4 - Data Scaling and Similarity

Lesson 5 - Supervised Learning: Basic Concepts and Performance Measures

Lesson 6 - Supervised Learning: Neural Networks and Deep Learning

Lesson 7 - Unsupervised Learning: Clustering and Evaluation Techniques

Lesson 8 - Ensemble Methods

Lesson 9 - Kaggle InClass Competition Walkthrough

Lesson 10 - Research Proposal and Literature Review

Lesson 11 - Proposal Presentation and Discussion

Lesson 12 - Computer Vision and Convolutional Neural Networks

Lesson 13 - Sequence Models and Recurrent Neural Networks

Lesson 14 - Computer Vision (Guest Lecture by Prof. Larry Davis)

Lesson 15 - Overview of What’s Next

References & Resources

Some of the diagrams and teaching materials are based on references and resources that are not mine. Here are the references and resources you might find helpful for this course (and hopefully beyond).

People

Here's a photo of who we are:

FIRE COML Spring 2018 photo

Instructor

Raymond Tu

Peer Mentors

Allison Buller; Kathleen Hamilton; Yan Wang