FIRE171: Capital One Machine Learning - Spring 2018
FIRE171 is the second-semester First-Year Research & Innovation Experience (FIRE) course for students enrolling in the FIRE Capital One Machine Learning Stream. The FIRE171 experience is designed to engage and immerse students in authentic research and scholarship.
FIRE171 is a collaborative research engagement focusing on the development and deployment of machine learning. Machine learning is a subdiscipline of computer science focused on the capacity of computer systems to learn without being explicitly programmed to perform specific tasks. This capacity draws on pattern recognition and artificial intelligence (AI). The Capital One Machine Learning research stream will focus on using machine learning to develop algorithms for predictive data analytics using both supervised and unsupervised approaches.
This course will focus on the concepts related to the process of independent research, including collaboration with peers, communication of ideas, troubleshooting unexpected outcomes, and discipline-specific methodologies. Scheduled class meetings will 1) introduce students to the process and methods of machine learning, both supervised and unsupervised, 2) engage students in the discussion of primary literature, and 3) continually review individual and team research progress and troubleshoot research issues. Lab sessions in the research space will focus on training in current discipline-specific methods and practices, giving students relevant experiences that build resiliency and critical analysis skills.
By the end of the semester, the student should be able to understand and apply (subject to change as the semester progresses):
Data Understanding & Preprocessing
- Dataset and data types
- Attributes and attribute types
- Data sampling
- Data description
- Data visualization
- Data errors and correction
- Data scaling
- Data reduction and feature selection
- Data importing
- Data encoding and transformation
Unsupervised Learning
- Motivation and types of unsupervised learning
- Measures of similarity and dissimilarity
- K-means clustering
- Hierarchical clustering
- Density-based clustering
- Graph-based clustering
Supervised Learning
- Motivation and types of supervised learning
- Decision trees
- Rule-based classifiers
- Bayesian classifiers
- Linear regression
- Support vector machines
- Model selection, evaluation, cross validation
- Ensemble methods
Deep Learning
- Motivation and types of deep learning
- Artificial neural networks
- Convolutional neural networks
- Recurrent neural networks
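To give a flavor of how two of the topics above fit together, here is a minimal sketch of data scaling followed by K-means clustering. It is an illustration only, not course material: the function names, the toy data (age and annual income), and the fixed random seed are all assumptions made for this example, and it uses plain Python rather than any particular library.

```python
import random


def min_max_scale(rows):
    """Rescale each column of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def squared_distance(a, b):
    """Squared Euclidean distance, used here as the dissimilarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=20, seed=0):
    """Basic K-means (Lloyd's algorithm); returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels


# Hypothetical toy data: [age, annual income]. Without scaling, income
# would dominate the distance computation because of its larger range.
raw = [[22, 20000], [25, 24000], [24, 22000],
       [58, 91000], [61, 98000], [60, 95000]]
scaled = min_max_scale(raw)
centroids, labels = k_means(scaled, k=2)
print(labels)
```

The first three rows and the last three rows should end up in different clusters; which cluster is numbered 0 or 1 depends on the random initialization.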
Lectures & Videos
Lesson 1 - About the Capital One Machine Learning Research Stream
Lesson 2 - Data, Attributes, and Data Quality
Lesson 3 - Data Understanding and Reduction
Lesson 4 - Data Scaling and Similarity
Lesson 5 - Supervised Learning: Basic Concepts and Performance Measures
Lesson 6 - Supervised Learning: Neural Networks and Deep Learning
Lesson 7 - Unsupervised Learning: Clustering and Evaluation Techniques
Lesson 8 - Ensemble Methods
Lesson 9 - Kaggle InClass Competition Walkthrough
Lesson 10 - Research Proposal and Literature Review
Lesson 11 - Proposal Presentation and Discussion
Lesson 12 - Computer Vision and Convolutional Neural Networks
Lesson 13 - Sequence Models and Recurrent Neural Networks
Lesson 14 - Computer Vision (Guest Lecture by Prof. Larry Davis)
Lesson 15 - Overview of What’s Next
References & Resources
Some of the diagrams and teaching materials are based on references and resources that are not mine. Here are the references and resources you might find helpful for this course (and hopefully beyond).
- CSE 40647/60647: Data Mining, Everaldo Aguiar and Reid Johnson, University of Notre Dame
- Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, and Vipin Kumar
- CS231n: Convolutional Neural Networks for Visual Recognition, Fei-Fei Li et al.
- Analytics Vidhya
- Towards Data Science
- Data Scientist with Python, DataCamp
- Deep Learning Specialization, Andrew Ng et al., Coursera
- Deep Learning with Python, François Chollet, Manning Publications Co.
Capital One offers a broad array of financial products and services to consumers, small businesses and commercial clients in the U.S., Canada and the UK. Capital One is a major partner in the design and creation of the FIRE stream Capital One Machine Learning.
DataCamp offers interactive R and Python courses on topics in data science, statistics, and machine learning. Learn from a team of expert teachers in the comfort of your browser with video lessons and fun coding challenges. DataCamp sponsors the FIRE stream Capital One Machine Learning by providing interactive video lessons.