Data Mining: MIS 382N.9
Professor Maytal Saar-Tsechansky
Tuesday, Thursday 3:30pm-5:00pm (UTC 1.146)
Course Overview
Data Mining is an applied course introducing popular data mining methods for extracting intelligence from business data. Organizations today are applying data mining methods to identify and appeal to higher-value customers, customize their product offerings, or minimize losses due to erroneous decision-making and fraud. The role of data-driven intelligence is becoming increasingly critical, and a rigorous understanding of the methods and their applications can provide managers and IS professionals with important tools to improve decision-making. Topics and related methods discussed in the class include personalization, customer relationship management, intelligent marketing, risk management, web mining and operations. We will discuss the inner workings of the methods to the level necessary to develop an understanding of when and how to use each technique. Students will also acquire hands-on experience working in teams and using state-of-the-art software to develop data mining solutions to business problems.
Textbook (available at the bookstore). Some reading materials are available in PostScript or Adobe Acrobat (PDF) format. One PostScript viewer is Ghostscript. To read .pdf files you need Adobe's Acrobat Reader.
Articles featuring data mining:
Additional readings are available for download from the course syllabus table (see below) or will otherwise be distributed in class.
WEKA Software and Documentation
WEKA is an open-source machine learning software package that we will use in class. The WEKA web site includes the software download and documentation.
Office Hours
By appointment, CBA, Room 5.238
Teaching Assistant: Michelle Hsuan-Wei Chen (MICHELLEHWCHEN@MAIL.UTEXAS.EDU)
Course Requirements and Grading
Style: This is a lecture-style course; however, student participation is important. Students are required to read the material and be prepared before class. Students are required to attend all sessions and to discuss any absence from class with the instructor.
Assignments and Projects: You will hand in weekly individual write-ups. Answers should be well thought out and concise. Assignments must be submitted by the due date.
Late assignments: Turn in your assignment early if there is any uncertainty about your ability to turn it in on the due date. Assignments up to one week late will have their grade reduced by 50%. After one week, late assignments will receive no credit.
Projects: There will be a team project (maximum two students per team). Students will address business problems with data mining techniques. Each team will hand in a brief report (80% of the project grade) and prepare a short class presentation of its work (20% of the project grade). A class discussion will follow the presentations. There will be no final exam.
Grade breakdown:
1. Involvement: 10%
2. Assignments: 30%
3. Case study: 20%
4. Team project: 40%
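For illustration only, the breakdown and the late-assignment rule can be read as simple arithmetic. The sketch below is not an official grade calculator: it assumes every component is scored on a 0-100 scale, that the components combine as a plain weighted sum, and that "up to one week late" means 1 to 7 days. The function and dictionary names are hypothetical.

```python
# Weights taken from the grade breakdown in this syllabus.
WEIGHTS = {
    "involvement": 0.10,
    "assignments": 0.30,
    "case_study": 0.20,
    "team_project": 0.40,
}


def late_penalty(score, days_late):
    """Late policy as read here (an assumption): on time keeps the full
    score, 1-7 days late loses 50%, later than that earns no credit."""
    if days_late <= 0:
        return score
    if days_late <= 7:
        return score * 0.5
    return 0.0


def project_grade(report, presentation):
    """Team project: the report counts 80%, the presentation 20%."""
    return 0.8 * report + 0.2 * presentation


def course_grade(scores):
    """Weighted sum of the four components (each assumed to be 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Made-up example scores, not real data.
    example = {
        "involvement": 90,
        "assignments": 80,
        "case_study": 85,
        "team_project": project_grade(report=90, presentation=80),
    }
    print(round(course_grade(example), 1))
```

Under these assumptions, an assignment scored 80 and handed in three days late would count as `late_penalty(80, 3)`, that is, 40.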
Tentative Course Schedule
|
Date |
Topic |
Readings
|
Assignments due |
|
January 17 |
Introduction to the course. Introduction to data mining.
|
Chapters 1 & 2 | |
|
January 19 |
Introduction (Contd.)
|
Chapters 1 & 2 | |
|
January 24 |
Fundamental concepts and definitions
|
||
|
January 26 |
Classification: Recursive partitioning & Decision Trees
|
Ch. 2 pp. 39-42 (revisit), Ch. 6 pp. 165-194, 209 |
Question set #1 |
|
January 31 |
Classification: Recursive partitioning & Decision Trees (Contd.) |
Question set #2 | |
|
February 2 |
Classification: Recursive partitioning & Decision Trees (Contd.) | ||
|
February 7 |
Model Evaluation |
||
|
February 9 |
Association Rules and Sequential Patterns. Personalization: K-Nearest Neighbor Classification Algorithm |
Pages 287-315; Chapter 8: pp. 257-271 |
Question set: Model Evaluation |
|
February 14 |
Lab session: WEKA
|
Hands-on #1: Installing and running WEKA
|
|
|
February 16 |
Personalization: Collaborative filtering
|
|
|
|
February 21 |
Clustering Analysis
|
Chapter 11: 349-365 |
Hands-on exercise with WEKA #2; Question set: collaborative filtering |
|
February 23 |
Clustering Analysis; WEKA lab session |
||
| February 28 |
WEKA lab session; Bayesian learning with applications to spam filtering |
Chapter 8: pp. 257-271. Download supplementary reading from Blackboard |
Team project proposal. |
|
March 2 |
Guest Speaker: Dr. David Moriarty
David Moriarty is the Director of Data Mining at Apple Computer, where he leads a group of scientists developing analytic solutions to large-scale business problems. Specifically, Dr. Moriarty leverages data patterns to optimize strategic decisions in various business areas, including fraud detection, product quality, logistics, and sales. Dr. Moriarty received an M.S. and a Ph.D. in computer science from the University of Texas at Austin, specializing in artificial intelligence and machine learning. He regularly serves on journal and conference review committees and is a founding member of the Merchant Risk Council. Before Apple Computer, David designed intelligent algorithms at the Naval Research Laboratory, Daimler-Chrysler Research Center, USC Information Sciences Institute, and Intelligent Technologies Corporation.
|
Question set: clustering
|
|
|
Week of March 6th |
Team work on class project (in class) | ||
|
Week of March 13th |
Spring Break |
||
|
March 21 |
Bayesian learning with applications to spam filtering |
Chapter 8: pp. 257-271
|
|
|
March 23 |
Guest Speaker: Dr. Pramod Singh of HP
Dr. Singh manages the Global Analytics Solutions group in the Information Technology organization at Hewlett-Packard. He leads a team of data miners and solution architects and is responsible for the development and deployment of analytics solutions for HP. Dr. Singh has been with HP for over 5 years. Prior to joining HP, Dr. Singh worked for Wal-Mart’s Information Systems Division in several areas, using data mining to support assortment planning, customer segmentation and market basket analysis.
Dr. Singh received Ph.D. and M.S. degrees in Mathematics from The University of Arkansas and an MBA from The University of Jammu. Dr. Singh is the author of research papers and patents and has presented his work at various conferences.
|
||
|
March 28 |
Team presentation of Harrah's Case. Class discussion
|
By Sunday, 3/26/2006: submit executive reports of Harrah's case. Prepare questions for other teams. |
|
|
March 30 |
Team presentation of Harrah's Case (Contd.); Genetic Algorithms |
||
|
April 4 |
Genetic Algorithms (Contd.)
Time permitting: lab session in class (team work on term project)
|
Question set: Spam filtering | |
|
April 6 |
Guest Speaker: Dr. Ahmet Kuyumcu, Zilliant
Dr. H. Ahmet Kuyumcu is an independent pricing consultant and has over 10 years of practical experience in delivering data-driven, technology-based pricing solutions across a variety of Fortune 500 firms. He is currently engaged with a major gaming company to combine their pricing, revenue management, and promotion management activities. Dr. Kuyumcu was the chief pricing scientist at Zilliant, Inc. and led the science aspects of the product development efforts. Prior to Zilliant, he pioneered innovative price optimization algorithms for the media, travel-transportation, and multi-family housing industries at Manugistics, Inc. Dr. Kuyumcu frequently speaks at conferences on the practice of pricing and revenue management and has published several articles in academic journals. He also teaches a graduate-level class in pricing and revenue management at the University of Texas at Austin. Dr. Kuyumcu is a board member of the revenue management and pricing section of INFORMS. He has M.S. and Ph.D. degrees in Operations Research from Texas A&M University.
Pricing and Revenue Management and its Hotels and Gaming Resorts
Pricing and Revenue Management (PRM) uses historical data and mathematical models to predict customers’ behavior at a micro-market level and to optimize product availability and/or price to maximize revenue and profit. PRM was first applied to the airline industry shortly after the U.S. Congress passed the Airline Deregulation Act of 1978. Since deregulation, PRM has generated billions of additional dollars for many industries including, but not limited to, airlines, hotels, car rental firms, cruise lines, media companies, utility firms, apartments, railroads, wholesalers, and manufacturers. Although PRM concepts are similar across different industries, their application varies significantly. This talk gives a brief overview of pricing and revenue management problems and provides a real-world application for the hotel and gaming industries. |
Kuyumcu, H. A. (2002) Gaming Twist in Hotel Revenue Management. Journal of Pricing and Revenue Management, 1, 161-168 | |
|
April 11 |
Artificial Neural Networks |
Question set: Genetic Algorithms | |
|
April 13 |
Neural Networks
|
||
|
April 18 |
Ensemble models; Related Technologies |
||
|
April 20 |
Guest Speaker: Dr. Gerald Fahner, Fair Isaac
Dr. Gerald Fahner is Analytic Science-Director at Fair Isaac Corporation’s Core Analytic R&D group, where he develops innovative analytics for business prediction and decision problems. Before joining Fair Isaac, he served as a researcher in machine learning and robotics. Gerald received a Physics diploma from the University of Karlsruhe and earned his Computer Science doctorate from the University of Bonn, Germany.
|
TBA | |
|
April 25 |
Preparation for final project (class consultation) | ||
| April 27 | Preparation for final project (class consultation) | ||
| May 2 | Team projects - presentations and discussion | Final projects are due | |
|
May 4 |
Team projects - presentations and discussion |