"Leaders in business, education and government must take action to foster a new generation of talent with the technical expertise and unique ideas to make the most of this tsunami of Big Data."

-Richard Rodts, Manager of Global Academic Programs, IBM

Level 5 Course Offerings



Engineering & Physical Sciences

ASTR 306 (Physical Sciences)

DSCI 353 (Energy & Manufacturing)

Course description: Data science methods for inference, modeling and prediction.

In this course, we will use an open data science tool chain to develop reproducible data analyses useful for inference, modeling, and prediction of the behavior of real energy and manufacturing systems. In addition to the standard data cleaning, assembly, and exploratory data analysis steps essential to all data analyses, we will identify statistically significant relationships from datasets derived from population samples, and infer the reliability of these findings. We will use regression methods to model a number of both real-world and lab-based systems, producing predictive models applicable to comparable populations.

We will assemble and explore real-world datasets, use pair-wise plots to explore correlations, perform clustering, self-similarity analysis, and logistic regression, and develop both fixed-effect and mixed-effect predictive models. We will also introduce machine-learning approaches for classification and tree-based methods. Results will be interpreted, visualized and discussed.

We will introduce the basic elements of data science and analytics using the R Project for Statistical Computing. R analytics will be applied to energy systems (such as PV power plant degradation and building energy efficiency) over time, and to manufacturing systems to understand the principles of statistical process control and identify critical factors of variability and uniformity.

Learning Outcomes:

Familiarity with an open-data tool chain including R Statistics, scripting, functions, packages, automated data analysis, git versioning and Rmarkdown reproducible data science.

Familiarity with exploratory data analysis to guide data analysis

Familiarity with inference and significance of sample results to populations

Familiarity with regression and linear and non-linear statistical model building

Including training, testing and validating dataset strategies

Applications of domain knowledge and statistical analytics

To identify important predictors and develop initial predictive models

Familiarity with clustering, self-similarity methods

For categorization by different distance metrics

Introduction to machine-learning approaches such as tree-based methods

Data types include:

Time-series, spectral, image and higher order datatypes,

And their assembly to produce augmented and derivative datasets.

Dataset characteristics will include:

Variety: Of types of information, including both structured and unstructured data,

Volume: Data from human sources (vendors, suppliers, distributors, customers, etc.) and sensor networks of the energy system or factory, both small and large data volumes.

Velocity: Energy system and manufacturing supply chain changes will be included.



SYBB 459 (Translational ADS)

Description of omic data (biological sequences, gene expression, protein-protein interactions, protein-DNA interactions, protein expression, metabolomics, biological ontologies), regulatory network inference, topology of regulatory networks, computational inference of protein-protein interactions, protein interaction databases, topology of protein interaction networks, module and protein complex discovery, network alignment and mining, computational models for network evolution, network-based functional inference, metabolic pathway databases, topology of metabolic pathways, flux models for analysis of metabolic networks, network integration, inference of domain-domain interactions, signaling pathway inference from protein interaction networks, network models and algorithms for disease gene identification, identification of dysregulated subnetworks, network-based disease classification.
Offered as EECS 459 and SYBB 459.

SYBB 322 (Clinical ADS)



BAFI 361 (Finance)

This course is developed based on the feedback received from employers who have hired BS Management (finance) graduates in the past and will likely do so in the future. The goal is to enable students to use financial econometrics to effectively analyze financial data. The course will draw on theoretical aspects of BAFI 355 but focus on developing financial analytic skills. The applied nature of the course comes from the use of real, rather than theoretical, data. In other words, by using statistical methods to analyze real data, the student can address practical questions of high relevance to the finance industry. The scope of the data, as well as the quantitative methods used in such analysis, often requires familiarity with computational environments and statistical packages. As such, another goal of the course is to familiarize the student with at least one such environment.

MKMR 308 (Marketing)

Evaluation and control are important strategic marketing processes, and without effective and consistent measurement, these processes cannot be performed adequately. In recent years, marketing budgets have been challenged by top managers, as the value of these expenditures to an organization's financial well-being is often not clear. Marketing activities such as advertising, sales promotions, sales force allocation, new product development, and pricing all involve upfront investments, and these investments now face increasing scrutiny. This course will be about knowing and understanding what to measure, how to measure it, and how to report it so the link between marketing tactics and financial outcomes is clearer. The course will include lectures by the instructor, readings, cases, computer-based data exercises, and guest lectures. There will also be a team project requirement.

MKMR 310 (Marketing)

To appreciate, design, and implement data-based marketing studies for extracting valid and useful insights for managerial action that yield attractive ROI, five essential processes are emphasized: (a) making observations about customers, competitors, and markets, (b) recognizing, formulating, and refining meaningful problems as opportunities for managerial action, (c) developing and specifying testable models of marketing phenomena, (d) designing and implementing research designs for valid data, and (e) rigorously analyzing marketing data to uncover and test patterns and mechanisms.
Offered as MKMR 310 and ECON 310.

ECON 327 (Economics)

This class builds on the foundations of applied regression analysis developed in ECON 326.  The goal of the class is to equip students with the tools to conduct a causal analysis of a hypothesis in a variety of settings.  Topics will include causality, panel and time series data, instrumental variables and quasi-experiments, semi- and non-parametric methods, and treatment evaluation.
Offered as ECON 327 and ECON 427.