Author Archives: Charles Nicholson

Graduate Seminar: eBay Machine Learning Engineer!

eBay Machine Learning Graduate Seminar

Weili Zhang was the first analytics lab @ OU student to join the team, the first MS Data Science and Analytics graduate from OU, and will be Dr. Nicholson’s first student to complete his PhD in Industrial & Systems Engineering. He accepted a machine learning job at eBay last year in San Jose, CA, but is back this week to defend his PhD research on Friday, December 8, and then at 4:30p, to give a seminar presentation, open to the public, on machine learning at eBay. I expect this to be a pretty casual meeting and expect that Weili will be open to lots of Q&A and discussion.

It is my great pleasure to invite you to attend the seminar if you can: Friday, December 8, 2017 @ 4:30p in the Carson Engineering Center, Room 117 (map below). Also if you would like to join remotely, you can connect via Zoom:

New Masters 2017!

This week I am very happy to congratulate all of the students completing their Master of Science and PhD degrees.

Several of these students are my advisees and I am quite proud of their accomplishments.  As of today, all of my MSc students have defended their work.  And on Friday, my first PhD student will defend his research.  I’ll post the results of that as soon as I have them!

For now, let’s focus on the Analytics Lab 2017 new masters!

New Masters and the MSc Research Path

The Master’s thesis student has three major components of their academic path: (1) successful completion of rigorous graduate course work; (2) an in-depth research effort, spanning one to two years, on an area of specialization that results in the Master’s thesis (usually a 50 to 100 page manuscript detailing the background of the problem, the complexities of the work, and the results); and (3) the Master’s defense.

The defense is a presentation, to a committee of faculty members and any others present, summarizing the student’s entire research effort.  During the defense, the committee members ask questions relating to any detail of the work.  Questions are aimed at determining whether or not the student truly understands the concepts, methods, and results.  These are often open-ended and require critical, yet on-the-spot, reflection about the student’s work.

Most defenses last 30 minutes to 1 hour, but some may exceed 1.5 hours, depending on the questions and student responses.  While the process is not ‘grueling’ per se, it is significant.

Successful defenders…

This semester, I am privileged to participate on 8 MS thesis committees and 2 PhD committees of students completing in December.  Most of the defenses are occurring this week.  So it is a busy week!

However, I am particularly happy about the successful results of 4 of the MS students, since I am their advisor.  Congratulations to Yunjie “Nicole” Wen, Gowtham Talluru, Samineh Nayeri, and Pauline Ribeyre!

Yunjie “Nicole” Wen, Masters of Science in Data Science and Analytics

Thesis: Game theory application of resilience community road-bridge transportation system

Abstract: “This paper considers the problem of game theory application in resilience-based road-bridge transportation network. Bridges in a community may be owned and maintained by separate entities. These owners may have different and even competing objectives for recovering the transportation system after disaster. In this work, we assume that each player attempts to maximize the efficiency of repair to the system from the perspective of their own damaged bridges after a hazard. The problem is modeled as an N-player nonzero-sum game. Strategic form and sequential form game are designed to demonstrate methodology.  A genetic algorithm is applied to the computation of the problem. The transportation network from Shelby County, TN is used to demonstrate the proposed methodology.”

Nicole will be continuing her academic career by pursuing a PhD in Industrial and Systems Engineering at the University of Oklahoma.

Gowtham Talluru, Masters of Science in Data Science and Analytics

Thesis: Dynamic Uplift Modelling

Abstract: “A new approach to Uplift modelling which considers time dependent behavior of the customers is analyzed. Uplift modelling (also known as true lift or incremental modeling) has applications in marketing, insurance, banking, personalized medicine, among other fields. The objective of an Uplift model is to identify individual entities who should be targeted for treatment (e.g., a marketing campaign) to maximize the incremental impact overall.

Research to-date has considered this as a static problem modelled at a single instance of time.  The method introduced in this work considers modelling uplift in a dynamic environment.  In particular, I consider a series of direct marketing contacts and simulate periodic purchasing behavior of customers.  In contrast to static uplift models, the uplift in the purchase probability of the customers is dependent on time as well as customers’ previous purchases and offers received.  Appropriate modifications are made to static model approaches to adapt them to a dynamic model approach.

This study demonstrates significant potential for both researchers and retail companies for thinking about the problem of uplift longitudinally.”
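To make the notion of uplift concrete, here is a minimal sketch of the simplest static estimate: the purchase rate among customers who received an offer minus the purchase rate among those who did not. This is an illustration only, not the method from the thesis; the function name and the toy data are hypothetical.

```python
def estimate_uplift(treated_outcomes, control_outcomes):
    """Static uplift estimate: purchase rate (treated) minus purchase rate (control).

    Each argument is a list of 0/1 purchase indicators.
    """
    if not treated_outcomes or not control_outcomes:
        raise ValueError("both groups must be non-empty")
    rate_treated = sum(treated_outcomes) / len(treated_outcomes)
    rate_control = sum(control_outcomes) / len(control_outcomes)
    return rate_treated - rate_control


# Hypothetical toy data: 1 = purchased, 0 = did not purchase.
treated = [1, 0, 1, 1, 0, 1, 0, 1]  # 5 of 8 customers who got the offer purchased
control = [0, 0, 1, 0, 1, 0, 0, 0]  # 2 of 8 customers without the offer purchased

print(estimate_uplift(treated, control))
```

A dynamic approach, as described in the abstract, would let this quantity depend on the time period and on each customer’s history of purchases and offers rather than treating it as a single number.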

Gowtham has accepted a prestigious job in data science with PricewaterhouseCoopers (PwC) in the Oil and Gas sector of their business.

Samineh Nayeri, Masters of Science in Industrial & Systems Engineering

Thesis: Decomposition algorithm in fixed charge time-space network flow problems

Abstract: “A wide range of network flow problems primarily used in transportation is categorized as time-space fixed charge network flow problems. In this family of networks, each node is associated with a specific time and is replicated across all time-periods. The cost structure in these problems consists of variable and fixed costs where continuous and binary variables are required to formulate the problem as a mixed integer linear program, and the problem is known to be NP-hard.  When the time dimension is added to the problem, solution approaches are even more time-consuming and CPU and memory intensive.

In this work, a decomposition heuristic is proposed that subdivides the problem into various time epochs to create smaller and more manageable subproblems.  These subproblems are solved sequentially to find an overall solution for the original problem. To evaluate the capability and efficiency of the decomposition method vs. exact method, a total of 1600 problems are generated and solved using Gurobi MIP solver, which runs parallel branch & bound algorithm. Statistical analysis indicates that depending on the problem specification, the average solution time in the decomposition is improved by more than four orders of magnitude and the solutions found are high quality (<2.5% from optimal, on average).”

Pauline Ribeyre, Masters of Science in Industrial & Systems Engineering

Thesis: Finding key characteristics of promising drug compounds for anticancer drug discovery

Abstract: “Multidrug resistance is the simultaneous resistance to two or more chemically unrelated therapeutics, including some therapeutics the cell has never been exposed to. It is one of the biggest obstacles to effective cancer chemotherapy treatments. Multidrug resistance can be caused by drug efflux, an otherwise useful body mechanism that prevents a too-high drug concentration in cells, by using proteins called transporters. Some chemical compounds have the ability to sensitize the cells to the drugs by disabling these transporters. The focus of this work is to find key characteristics of compounds that may disable a specific transporter, the P-glycoprotein. Three datasets listing compounds, their values for different features, and their ability to disable the transporters are provided by experts. Using the programming language R, various data analytics methods are applied to these datasets with the objective of predicting whether compounds are P-glycoprotein inhibitors or not. The main issue encountered is the fact that the most important dataset did not contain enough samples for the number of predictor variables. Ultimately, the decision tree and random forest models prove to be the most effective in predicting the compounds’ ability to disable the transporter.”

Congratulations to all the new masters!  May the force be with you.


Probabilistic Prediction of Post-disaster Functionality

Probabilistic Prediction of Post-disaster Functionality Loss of Community Building Portfolios Considering Utility Disruptions

Journal of Structural Engineering

I am proud to announce that the latest collaborative work from the CORE lab has been accepted for publication in the ASCE’s Journal of Structural Engineering.  The new paper title is a mouthful, “Probabilistic Prediction of Post-disaster Functionality Loss of Community Building Portfolios Considering Utility Disruptions”, but the researchers (Weili Zhang, Peihui Lin, Naiyu Wang, Charles Nicholson, and Xianwu Xue) have been just calling the effort the “PPPD” project.

The study proposes a framework for the probabilistic prediction of building portfolio functionality loss in a community following an earthquake hazard. Building functionality is jointly affected by both the structural integrity of the building itself and the availability of critical utilities.

Post-disaster functionality loss relates to direct damage and critical utilities

To this end, the framework incorporates three analyses for a given earthquake scenario:

  1. evaluation of the spatial distribution of physical damages to both buildings and utility infrastructure
  2. computation of utility disruptions deriving from the cascading failures occurring in the interdependent utility networks; the cascading failures are simulated using a new mixed-integer, multicommodity network flow optimization model
  3. by integrating (1) and (2), a probabilistic prediction of the post-event functionality loss of building portfolios at the community scale.
Framework for Post-disaster Functionality Loss Prediction

Overview of the PPPD Framework

The framework couples functionality analyses of physical systems of distinct topologies and hazard response characteristics in a consistent spatial scale, providing a rich array of information for community hazard mitigation and resilience planning.

Case Study

An implementation of the framework is illustrated using the residential building portfolio in Shelby County, TN, subjected to an earthquake hazard.  A single realization of an earthquake scenario in Shelby County, TN is depicted below.

Single realization of post-disaster functionality

Single disruptive event simulation realization

Since the building damage, the flow model, and the data collection/aggregation can all be completed efficiently, it is easy to extend the single simulation realization to many realizations.  This allows for a spatial probabilistic analysis of the vulnerabilities in the affected area. The figure below depicts the expected impact to the region based on 1,000 simulations of the scenario earthquake.

Multiple realizations of post-disaster functionality

Expected impact based on multiple earthquake simulation realizations

The intricacies that relate how the electric power network (EPN) supports the potable water network (PWN), along with the particular individual component vulnerabilities of the EPN and PWN, produce probabilistic failure patterns in building functionality (see sub-figure d. above) that are not obvious!


Additionally, the framework allows us to compare a more traditional building portfolio analysis with the practical implications of disruptive events.  That is, even if your place of employment is not damaged, if the building does not have power or water, then it will be closed for business anyway!

The green line in the figure to the right denotes the probability of exceedance for the ratio of buildings which cannot be occupied (RUO) due to physical damage.  The dotted line relates to the ratio of functional loss of buildings (RFL) which is due to any combination of direct damage and utility loss.   Clearly, the RUO is a conservative estimate compared to RFL.  For example, there is only a 40% chance that 40+% of the buildings will be directly damaged to the extent of restricted occupancy. However, that number jumps to 80% when the utilities are considered!
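The two exceedance curves can be illustrated with a small Monte Carlo sketch. This is not the PPPD model: it assumes, purely for illustration, that each building is damaged independently with a fixed probability and loses utility service independently with another fixed probability, whereas the paper derives utility disruptions from a network flow model of cascading failures. All function names, parameters, and probabilities below are hypothetical.

```python
import random


def simulate_ratios(n_buildings=200, n_realizations=1000,
                    p_damage=0.35, p_utility_loss=0.30, seed=0):
    """Return per-realization RUO and RFL ratios under toy independence assumptions."""
    rng = random.Random(seed)
    ruo, rfl = [], []
    for _ in range(n_realizations):
        unoccupiable = 0   # buildings with restricting physical damage
        nonfunctional = 0  # buildings damaged and/or without utilities
        for _ in range(n_buildings):
            damaged = rng.random() < p_damage
            no_utility = rng.random() < p_utility_loss
            unoccupiable += damaged
            nonfunctional += damaged or no_utility
        ruo.append(unoccupiable / n_buildings)
        rfl.append(nonfunctional / n_buildings)
    return ruo, rfl


def exceedance(ratios, threshold):
    """Empirical probability that the ratio is at least `threshold`."""
    return sum(r >= threshold for r in ratios) / len(ratios)


ruo, rfl = simulate_ratios()
for threshold in (0.3, 0.4, 0.5):
    print(threshold, exceedance(ruo, threshold), exceedance(rfl, threshold))
```

Because every physically damaged building also counts as a functional loss, the RFL curve can never fall below the RUO curve in this sketch, which mirrors the qualitative relationship described above.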


This work represents a wonderful collaborative effort within the CORE lab.  Weili Zhang developed the interdependency model and worked closely with Peihui Lin, who provided the building analyses.  And both worked closely with Xianwu Xue, the GIS expert.  And of course, I am always pleased to work with my colleague Naiyu Wang in Civil Engineering.   We have much, much collaborative work already in-progress and planned for the future!

Southwest Airlines Operations Tour

NOC at Southwest Airlines

Southwest Airlines Network Operations Control


Southwest Airlines Visit

(back row, left-to-right) Kyle Beatty, Warren Qualley, Kelvin Droegemeier, Hank Jenkins-Smith, Ed Cokely (front row, left-to-right) Carol Silva, Amy McGovern, Radhika Santhanam, Le Gruenwald, Sridhar Radhakrishnan, Charles Nicholson

I was happy to represent the Analytics Lab recently as a part of a larger team from OU who were invited down to Dallas, TX near Love Field to meet with Southwest Airlines (SWA) to learn more about the airline business and operations.   The attendees from OU included the Vice President of Research; directors from the School of Computer Science in the Gallogly College of Engineering and Management Information Systems in the Price College of Business; senior researchers and specialists from political science, psychology, computer science, and of course, data science.

We were privileged to take a tour of the famous Southwest Airlines Network Operations Control, a.k.a., the NOC.  This facility and the employees who work here are at the very core of the SWA network operations.   From dispatchers to air traffic control specialists to flight operations to maintenance to crew schedulers to weather analysts — this is where the major operational decisions are made.  

The unique look of the NOC, bathed in blue as it is, was designed scientifically to help with mood and to reduce eye strain.  And, well, it simply looks cool.

While we were at the NOC, it so happens that Southwest Airlines was actively engaged in planning for the expected impacts from the impending Hurricane Harvey.  Obviously, weather, and especially major weather events like hurricanes, play a huge role in flight delays and cancellations for all airlines. Such disruptive events can have impacts across an entire transportation network. Analyzing and optimizing under this larger “system-wide” view is what ISEs are famous for. These are hard problems, but they are worth solving!

Planning for Harvey at Southwest Airlines NOC

Southwest NOC in action planning for Hurricane Harvey


12th International Conference on Structural Safety & Reliability

Several members of the combined CORE lab at OU attended the 12th International Conference on Structural Safety & Reliability (ICOSSAR 2017) at the Technische Universität Wien in Vienna, Austria during the summer.

The CORE lab had a combined 5 presentations during the conference!


On Tuesday, August 8, Naiyu Wang and Xianwu Xue both gave presentations relating to resilience and climate change:

  • 2:30-2:50p Dresback, K., Xue X., Xu J., Wang, N., Kolar, R., Geoghehan K.  “STORM-CoRe: A coupled model system for hurricanes, storm surge and coastal flooding to support community resilience planning under climate change”
  • 5:20-5:40p Xue X., Wang, N., Ellingwood, B., Zhang, K.  “The impact of climate change on riverine flooding at the community scale”.

On Wednesday, August 9, Yingjun Wang and Charles Nicholson gave presentations in the general resilience section; and Peihui Lin gave a presentation in the section relating to urban resilience:

  • 10:50-11:10 Wang, Y., Wang, N. “Retrofitting building portfolios to achieve community resilience goals under tornado hazard”
  • 11:10-11:30 Zhang, W., Wang, N., Nicholson, C., Hadikhan Tehrani, M. “Stage-wise resilience planning for transportation networks”
  • 4:40-5:00 Lin, P., Wang, N. “A simulation-based model for post-disaster functionality recovery of community building portfolios”

OU was well represented in Vienna: in addition to the five faculty/students from the CORE lab listed above who traveled to ICOSSAR, there were separate presentations from Kash Barker and Hiba Baroud (PhD from OU, now faculty at Vanderbilt).

Several colleagues from the NIST-funded Center of Excellence on Community Resilience also participated in the conference including Bruce Ellingwood (CSU), John van de Lindt (CSU), Paolo Gardoni (UIUC), and Jamie Padgett (Rice), among others.

For fun

Outside of the conference itself, Vienna was a beautiful and interesting place: museums, history, and incredible architecture.  I was very happy to enjoy the trip to Vienna with my father.  We enjoyed Stephansplatz, a main square at the center of Vienna, named after Vienna’s amazing cathedral (two pictures below).  I also had a chance to visit the Schönbrunn Palace and gardens.  Finally, I also made a side-trip to work out with Cross Fit Vienna, “The Dungeon”!



Sr. Data Analyst Position – Open in Plano, TX

JCPenney is hiring Sr. Data Analyst

A friend of mine who works at the JC Penney HQ in Plano, TX just sent me a new job posting; she would love to hire an OU DSA student!  See below for job description and let me know if you are interested!   Please note that JCPenney is not offering visa sponsorship for this position.

Job posting

JCPenney is one of the nation’s largest apparel and home furnishing retailers with more than 1,000 stores. We are a diverse community of people, all working together to bring sensational style, sensible prices and the best service possible to our customers. We’re looking for talented individuals who want to work in an energetic, respectful, collaborative environment. With a wide array of jobs, internships, training and more, there are countless opportunities for you to grow your career with us.

JCPenney is looking for an experienced data analyst who is eager to learn, to add value, and to do interesting work as a valued member of the Customer Strategy team. This position is data-intensive and will involve use of SQL and SAS software tools to pull data for analysis and reporting purposes. Insights produced by this team inform business decisions in Marketing and beyond, including those by senior executive leaders.

Primary Responsibilities:

  • Facilitate the definition of analysis needs and work product requirements of internal clients
  • Translate client needs and requirements into specific data, logic and reporting requirements and realistic work plans
  • Understand and have a working knowledge of customer/transactional level data
  • Strive to structure analysis to provide conclusive insights that directly align to decision-making
  • Prioritize and balance multiple activities in parallel and communicate status proactively to manage stakeholder expectations
  • Understand data sources to determine the correct source(s) and logic to ensure accurate, efficient and timely deliverables
  • Build, run and automate data queries, analysis and reports
  • Speak out when business strategies do not align with data insights and when insights suggest new marketing tactics
  • Identify and log data issues and work with department, IT and vendor teammates to understand and resolve them
  • Proactively seek help from and offer help to JCP teammates to accelerate skill development, business understanding and overall goal achievement
  • Anticipate future insight needs/opportunities and deliver self-initiated value to JCP

Core Competencies & Accomplishments:

  • College graduate with 3+ years of experience
  • At least 2 years’ experience using databases and SQL (structured query language) and SAS
  • Ability to combine, cleanse and harmonize data for descriptive and predictive analytics
  • Strong math, computer and problem-solving skills, including MS Excel
  • Structured thinker and high attention to detail
  • Strong teamwork, communication and interpersonal skills
  • Desire to consistently meet and exceed stakeholder expectations
  • Desire to acquire new technical skills (e.g., R, Hive, Tableau, Datameer) and business knowledge

Welcome to Fall 2017!

Farewell to Summer

I hope everyone had a great summer and is enjoying the start of Fall 2017 as classes begin anew.  I’ve been here most of the summer, and wow! It is great to have the students back. The peace and quiet are nice for a while, but the campus really comes alive in Fall.

My summer included a trip to Disney with the family, a solo climb of two 14,000+ foot mountains in Colorado (Blanca Peak and Ellingwood Point), and a trip to Austria for the ICOSSAR 2017 conference.

Welcome to Fall 2017 courses

My classes begin on Tuesday, 8/22 — both DSA/ISE 5103 and ISE 4113.

The DSA/ISE 5103 Intelligent Data Analytics graduate course is one which I think is core to data science.  In it we will study and practice how to deal with real-world data intensive problems.  The topics include lots of data work and some great modeling techniques/applications such as dimension reduction, facial recognition, linear and logistic regression, LASSO, elastic net, support vector machines, MARS, decision trees, random forests, boosted trees, neural networks, and clustering.  You will use a powerful open source statistical programming language (R) and work on hands-on, applied data analysis projects.  No previous R experience is required.  That said, I will expect you to work hard to learn the tool!  This course is being offered both online and on-campus.

In the ISE 4113 undergraduate course, we will be delving into the nitty-gritty of MS Excel to build spreadsheet-based decision support systems.  Excel is essentially ubiquitous in industry and mastery of it is critical!  We will go way beyond simple formulas and the basic usage of the tool and delve into optimization modeling, simulation, and even Visual Basic for Applications (VBA) programming.  This class is a big class, but fortunately, we have an excellent TA supporting the class.

DSA club?

One piece of great news is that it looks like there is some interest in starting a Data Science and Analytics club.  I will have more news about this later this semester, but if you are interested in joining such a club, please feel free to email me!  More info to follow!

I  look forward to meeting and getting to know you all this semester!

Charles Nicholson


Congratulations to three new Masters!

Congratulations Alexandra, Emily, and Megan — new Masters of Science!


Alexandra Amidon (left), Emily Grimes (center), and Megan Snelling (right) have all successfully defended their Master’s theses this Spring 2017 and are the three newest Masters from the Analytics Lab @ OU.

Alexandra and Emily completed their Masters of Science in Data Science and Analytics from the Gallogly College of Engineering.  The DSA program is a joint effort between the School of Industrial & Systems Engineering and the School of Computer Science. Megan completed her Masters of Science from the School of Industrial & Systems Engineering.

I’ll start with Megan since she is the lone ISE in this group of three.

Megan’s work is entitled “MODEL FOR MITIGATING ECONOMIC AND SOCIAL DISASTER DAMAGE THROUGH STRUCTURAL REINFORCEMENT” and is a continuation of previous work completed as a part of the NIST-funded Center of Excellence on Risk-Based Community Resilience Planning  and CORE Lab @ OU.

Abstract: Natural disasters have both severe negative short-term consequences on community structures and inhabitants and long-term impacts on economic growth. In response to the rising costs and magnitude of such disasters to communities, a characteristic of modern community development is the aspiration towards resilience. An effective and well-studied mitigation measure, structural interventions reduce the value lost in buildings in earthquake scenarios. Both structural loss and socioeconomic characteristics are indicators for whether a household will dislocate from their residence. Therefore, this social vulnerability can be mitigated by structural interventions and should be minimized as it is also an indicator of indirect economic loss. This research presents a model for mitigating direct economic loss and population dislocation through decisions regarding the selection of community structures to retrofit to higher code levels. In particular, the model allows for detailed analysis of the tradeoffs between budget, direct economic loss, population dislocation, and the disparity of dislocation across socioeconomic classes given a heterogeneous residential and commercial structure set. The mathematical model is informed by extensive earthquake simulation as well as recent dislocation modeling from the field of social science. The non-dominated sorting genetic algorithm II (NSGA-II) is adapted to solve the model, as the dislocation model component is non-linear. Use of the mitigation model is demonstrated through a case study using Centerville, a test bed community designed by a multidisciplinary team of experts.  Details of the retrofit strategies are interpreted from the estimated Pareto front.
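For readers unfamiliar with the term, a Pareto front is the set of solutions that no other solution beats on every objective at once. The sketch below shows only that non-dominated filtering idea for objectives that are all minimized; it is not NSGA-II and not Megan’s model. The function names and the (budget, economic loss, dislocation) numbers are hypothetical.

```python
def dominates(a, b):
    """True if plan `a` is no worse than `b` on every objective and strictly better on one.

    All objectives are assumed to be minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(plans):
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Hypothetical retrofit plans as (budget, direct economic loss, population dislocation).
plans = [
    (10, 50, 30),
    (20, 40, 25),
    (20, 45, 35),  # dominated by (20, 40, 25)
    (30, 40, 25),  # dominated by (20, 40, 25)
    (15, 60, 40),  # dominated by (10, 50, 30)
]

print(pareto_front(plans))  # → [(10, 50, 30), (20, 40, 25)]
```

The two surviving plans represent a genuine tradeoff: one spends less, the other reduces loss and dislocation further, and neither is better on all three objectives.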

We should also offer congratulations to Megan on another account: she is getting married soon and plans to spend her summer hiking through Europe!

Alexandra and Emily both worked on projects related to T.U.G. (The Untitled Game), which was partially funded by Nerd Kingdom.


Abstract: Predictive algorithms applied to streaming data sources are often trained sequentially by updating the model weights after each new data point arrives. When disruptions or changes in the data generating process occur (“concept drifts”), the online learning process allows the algorithm to slowly learn the changes; however, there may be a period of time after concept drift during which the predictive algorithm underperforms. This thesis introduces a method that makes online neural network classifiers more resilient to these concept drifts by utilizing data about concept drift to update neural network parameters.

Alexandra has accepted a position with MSCI, a leading provider of investment decision support tools worldwide, as a Reference Data Production Analyst.  She will be using her skills in machine learning to continue developing new tools for anomaly detection.


Abstract: Player engagement is a concept that is both vital to the online gaming industry and difficult to define. Typically, engagement is defined using social science methodologies such as observing, surveying, and interviewing players. With the vast amount of data being collected from video games as well as user bases increasing in size, it is worthwhile to investigate whether or not user engagement can be defined and interpolated from data alone. This study develops a methodology for defining engagement using analytic methods in order to approach the question of whether gathering (as a proxy for social interaction) in sandbox games has an effect on player engagement.

Emily is following up on leads for a full-time position now, but in the meantime she has a road trip planned to the Grand Canyon, Sequoia National Park, and Big Sur in California.  She is also in discussions with KGOU and NPR about starting a new radio program!

Congratulations to all three excellent students!  We wish you great success!

Emily Grimes, MS DSA, May 2017

Megan Snelling, MS ISE, May 2017

OU Industrial & Systems Engineering and Data Science & Analytics

Public Webinar Announcement: Center for Risk-Based Community Resilience Planning

Public Webinar Announcement — Community Resilience: Modeling, Field Studies and Implementation

Learn more about the NIST-funded Center for Risk-Based Community Resilience Planning and how the Center is developing a computational environment to help define the attributes that make communities resilient.

WEBINAR: Thursday, April 27, 10:00 a.m. – 12:00 p.m. (CDT)  

The webinar is open to anyone and will be immediately followed by a Q&A “chat” period.

A Resilient Community is one that is prepared for and can adapt to changing conditions and can withstand and recover rapidly from disruptions to its physical and social infrastructure.  Modeling community resilience comprehensively requires a concerted effort by experts in engineering, social sciences, and information sciences to explain how physical, economic and social infrastructure systems within a real community interact and affect recovery efforts.

Join this informational WEBINAR to learn more about the Center’s recent activities.

A Center overview will be followed by a session on the Center’s recent Special Issue of Resilient and Sustainable Infrastructure, which features six papers on the virtual community Centerville.  The modeling and analysis theory behind each paper will be explained followed by a demonstration of IN-CORE, the Interdependent Connected Modeling Environment for Community Resilience.  Presentations on the first validation study, the Joplin Hindcast, and the Center’s First Field Study, the 2016 Lumberton floods in NC will also be a highlight of the Webinar.

No registration is required this time, just click, watch, and chat.

Both Dr. Nicholson and Dr. Wang will be giving presentations during the webinar.

Flier for distribution: Webinar Flier 27-April-2017

Postdoctoral Research Fellow Position in Community Resilience

Prof. Charles Nicholson is currently accepting applications for a postdoctoral research fellow position in Community Resilience within the School of Industrial and Systems Engineering at the University of Oklahoma.

The primary area of research is with respect to the following broad objective:

Enhance community resilience to natural and man-made disasters through modeling, optimization, and risk-informed decision making with respect to vital, large-scale, interdependent civil infrastructure and socio-economic systems.

Researchers with backgrounds and interests in one or more of the following areas are encouraged to apply:

  • Optimization: network flow optimization, multi-objective optimization, stochastic optimization; stochastic programming
  • Data science and analytics: including machine learning for predictive and classification modeling as well as unsupervised and semi-supervised learning
  • Decision modeling for community and regional resilience planning

The postdoctoral research fellow will embark on an exciting and innovative research program within a well-established and active multidisciplinary research group with collaboration opportunities across the United States.  In this role, you will also supervise one or more PhD students.  Experience with tools such as Python or R is highly preferred.  Familiarity with Civil Infrastructure systems and/or economic modeling is a plus.  The position will be supported by funded research projects with multi-year durations.

Interested applicants please send a one-page statement of research interests and CV to cnicholson @ OU (dot) edu.