My research aims to improve the quality of student learning. I analyze behavioral data captured by course delivery platforms to develop models of learning behavior, and leverage the resulting models in the design of next-generation learning technologies.

My work is interdisciplinary, spanning computer science, engineering, and pedagogy/education innovation. It also aims to bridge the theory-practice divide: I use real-world data and experiences from my own teaching to develop models and prototypes, and I work in industry to incorporate my algorithms into systems deployed in various learning scenarios.

My research since 2013 has focused on three areas in particular: Big Learning Data Analytics, Social Learning Networks, and Integrated and Individualized Courses. Each of these thrusts is described below.

Big Learning Data Analytics (BLDA)

Learning technology platforms today can be equipped with infrastructure that captures fine-grained behavioral data about students as they proceed through a course. This includes, for example, the sequence of clicks made while watching a video lecture, answering a quiz question, or posting in a discussion. The resulting “big” data presents unprecedented opportunities to study the process by which learning occurs.

Big Learning Data Analytics (BLDA) involves developing methodology to extract patterns from student behavioral data, and using the resulting insights to design algorithms for learning analytics and individualization. Several research questions lie at its core: What is the most effective way to model the learning process so that it yields sharp analytic insights? How can we identify the relationship between student engagement and performance? Are there hidden dimensions that dictate how students respond to different types of information?

I have investigated BLDA using millions of clickstream logs generated by the tens of thousands of students who have taken our MOOCs. In particular, I have developed methods for representing the behavior students exhibit while watching lecture videos, and have used the resulting insights to design algorithms that predict student performance on quiz questions in advance. These behavior-based algorithms outperform conventional collaborative filtering methods that rely solely on past quiz performance.
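To make the idea of "representing video-watching behavior" concrete, the sketch below summarizes a single student's clickstream on one lecture video into a small feature vector. The event names and record format are illustrative assumptions, not the actual log schema of any platform; the real feature representations in my work are richer than this.

```python
from collections import Counter

def video_behavior_features(events, video_length):
    """Summarize a student's clickstream on one lecture video into
    behavior features. Event names here are hypothetical."""
    counts = Counter(e["type"] for e in events)
    # Seconds of video actually covered, from (start, end) play intervals.
    covered = set()
    for e in events:
        if e["type"] == "play":
            covered.update(range(int(e["start"]), int(e["end"])))
    return {
        "pauses": counts["pause"],
        "seek_backs": counts["seek_back"],  # rewatching earlier material
        "fraction_completed": len(covered) / video_length,
    }

# A toy clickstream for a 600-second video.
events = [
    {"type": "play", "start": 0, "end": 300},
    {"type": "pause"},
    {"type": "seek_back"},
    {"type": "play", "start": 200, "end": 360},
]
feats = video_behavior_features(events, video_length=600)
```

Feature vectors like this, one per student-video pair, can then be fed alongside past quiz results into a standard classifier to predict correctness on upcoming questions.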

Social Learning Networks (SLN)

A Social Learning Network (SLN) emerges when people exchange information on educational topics through structured interactions; it constitutes an important form of learning in any scenario. As enrollment scales up, it becomes increasingly difficult for instructors to accommodate individual learning needs, which places even greater importance on peer-based learning.

Research on SLNs involves modeling the social networks that emerge in learning scenarios, and designing algorithms to recommend better social interactions. Driving questions include: How can we model the process by which students exchange information? What is the right way to quantify the efficiency of an SLN, in terms of the utility it brings to the network as a whole and to each individual within it? How can we use information about the current network structure to steer the SLN toward one that brings greater benefit to students?

I have studied the SLNs that emerge in discussion forums, the primary means of student-to-student and instructor-to-student interaction in MOOCs. I have considered topics ranging from the characteristics that affect decline rates across dozens of courses, to ways of quantifying and optimizing the interaction structure among students within specific courses. In particular, I have found that the efficiency of existing interactions can be quite low, suggesting that much can be gained from methodology for optimizing SLNs and recommending specific communications. I have also found that the benefit a student derives from consuming versus disseminating information differs qualitatively with the type of course.
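As a minimal illustration of this style of analysis, the sketch below builds a directed interaction graph from forum reply records and computes two simple quantities: per-student consume/disseminate counts, and link reciprocity as one crude structural proxy for efficiency. Both the reply records and the reciprocity proxy are illustrative assumptions, not the efficiency metric from my papers.

```python
from collections import defaultdict

# Hypothetical forum reply records: (responder, original poster).
replies = [
    ("alice", "bob"), ("bob", "alice"), ("carol", "bob"),
    ("alice", "carol"), ("dave", "alice"),
]

# Per-student counts of information disseminated (replies written)
# and consumed (replies received).
disseminated = defaultdict(int)
consumed = defaultdict(int)
edges = set()
for src, dst in replies:
    disseminated[src] += 1
    consumed[dst] += 1
    edges.add((src, dst))

# Crude efficiency proxy: the fraction of links that are
# reciprocated, i.e., information flows in both directions.
reciprocated = sum(1 for (s, d) in edges if (d, s) in edges)
reciprocity = reciprocated / len(edges)
```

Graph summaries along these lines are the starting point for the harder questions above: quantifying the utility an SLN delivers, and recommending links that would raise it.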

Integrated and Individualized Courses (IIC)

Adaptive Educational Systems (AES), which adjust a student’s learning path through a course based on a user model maintained by the system, have demonstrated the potential to improve learning outcomes in traditional classroom settings. The ability to integrate multiple modes of learning into courses, and to collect fine-grained behavioral data about students as they interact with these modes, brings the opportunity to develop more sophisticated user models to drive individualization in AES. The same models can simultaneously be used to generate more intelligent learning analytics.

An Integrated and Individualized Course (IIC) is a new type of course that integrates a variety of learning modes into the instructional process, and performs automated individualization based on behavior collected as students interact with these modes. Together with substantial effort from UI designers and software developers, I have created two systems to support the delivery of IICs: (i) a student-facing course delivery platform, and (ii) an instructor-facing analytics dashboard. Trials of IICs in different learning scenarios have yielded promising results, both in improved student learning outcomes through behavior-based individualization and in instructors' ability to act on fine-grained behavioral insights.
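The core individualization loop can be sketched as follows: a per-topic user model is nudged by each observed outcome, weighted by a behavior-derived engagement signal, and the next content is chosen from the model. This is a minimal sketch under assumed forms for the update rule and selection policy, not the model used in the deployed systems.

```python
def update_mastery(mastery, topic, correct, engagement, lr=0.3):
    """Move the mastery estimate for a topic toward the quiz outcome,
    weighted by a behavior-derived engagement signal in [0, 1].
    The update rule is an illustrative assumption."""
    target = 1.0 if correct else 0.0
    mastery[topic] += lr * engagement * (target - mastery[topic])
    return mastery

def next_topic(mastery):
    # Assumed individualization policy: revisit the weakest topic.
    return min(mastery, key=mastery.get)

# Two topics, both starting from an uninformative prior of 0.5.
m = {"loops": 0.5, "recursion": 0.5}
update_mastery(m, "loops", correct=True, engagement=1.0)
update_mastery(m, "recursion", correct=False, engagement=0.5)
```

Here `next_topic(m)` would route the student back to "recursion"; the same mastery estimates can be surfaced on the instructor-facing dashboard as analytics.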

Many interesting research questions about IICs remain, such as: How can we quantify the benefit of behavior-based individualization over what is possible through performance-based individualization alone? How can we reduce (or eliminate) the need for instructors to provide upfront input about the relationship between content and learning topics in order to enable individualization? Through ongoing deployments of IIC technology in different learning scenarios, I am continually investigating these and other important questions.