M.S. Thesis, University of Manitoba (Canada), September 2016

Delivering Scalable Frequent Pattern Mining for Non-Expert Data Miners

  Zhao Han

BundleVis. A frequent itemset visualization, part of the thesis. Course 472 is clicked.

Abstract

As a popular data mining task, frequent pattern mining has proven helpful for non-experts. For example, mining frequently purchased products helps store managers increase sales. As another example, finding popular courses helps university administrators arrange courses to avoid schedule conflicts. However, many data mining researchers have focused on improving algorithmic efficiency and have paid less attention to providing a system designed specifically for non-experts.

In my M.Sc. thesis, I propose such a system, called PatternShow, which consists of (i) a user-friendly frontend web interface, along with a visualization tool called BundleVis that effectively shows frequent patterns to non-expert miners, and (ii) a cloud-enabled backend that offers scalable frequent pattern mining. Results of my user study show the effectiveness of PatternShow in delivering scalable frequent pattern mining for non-expert data miners.

PatternShow

PatternShow was published at BigDAS 2015.

BundleVis

BundleVis is a visualization tool that effectively shows frequent patterns to non-expert miners.

Initial visualization for the course dataset
The frequent itemset visualization after course 472 is clicked.
The frequent itemset visualization after courses 472 and 435 are both clicked.

Slides

Zhao Han MSc thesis defense presentation: first slide.