UAH
Computer Science
Research Project

Finding Frequent Webpage Access Patterns

Upcoming Events: TESTING PHASE....

 

Ashley Plier and Jennifer Cuzzort
The University of Alabama in Huntsville

    In this project, we investigate frequent webpage access patterns. Every webpage has a unique URL, so the set of all URLs within a pre-specified domain can be viewed as a finite alphabet Σ. When users access web pages within this domain, there are two scenarios. In the first scenario, users never use the Back button of their browsers, so each access session corresponds to a string over Σ, and different sessions correspond to different strings. The history of all user webpage accesses can then be treated as a sequential database of such strings. In the second scenario, users do use the Back button, so each access session corresponds to a tree: going back to an earlier page and following a different link creates a new branch. The history of all webpage accesses can then be viewed as a tree database.
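The two scenarios can be illustrated with a minimal Python sketch. It assumes a click-stream format that the project description does not specify: a session is a list of URLs, and a Back-button press is recorded with a hypothetical `BACK` marker. The `Node` class and the function names are likewise illustrative only.

```python
from dataclasses import dataclass, field

BACK = "<BACK>"  # hypothetical marker for a Back-button press (an assumption)


@dataclass
class Node:
    url: str
    children: list = field(default_factory=list)


def session_to_tree(events):
    """Build an access tree from a click stream that may contain BACK markers."""
    if not events or events[0] == BACK:
        raise ValueError("a session must start with a page visit")
    root = Node(events[0])
    path = [root]  # stack of nodes from the root to the current page
    for event in events[1:]:
        if event == BACK:
            if len(path) > 1:  # ignore Back presses on the entry page
                path.pop()
        else:
            child = Node(event)
            path[-1].children.append(child)
            path.append(child)
    return root


def to_nested(node):
    """Convert a Node into a (url, [children]) tuple for easy inspection."""
    return (node.url, [to_nested(child) for child in node.children])


# Scenario 1: no Back button, so the session is simply a string over Σ.
linear_session = ["/home", "/courses", "/courses/cs101"]

# Scenario 2: a Back press makes the session branch into a tree.
branching_session = ["/home", "/courses", BACK, "/faculty"]
tree = session_to_tree(branching_session)
```

Here `to_nested(tree)` yields a root `/home` with two children, `/courses` and `/faculty`, while a session without `BACK` markers degenerates into a single chain, which is the string case.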

    Given a database of such access histories, we want to determine the webpage access patterns: that is, which web pages are accessed most frequently and how they are accessed together. These patterns can be used to analyze the utility and interrelatedness of the web pages. In terms of the sequential or tree database described above, the task is to find the frequent substring patterns in the sequential database or the frequent sub-tree patterns in the tree database. Our project goal is to implement algorithms that find all such frequent access patterns in the given databases.
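To make "frequent substring" concrete, the following Python sketch counts, for every contiguous run of URLs, the number of sessions that contain it, and keeps the runs that meet a support threshold. This is a brute-force illustration, not the algorithm implemented in the project; the function name, the `min_support` and `max_len` parameters, and the definition of support as "number of sessions containing the substring" are assumptions.

```python
from collections import Counter


def frequent_substrings(database, min_support, max_len=None):
    """Return {substring (tuple of URLs): support} for every substring that
    occurs contiguously in at least `min_support` sequences of `database`."""
    support = Counter()
    for seq in database:
        seen = set()  # count each substring at most once per sequence
        n = len(seq)
        for i in range(n):
            limit = n if max_len is None else min(n, i + max_len)
            for j in range(i + 1, limit + 1):
                seen.add(tuple(seq[i:j]))
        support.update(seen)
    return {sub: count for sub, count in support.items() if count >= min_support}


# Three toy sessions over the alphabet {A, B, C, D}.
sessions = [["A", "B", "C"], ["A", "B", "D"], ["B", "C"]]
patterns = frequent_substrings(sessions, min_support=2)
```

With these toy sessions, `patterns` contains the single pages `A`, `B`, and `C` together with the pairs `A B` and `B C`; `D` and `A B C` each appear in only one session and are dropped. Enumerating every substring is quadratic in the session length, so a practical miner would prune candidates level by level.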

    We divide the project into two phases. In the first phase, we will implement an algorithm that mines frequent substrings from a sequential database. In the second phase, we will first use an encoding scheme to encode trees as strings, then apply our first-phase algorithm to find frequent substring candidates. Finally, we will implement an algorithm that checks whether these candidate strings indeed correspond to sub-trees in the given tree database.
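The project description does not say which encoding scheme is used. As one possible illustration, the sketch below assumes a depth-first preorder encoding that emits a page label on the way down and a hypothetical backtrack symbol `^` on the way back up, and it checks a candidate by verifying that the token sequence never backtracks above its own first node. Both the encoding and the check are assumptions, not the authors' method.

```python
UP = "^"  # hypothetical backtrack symbol; assumed not to be a URL in Σ


def encode(tree):
    """Preorder encoding of a (label, children) tree: emit the label when
    entering a node and UP when returning from each child."""
    label, children = tree
    tokens = [label]
    for child in children:
        tokens.extend(encode(child))
        tokens.append(UP)
    return tokens


def is_well_formed(candidate):
    """True if the tokens describe a single rooted tree: the sequence starts
    with a label and never backtracks above its own root."""
    if not candidate or candidate[0] == UP:
        return False
    depth = 0
    for token in candidate:
        depth += -1 if token == UP else 1
        if depth < 1:  # returned above the first node of the candidate
            return False
    return True


# Page A links to B and C; page D was reached from C.
example = ("A", [("B", []), ("C", [("D", [])])])
encoded = encode(example)
```

Under this encoding, a candidate such as `["B", "^", "C"]` is a contiguous substring of `encoded` but not a tree, because it joins two siblings without their parent, whereas `["A", "B", "^", "C"]` and `["C", "D"]` both pass the check. This is why substring mining alone yields only candidates that need a second verification step.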

 

Copyright © 2006 CSRP
All Rights Reserved