Improving Structure and Links for Analysis of User Navigation Sessions
Bhagyashri Biranje, Priti Harpale, Dushyant Singh, Meenal Jadhav, Sindhu M.R.
In this paper, we design a well-structured website that facilitates effective user navigation by relinking webpages based on user navigation data. Completely reorganizing the structure of a website can produce highly unpredictable results, so we illustrate how to improve a website without introducing substantial changes. Specifically, we use a data mining algorithm to improve user navigation on a website while minimizing alterations to its current structure. We will also test the model on large synthetic data sets to demonstrate that it scales well.
Key words: Website design, user navigation, web mining, data mining algorithm.
Corresponding Authors: Meenal Jadhav, Sindhu M.R.
The growing use of the internet gives people access to more detailed knowledge and information, and the number of internet users is increasing day by day. For a user, however, finding the desired information is not always easy, and designing effective websites is correspondingly difficult. Even on a high-profile website, users who are unable to find the desired information will abandon the site: its ratings will drop and it will attract fewer visitors.
There are many examples of user navigation. Restaurant services such as making reservations, processing orders, and delivering meals generally require waiters to input customer information and then transmit orders to the kitchen for preparation. When the customer pays the bill, the amount due is calculated by the cashier. Although this procedure is simple, it may significantly increase the waiters’ workload and even cause errors in ordering or in prioritizing customers, especially when the number of customers suddenly increases during busy hours, which can seriously degrade overall service quality.
Users often have difficulty searching for and locating target pages because of poor website design. To design a website, developers should understand how to construct a structure that differs from the previous website structure, which is useful in cases where users were unable to search for or locate the desired information. Avoiding such problems while creating a website is not easy, because web developers may not properly understand users’ preferences and tend to organize pages according to their own judgment. Therefore, to fulfill users’ needs, webpages should be organized in a way that matches users’ preferences.
The success of a website’s organization is determined largely by how well the site’s information architecture matches users’ expectations. A logical, consistently named site organization allows users to make successful predictions about where to find things. Various methods of organizing and displaying information permit users to extend their knowledge from familiar pages to unfamiliar ones. If a developer misleads users with a structure that is neither logical nor predictable, or constantly uses different or ambiguous terms to describe site features, users will be frustrated by the difficulties of getting around and understanding what the site has to offer. Developers do not want the user’s mental model of the website to look like Fig. 1.
Fig.1 Confusing links are made by a developer.
Don’t make such a confusing web of links. Designers aren’t the only ones who make models of sites. Users try to imagine the site structure as well, and successful information architecture will help the user build a firm and predictable mental model of your site.
If an existing site has more than a few dozen pages, users will expect web search options to find content in the site. In a larger site, with maybe hundreds or thousands of pages of content, web search is the only efficient means to locate particular content pages or to find all pages that mention a keyword or search phrase.
For example, as with popular books at the library or the hit songs on iTunes, content usage on large web sites is a classic “long-tail” phenomenon: a few items get 80 percent of the attention, and the rest get dramatically less traffic. As the user’s needs get more specific than a browsing interface can handle, search engines are the means to find content out there in the long tail where it might otherwise remain undiscovered (Fig. 2).
Fig.2 The “long tail” of web search.
Large sites are just too large to depend solely on browsing. Heavily used pages are likely to appear on browsing menu pages.
Website Structure: In this project, the website structure consists of three components: layout templates, URL patterns, and linkage structure.
Most web pages consist of HTML elements such as tables, menus, buttons, images, and input boxes. The layout of a web page describes which HTML elements are included in the page, as well as how these elements are visually distributed when the page is rendered. Essentially, a page layout is represented by a so-called DOM (Document Object Model) tree. In this project, a layout template is considered to be a group of pages that have very similar layouts (DOM trees).
In a website, pages are generated from distinguishable templates according to their functions. That is to say, visually similar pages usually have the same function. In this way, a user can easily identify a page’s function at a glance.
(a) (b) (c)
Fig. 3 Typical layout templates from the ASP.NET Forums.
Fig. 3 shows several typical layout templates identified from the ASP.NET Forums. They are designed to show a) a list of discussion threads, b) a list of thread posts, and c) a user profile, respectively.
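The idea of grouping pages whose DOM trees are very similar can be sketched as follows. This is only a minimal illustration, not the method used in the project: it assumes that a page's layout can be approximated by the set of root-to-element tag paths, and that similarity is measured with the Jaccard index. The names `TagPathCollector`, `tag_paths`, and `layout_similarity` are hypothetical.

```python
from html.parser import HTMLParser


class TagPathCollector(HTMLParser):
    """Collects root-to-element tag paths as a rough stand-in for a DOM tree."""

    # Elements that never have a closing tag, so they are not pushed on the stack.
    VOID = {"br", "img", "input", "hr", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        self.paths.add("/".join(self.stack + [tag]))
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass


def tag_paths(html):
    collector = TagPathCollector()
    collector.feed(html)
    return collector.paths


def layout_similarity(html_a, html_b):
    """Jaccard similarity of two pages' tag-path sets (1.0 means identical paths)."""
    a, b = tag_paths(html_a), tag_paths(html_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two thread-list pages with different content but the same structure,
# and a profile page with a different structure (all markup is made up).
thread_list_1 = "<html><body><table><tr><td>Thread A</td></tr></table></body></html>"
thread_list_2 = ("<html><body><table><tr><td>Thread B</td></tr>"
                 "<tr><td>Thread C</td></tr></table></body></html>")
profile = "<html><body><div><img src='x.png'><p>Bio</p></div></body></html>"
```

Pages whose similarity exceeds some chosen threshold could then be assigned to the same layout template; the threshold itself would have to be tuned per site.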
A URL pattern is a generalization of a group of URLs sharing a similar syntactic format. In general, a URL pattern can be represented with a regular expression. The following are some example URL patterns discovered, again, from the ASP.NET Forums.
- List-of-thread pages
- List-of-post pages
- User profile pages
It should be noted that one layout template can have more than one related URL pattern. For example, a bookseller website usually designs one template to show a list of books and provides different query parameters to generate such a list. Various query parameters in this scenario will lead to different URL patterns, but the search results are shown with the same template. Another common case is duplicate pages, i.e., pages with the same content (and very likely the same layout) but different URLs.
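The relationship between URL patterns and layout templates can be illustrated with a small sketch based on the bookseller example. The URL formats and template names below are hypothetical and chosen only for illustration; they are not patterns mined from any real site.

```python
import re

# Hypothetical URL patterns for an imaginary bookseller site.
# Two different patterns map to the same "book-list" layout template.
URL_PATTERNS = [
    (re.compile(r"^/books\?author=[^&]+$"), "book-list"),
    (re.compile(r"^/books\?category=[^&]+$"), "book-list"),
    (re.compile(r"^/book/\d+$"), "book-detail"),
]


def template_for(url):
    """Return the layout template of the first pattern matching the URL, or None."""
    for pattern, template in URL_PATTERNS:
        if pattern.match(url):
            return template
    return None
```

Here `/books?author=knuth` and `/books?category=databases` follow different URL patterns but resolve to the same template, while an unrecognized URL such as `/about` yields `None`.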
Based on the layout templates and URL patterns, we can construct a directed graph to represent the website organization structure. That is, each layout template is considered a node in the graph, and two nodes are linked if there are hyperlinks between the pages belonging to the two nodes. The link direction is the same as that of the related hyperlinks, and each link is characterized by the URL pattern of the corresponding hyperlink URLs. Again, it should be noted that there can be multiple links from one node to another if the corresponding hyperlinks have more than one URL pattern.
Fig. 4 gives an illustrative example of the sub-graph constructed based on the layout templates and URL patterns above.
Fig.4 An illustrative sub link-graph for the ASP.NET Forums.
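A directed link graph of this kind can be represented with nested dictionaries. The sketch below is one possible representation under stated assumptions, not the project's implementation: the input is assumed to be a list of `(source_template, target_template, url_pattern)` triples, and the function name `build_link_graph` and the pattern labels are hypothetical.

```python
from collections import defaultdict


def build_link_graph(hyperlinks):
    """Build {source template: {target template: set of URL patterns}}.

    Each distinct URL pattern between two templates counts as a separate
    link, so one pair of nodes may be joined by several links.
    """
    graph = defaultdict(lambda: defaultdict(set))
    for source, target, url_pattern in hyperlinks:
        graph[source][target].add(url_pattern)
    # Convert to plain dicts so that lookups do not create empty entries.
    return {source: dict(targets) for source, targets in graph.items()}


# Hypothetical hyperlinks between the three templates of Fig. 3.
links = [
    ("thread-list", "post-list", "pattern-1"),
    ("thread-list", "post-list", "pattern-2"),
    ("post-list", "user-profile", "pattern-3"),
]
graph = build_link_graph(links)
```

In this example there are two links from the thread-list node to the post-list node because the corresponding hyperlinks follow two different URL patterns, and no link in the reverse direction.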
In our proposed system, we have two main modules: Client and Server.
1] Module 1: Client
Client has two functions:
- Browse Website
- Submit User Experience
The client (user) browses the website. Information about the user, such as browsing history, time of each visit, and links followed, is known as the user’s experience. It is then submitted to the server.
2] Module 2: Server
We use Apache Tomcat as the server. The client and server are connected over the network using a servlet. The server stores the activity logs of all users, organized by session, in the database. By applying a data mining algorithm to this database, the website can be restructured to provide better, easier, and faster interfaces.
Fig. 7 Architecture
In this architecture, there are two modules, client and server. The client browses the website and submits its experience to the server, where the data is stored in the database. A data mining algorithm is then applied to obtain an improved website structure. Because the model improves a website rather than reorganizing it, it is suitable for website maintenance on a progressive basis. This model is very effective on real-world websites. In most cases, it optimally solves large-sized problems in a few seconds on a desktop computer.
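As a rough illustration of how session-based activity logs might be turned into navigation paths before mining, consider the sketch below. It assumes each log record is a `(session_id, timestamp, url)` triple; this record format and the function name `navigation_paths` are assumptions made for the example, not details of the actual database schema.

```python
from collections import defaultdict


def navigation_paths(log_records):
    """Group (session_id, timestamp, url) records into per-session page sequences."""
    visits_by_session = defaultdict(list)
    for session_id, timestamp, url in log_records:
        visits_by_session[session_id].append((timestamp, url))
    # Order each session's visits by time and keep only the URLs.
    return {
        session_id: [url for _, url in sorted(visits)]
        for session_id, visits in visits_by_session.items()
    }


# Made-up records arriving out of order from two sessions.
log = [
    ("s1", 3, "/c"),
    ("s2", 1, "/x"),
    ("s1", 1, "/a"),
    ("s1", 2, "/b"),
]
paths = navigation_paths(log)
```

Each resulting sequence is the ordered list of pages a user visited within one session, which is the raw material for analyzing navigation.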
Data Mining Algorithm:
The data mining algorithm we use is K-means. K-means (MacQueen, 1967) is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori [4, 5].
The algorithm is composed of the following steps:
1. Place K points into the space represented by the objects that are being clustered. These points represent initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions of the K centroids.
4. Repeat Steps 2 and 3 until the centroids no longer move.
This produces a separation of the objects into groups from which the metric to be minimized can be calculated.
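The four steps above can be sketched in plain Python as follows. This is a minimal illustration under a few assumptions that the text does not specify: objects are numeric tuples, distance is squared Euclidean distance, the initial centroids are sampled from the data points, and the loop stops when assignments stop changing (which implies the centroids no longer move). The function names and parameters are hypothetical.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iterations=100, seed=0):
    """Cluster a list of numeric tuples into k groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Step 1: choose k data points as the initial centroids (an assumption).
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Step 2: assign each object to the group with the closest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 4: stop once assignments are stable, so centroids cannot move.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 3: recalculate each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a group is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Two well-separated groups of made-up 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = k_means(points, k=2)
```

For this toy data the first three points end up in one cluster and the last three in the other. How navigation sessions would be encoded as numeric vectors for clustering is a separate design decision that this sketch does not address.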
This model is useful for static websites but is not suitable for websites that consist purely of dynamic pages or have volatile content. Using a data mining algorithm, we will improve the navigation effectiveness of a website with minimal changes to its current structure; the approach improves a website rather than reorganizing it. Most complex web sites share aspects of all three types of information structures. Site hierarchy is created largely with standard navigational links within the site, but topical links embedded within the content create a weblike mesh of associative links that transcends the usual navigation and site structure. Except in sites that rigorously enforce a sequence of pages, users are likely to traverse a site in a free-form, weblike manner, jumping across regions in the information architecture, just as they would skip through chapters in a reference book. The clearer and more concrete the site organization is, the easier it is for users to jump freely from place to place without feeling lost (fig. 7).
Fig.8 Optimized path 
We will structure sites as hierarchies, but users seldom use them that way. A clear information structure allows the user to move freely and confidently through our site.
[1] Min Chen and Young U. Ryu, “Facilitating Effective User Navigation through Website Structure Improvement”, IEEE Transactions on Knowledge and Data Engineering, Vol. 25, No. 3, March 2013.
[2] G. N. Shinde and S. A. Inamdar, “Web Data Mining Using An Intelligent Information System Design”, Int. J. Comp. Tech. Appl., Vol. 2 (2), 280-283.
[3] Patrick J. Lynch and Sarah Horton, “Web Style Guide, 3rd Edition”.
[4] J. B. MacQueen (1967), “Some Methods for Classification and Analysis of Multivariate Observations”, Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, 1:281-297.
[5] Brian T. Luke, “K-Means Clustering”.