Personalised information access systems use historical feedback data, such as implicit and explicit ratings for textual documents and other items, to better locate the right or relevant information for individual users.
Three topics in personalised information access are addressed: learning from relevance feedback and document categorisation using concept-based text representations, the need for scalable and accurate algorithms for collaborative filtering, and the integration of textual and collaborative information access.
Two concept-based representations are investigated, both of which map a sparse high-dimensional term space to a dense concept space. For learning from relevance feedback, it is found that the representation combined with the proposed learning algorithm can improve the results of novel queries when those queries are more elaborate than a few terms. For document categorisation, the representation is found useful as a complement to a traditional word-based one.
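As an illustration of what mapping a sparse term space to a dense concept space can look like, the following minimal sketch projects a bag-of-words document through a table of per-term concept weights. The function name `to_concept_space`, the `term_concepts` table and its values are hypothetical and chosen for illustration only; the abstract does not specify how the two investigated representations are constructed.

```python
def to_concept_space(doc_terms, term_concepts, n_concepts):
    """Project a sparse term-weight dict onto a dense concept vector.

    doc_terms:     {term: weight}, the sparse document representation
    term_concepts: {term: [w_0, ..., w_{n-1}]}, hypothetical per-term
                   concept weights (assumed given, e.g. learned elsewhere)
    n_concepts:    dimensionality of the dense concept space
    """
    dense = [0.0] * n_concepts
    for term, weight in doc_terms.items():
        # Terms without a concept entry contribute nothing.
        for k, w in enumerate(term_concepts.get(term, ())):
            dense[k] += weight * w
    return dense


# Illustrative weights only: two concepts, roughly "vehicles" and "fruit".
term_concepts = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
doc_vector = to_concept_space({"car": 2, "banana": 1}, term_concepts, 2)
```

In this sketch, documents that use different but related terms ("car", "automobile") end up close together in the concept space even though they share no terms.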
For collaborative filtering, two algorithms are proposed: the first for the case where there are a large number of users and items, and the second for use on a mobile device. It is demonstrated that memory-based collaborative filtering can be implemented more efficiently using inverted files, with equal or better accuracy, and that there is little reason to use the traditional in-memory vector approach when the data is sparse. An empirical evaluation of the algorithm for collaborative filtering on mobile devices shows that it can generate accurate predictions at high speed using a small amount of resources.
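The idea of memory-based collaborative filtering over inverted files can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm proposed in the thesis: it assumes cosine similarity between users, strictly positive ratings, and a similarity-weighted average for prediction, and all function names are hypothetical. The point it shows is that only the posting lists of items the target user has rated are visited, so users with no item in common are never touched.

```python
from collections import defaultdict
from math import sqrt


def build_index(ratings):
    """Build an inverted file from (user, item, rating) triples."""
    postings = defaultdict(list)  # item -> [(user, rating), ...]
    profiles = defaultdict(dict)  # user -> {item: rating}
    for user, item, rating in ratings:
        postings[item].append((user, rating))
        profiles[user][item] = rating
    # Vector norm of each user's ratings, needed for cosine similarity.
    norms = {u: sqrt(sum(r * r for r in p.values()))
             for u, p in profiles.items()}
    return postings, profiles, norms


def similarities(target, postings, profiles, norms):
    """Cosine similarity between target and every user sharing an item."""
    dots = defaultdict(float)
    for item, r_target in profiles[target].items():
        # Walk only the posting lists of items the target has rated.
        for user, r_user in postings[item]:
            if user != target:
                dots[user] += r_target * r_user
    return {u: d / (norms[target] * norms[u]) for u, d in dots.items()}


def predict(target, item, postings, profiles, norms):
    """Similarity-weighted average of neighbours' ratings for item."""
    sims = similarities(target, postings, profiles, norms)
    num = den = 0.0
    for user, rating in postings.get(item, []):
        s = sims.get(user, 0.0)
        if s > 0:
            num += s * rating
            den += s
    return num / den if den else None  # None: no usable neighbour


# Made-up ratings for illustration.
ratings = [
    ("a", "x", 5), ("a", "y", 3),
    ("b", "x", 5), ("b", "y", 3), ("b", "z", 4),
    ("c", "x", 1), ("c", "z", 2),
]
postings, profiles, norms = build_index(ratings)
prediction = predict("a", "z", postings, profiles, norms)
```

With sparse data, the work per query is proportional to the lengths of the visited posting lists rather than to the full number of users times items, which is the intuition behind preferring this layout over dense in-memory vectors.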
For integration, a system architecture is proposed where various combinations of content-based and collaborative filtering can be implemented. The architecture is general in the sense that it provides an abstract representation of documents and user profiles, and provides a mechanism for incorporating new retrieval and filtering algorithms at any time.
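One way to picture such an architecture is sketched below: documents and user profiles are abstract feature containers, and filtering algorithms are registered as interchangeable scoring functions whose outputs are combined by a weighted average. The class names `Document`, `UserProfile` and `FilterRegistry`, and the weighted-average combination, are assumptions made for illustration; the abstract does not describe the actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class Document:
    """Abstract document: an id plus arbitrary named features."""
    doc_id: str
    features: Dict[str, float] = field(default_factory=dict)


@dataclass
class UserProfile:
    """Abstract user profile: an id plus arbitrary named features."""
    user_id: str
    features: Dict[str, float] = field(default_factory=dict)


# A filtering algorithm is any function scoring a document for a user.
Scorer = Callable[[UserProfile, Document], float]


class FilterRegistry:
    """Holds pluggable scorers; new ones can be registered at any time."""

    def __init__(self) -> None:
        self._scorers: Dict[str, Tuple[Scorer, float]] = {}

    def register(self, name: str, scorer: Scorer, weight: float = 1.0) -> None:
        self._scorers[name] = (scorer, weight)

    def score(self, profile: UserProfile, doc: Document) -> float:
        """Weighted average of all registered scorers (hybrid filtering)."""
        total = sum(w for _, w in self._scorers.values())
        if total == 0:
            return 0.0
        return sum(w * s(profile, doc)
                   for s, w in self._scorers.values()) / total


# Placeholder scorers standing in for content-based and collaborative ones.
registry = FilterRegistry()
registry.register("content", lambda p, d: 1.0, weight=1.0)
registry.register("collaborative", lambda p, d: 0.0, weight=3.0)
hybrid = registry.score(UserProfile("u1"), Document("d1"))
```

Because each algorithm is only seen through the `Scorer` signature, a purely content-based, purely collaborative, or mixed configuration differs only in which scorers are registered and with what weights.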
In conclusion, this thesis demonstrates that information access systems can be personalised using scalable and accurate algorithms and representations for the increased benefit of the user.
Kista: Institutionen för data- och systemvetenskap (tills m KTH), 2005. 126 p.