CAB431 Text Analysis and Web Search



Unit Outline: Semester 1 2024, Gardens Point, Internal

Unit code: CAB431
Credit points: 12
Pre-requisite: CAB201 or ITD121
Coordinator: Yuefeng Li | y2.li@qut.edu.au
Disclaimer - Offer of some units is subject to viability, and information in these Unit Outlines is subject to change prior to commencement of the teaching period.

Overview

With the explosion of information resources on the Web, social media and corporate intranets, there is a pressing need for advanced technologies to help people deal with big text data. Web search and text analysis have many practical applications, such as classifying news stories, academic papers or medical records; filtering spam or junk email; understanding customer opinions or behaviours through their feedback in online systems or social media; and customer service promotion. It is therefore important for IT developers, Web analysts, information management consultants, and Web development and support officers to understand NLP (Natural Language Processing) techniques; popular text processing models (such as Web search engines and information retrieval models); advanced text mining techniques (such as supervised methods for information filtering or classification and unsupervised methods for topic modelling); and future directions in Web Intelligence.

Learning Outcomes

On successful completion of this unit you will be able to:

  1. Understand, write, and explain fundamental Web search models, theories, techniques and algorithms. (GC1)
  2. Design Web search solutions for user information needs. (GC1, GC2)
  3. Demonstrate knowledge of advanced text analysis techniques for information filtering, text classification, and topic modelling for text feature selection. (GC1, GC4)
  4. Demonstrate knowledge of the principles and techniques of evaluating the performance of text analysis systems. (GC1, GC3)
  5. Work independently or in a team to implement a major text analysis project. (GC2, GC3, GC5)

Content

The unit content covers both the theory and the practice of Web search development and text analysis research. It also discusses two important ICT problem-solving approaches: search-based and artificial intelligence (AI)-based. Topics covered in the unit include information retrieval models, techniques and algorithms; text processing; supervised learning from user feedback to understand user information needs for information filtering and classification; unsupervised learning methods for topic modelling and/or text feature selection; and evaluation of Web search and text mining systems.

Learning Approaches

This unit has three contact hours per week, organised into lectures and practical activities. Students are expected to actively participate in the lectures and take part in the practical activities. Lecture notes and practical questions will be made available weekly through the CAB431 Canvas site.

Feedback on Learning and Assessment

You can obtain feedback on your progress throughout the unit through the following mechanisms:
  · You can ask the teaching team questions during the lectures or via email.
  · You will receive a detailed marking criteria sheet for each assignment.
  · The unit coordinator or tutor will be available during consultation hours to provide constructive feedback on assessments upon completion.

Assessment

Overview

Criterion-Referenced Assessment: Appropriate assessment criteria will be made available to students in the introduction to the assignments.

Unit Grading Scheme

7-point scale

Assessment Tasks

Assessment: Portfolio

A portfolio of work completed during the semester, including both practical programming exercises and comments or posting of contributions to theoretical topics or questions posed by lecturers.

This is an assignment for the purposes of an extension.

Weight: 20%
Individual/Group: Individual
Due (indicative): Weekly
Related Unit learning outcomes: 1, 2, 3

Assessment: Project (applied)

A major text analysis project, which may be undertaken in a team of 2 or 3.

This is an assignment for the purposes of an extension.

Weight: 35%
Individual/Group: Individual and group
Due (indicative): Week 13
Related Unit learning outcomes: 1, 2, 3, 4, 5

Assessment: Examination (written)

A written examination, which covers the material presented in the lectures throughout the semester.

Weight: 45%
Individual/Group: Individual
Due (indicative): End of Semester
Related Unit learning outcomes: 1, 2, 3, 4

Academic Integrity

Students are expected to engage in learning and assessment at QUT with honesty, transparency and fairness. Maintaining academic integrity means upholding these principles and demonstrating valuable professional capabilities based on ethical foundations.

Failure to maintain academic integrity can take many forms. It includes cheating in examinations, plagiarism, self-plagiarism, collusion, and submitting an assessment item completed by another person (e.g. contract cheating). It can also include providing your assessment to another entity, such as to a person or website.

You are encouraged to make use of QUT’s learning support services, resources and tools to assure the academic integrity of your assessment. This includes the use of text matching software that may be available to assist with self-assessing your academic integrity as part of the assessment submission process.

Further details of QUT’s approach to academic integrity are outlined in the Academic integrity policy and the Student Code of Conduct. Breaching QUT’s Academic integrity policy is regarded as student misconduct and can lead to the imposition of penalties ranging from a grade reduction to exclusion from QUT.

Resources

Suggested Textbook(s):
Search Engines: Information Retrieval in Practice. W. Bruce Croft, Donald Metzler, Trevor Strohman. 2010. Also freely available at http://ciir.cs.umass.edu/irbook/

Risk Assessment Statement

There are no unusual health or safety risks associated with this unit.