CAB431 Text Analysis and Web Search
Unit code: | CAB431 |
---|---|
Prerequisite(s): | CAB201 or ITD121 |
Credit points: | 12 |
Timetable | Details in HiQ, if available |
CSP student contribution | $1,164 |
Domestic tuition unit fee | $4,356 |
International unit fee | $5,172 |
Unit Outline: Semester 1 2025, Gardens Point, Internal
Unit code: | CAB431 |
---|---|
Credit points: | 12 |
Pre-requisite: | CAB201 or ITD121 or CAB220 or DSB100 or CAB202 or EGB103 |
Coordinator: | Yuefeng Li | y2.li@qut.edu.au |
Overview
With the explosion of information resources on the Web, social media and corporate intranets, there is a pressing need for advanced technologies to help people deal with big text data. Web search and text analysis have many practical applications, such as classifying news stories, academic papers or medical records; filtering spam or junk email; understanding customer opinions or behaviours through feedback in online systems or social media; and customer service promotion. There is therefore an urgent need for IT developers, Web analysts, information management consultants, and Web development and support officers to understand NLP (Natural Language Processing) techniques; popular text processing models (such as Web search engines and information retrieval models); advanced text mining techniques (such as supervised methods for information filtering or classification and unsupervised methods for topic modelling); and future directions in Web Intelligence.
Learning Outcomes
On successful completion of this unit you will be able to:
- Understand, write, and explain fundamental Web search models, theories, techniques and algorithms. (GC1)
- Design Web search solutions for user information needs. (GC1, GC2)
- Demonstrate knowledge of advanced text analysis techniques for information filtering, text classification, and topic modelling for text feature selection. (GC1, GC4)
- Demonstrate knowledge of the principles and techniques for evaluating the performance of text analysis systems. (GC1, GC3)
- Work independently or in a team to implement a major text analysis project. (GC2, GC3, GC5)
Content
The unit content covers both the theory and the practice of Web search development and text analysis research. It also discusses two important ICT problem-solving approaches: search-based and artificial intelligence (AI)-based approaches. Topics covered in the unit include information retrieval models, techniques and algorithms; text processing; supervised learning from user feedback to understand user information needs for information filtering and classification; unsupervised learning methods for topic modelling and/or text feature selection; and evaluation of Web search and text mining systems.
Learning Approaches
This unit has three contact hours per week organised in lectures and practical activities. Students are expected to actively participate in the lectures and take part in the practical activities. Lecture notes and practical questions will be made available weekly through CAB431 Canvas.
Feedback on Learning and Assessment
You can obtain feedback on your progress throughout the unit through the following mechanisms:
- You can ask the teaching team questions during the lectures or via email.
- You will receive a detailed marking criteria sheet for each assignment.
- The unit coordinator or tutor will be available during consultation hours to provide constructive feedback on assessments upon completion.
Assessment
Overview
Criterion-Referenced Assessment: appropriate assessment criteria will be made available to students in the introduction to the assignments.
Unit Grading Scheme
7-point scale
Assessment Tasks
Assessment: Portfolio
A portfolio of work completed during the semester, including both practical programming exercises and comments or posted contributions on theoretical topics or questions posed by lecturers.
This is an assignment for the purposes of an extension.
Assessment: Project (applied)
A major text analysis project, which may be undertaken in a team of 2 or 3.
This is an assignment for the purposes of an extension.
Assessment: Examination (written)
A written examination covering the material presented in the lectures throughout the semester.
Academic Integrity
Academic integrity is a commitment to undertaking academic work and assessment in a manner that is ethical, fair, honest, respectful and accountable.
The Academic Integrity Policy sets out the range of conduct that can be a failure to maintain the standards of academic integrity. This includes cheating in exams, plagiarism, self-plagiarism, collusion and contract cheating. It also includes providing fraudulent or altered documentation in support of an academic concession application, for example an assignment extension or a deferred exam.
You are encouraged to make use of QUT’s learning support services, resources and tools to assure the academic integrity of your assessment. This includes the use of text matching software that may be available to assist with self-assessing your academic integrity as part of the assessment submission process.
Breaching QUT’s Academic Integrity Policy or engaging in conduct that may defeat or compromise the purpose of assessment can lead to a finding of student misconduct (Code of Conduct – Student) and result in the imposition of penalties under the Management of Student Misconduct Policy, ranging from a grade reduction to exclusion from QUT.
Resources
Suggested Textbook(s):
Search Engines: Information Retrieval in Practice. W. Bruce Croft, Donald Metzler, Trevor Strohman. 2010. Also freely available at http://ciir.cs.umass.edu/irbook/
Risk Assessment Statement
There are no unusual health or safety risks associated with this unit.
Unit Outline: Semester 1 2025, Online
Unit code: | CAB431 |
---|---|
Credit points: | 12 |
Pre-requisite: | CAB201 or ITD121 or CAB220 or DSB100 or CAB202 or EGB103 |
Overview
With the explosion of information resources on the Web, social media and corporate intranets, there is a pressing need for advanced technologies to help people deal with big text data. Web search and text analysis have many practical applications, such as classifying news stories, academic papers or medical records; filtering spam or junk email; understanding customer opinions or behaviours through feedback in online systems or social media; and customer service promotion. There is therefore an urgent need for IT developers, Web analysts, information management consultants, and Web development and support officers to understand NLP (Natural Language Processing) techniques; popular text processing models (such as Web search engines and information retrieval models); advanced text mining techniques (such as supervised methods for information filtering or classification and unsupervised methods for topic modelling); and future directions in Web Intelligence.
Learning Outcomes
On successful completion of this unit you will be able to:
- Understand, write, and explain fundamental Web search models, theories, techniques and algorithms. (GC1)
- Design Web search solutions for user information needs. (GC1, GC2)
- Demonstrate knowledge of advanced text analysis techniques for information filtering, text classification, and topic modelling for text feature selection. (GC1, GC4)
- Demonstrate knowledge of the principles and techniques for evaluating the performance of text analysis systems. (GC1, GC3)
- Work independently or in a team to implement a major text analysis project. (GC2, GC3, GC5)
Content
The unit content covers both the theory and the practice of Web search development and text analysis research. It also discusses two important ICT problem-solving approaches: search-based and artificial intelligence (AI)-based approaches. Topics covered in the unit include information retrieval models, techniques and algorithms; text processing; supervised learning from user feedback to understand user information needs for information filtering and classification; unsupervised learning methods for topic modelling and/or text feature selection; and evaluation of Web search and text mining systems.
Learning Approaches
This unit has three contact hours per week organised in lectures and practical activities. Students are expected to actively participate in the lectures and take part in the practical activities. Lecture notes and practical questions will be made available weekly through CAB431 Canvas.
Feedback on Learning and Assessment
You can obtain feedback on your progress throughout the unit through the following mechanisms:
- You can ask the teaching team questions during the lectures or via email.
- You will receive a detailed marking criteria sheet for each assignment.
- The unit coordinator or tutor will be available during consultation hours to provide constructive feedback on assessments upon completion.
Assessment
Overview
Criterion-Referenced Assessment: appropriate assessment criteria will be made available to students in the introduction to the assignments.
Unit Grading Scheme
7-point scale
Assessment Tasks
Assessment: Portfolio
A portfolio of work completed during the semester, including both practical programming exercises and comments or posted contributions on theoretical topics or questions posed by lecturers.
This is an assignment for the purposes of an extension.
Assessment: Project (applied)
A major text analysis project, which may be undertaken in a team of 2 or 3.
This is an assignment for the purposes of an extension.
Assessment: Examination (written)
A written examination covering the material presented in the lectures throughout the semester.
Academic Integrity
Academic integrity is a commitment to undertaking academic work and assessment in a manner that is ethical, fair, honest, respectful and accountable.
The Academic Integrity Policy sets out the range of conduct that can be a failure to maintain the standards of academic integrity. This includes cheating in exams, plagiarism, self-plagiarism, collusion and contract cheating. It also includes providing fraudulent or altered documentation in support of an academic concession application, for example an assignment extension or a deferred exam.
You are encouraged to make use of QUT’s learning support services, resources and tools to assure the academic integrity of your assessment. This includes the use of text matching software that may be available to assist with self-assessing your academic integrity as part of the assessment submission process.
Breaching QUT’s Academic Integrity Policy or engaging in conduct that may defeat or compromise the purpose of assessment can lead to a finding of student misconduct (Code of Conduct – Student) and result in the imposition of penalties under the Management of Student Misconduct Policy, ranging from a grade reduction to exclusion from QUT.
Resources
Suggested Textbook(s):
Search Engines: Information Retrieval in Practice. W. Bruce Croft, Donald Metzler, Trevor Strohman. 2010. Also freely available at http://ciir.cs.umass.edu/irbook/
Risk Assessment Statement
There are no unusual health or safety risks associated with this unit.