In this article we describe the initial investigations that we have conducted on student data collected from a web-based tutoring tool. We have used data mining techniques such as association rules and symbolic data analysis, as well as traditional SQL queries, to gain further insight into the students' learning and to deduce information to improve teaching. In our work, applying data mining facilities serves two purposes: (a) to better understand both how students grasp the tool and how they assimilate the knowledge they need to learn and (b) to get pedagogically relevant information that may influence or help improve teaching.
**********
With the emergence of e-learning, flexible education, and the increasing number of students in some fields, online teaching tools are becoming more and more important. Online teaching tools provide a more or less personalised environment where learners can learn at their own pace, have access to tutorial lessons, practice exercises, be given explanations and feedback on their performance, and so on. These benefits to the learners are extremely valuable and we are witnessing "a quiet revolution taking place in the classrooms" (Forster, 2002). However, less attention has been given to the reflection and monitoring that can be carried out to improve teaching. Since online teaching tools are computer-based, they allow storing complete student answers, including mistakes made while solving exercises. The fact that they are online tools means that all this information, for all students using the tool, can be stored on a common server rather than stored locally. Having electronic access to complete student answers makes it possible to extract pedagogically relevant information and provide feedback to the teacher about how a class, a group of students, or an individual student is progressing. It also makes it possible to get more insight into how students get along with the tool and the content.
The Logic-ITA is a web-based Intelligent Teaching Assistant system that is currently used within the School of Information Technologies, University of Sydney, for an undergraduate course on formal languages and logic. Its aim is to facilitate the whole teaching and learning process by helping the teacher as well as the learner. It allows students to practice formal proofs in propositional logic while receiving feedback, and it also keeps the lecturer informed about the progress the class is making and the problems encountered. The system embeds the Logic Tutor, a web-based intelligent tutoring system intended for students that stores their complete work including mistakes, along with tools dedicated to the teacher for managing teaching configuration settings and material, as well as collecting and analysing data. A multimedia article on the Logic Tutor can be found in Abraham, Crawford, Lesta, Merceron, and Yacef (2001) and a description of the Logic-ITA in Lesta and Yacef (2002). We are now extending the capabilities of the system to provide more information and more intelligent help to the teacher. First results in this direction can be found in Merceron and Yacef (2003).
In this article we investigate the impact that the analysis of the data collected in such a tool can have on the whole process of teaching and learning. We use data mining techniques on the data stored by the Logic-ITA to better investigate the impact on learning and to improve teaching. More precisely, symbolic data analysis allows us to gain further insight into the students' learning, and association rules, which we apply to mistakes, open new perspectives for improving teaching. With e-learning, complete student answers will be more and more available in electronic format. This work shows some possibilities of what can be done with them.
There is an increasing interest in providing assistance to the human teacher and in integrating him or her formally into the loop (Jean, 2000; Kinshuk, Patel, Oppermann, & Russell, 2001; Kinshuk, Hong, & Patel, 2001; Leroux, Vivet, & Brezillon, 1996; Virvou & Moundridou, 2001; Vivet, 1992; Yacef, 2002) and this is supported by the combination of computational intelligence with web-based education (Calvo & Grandbastien, 2003; Vasilakos, Devedzic, Kinshuk, & Pedrycz, 2004; Yacef, 2003). However, in terms of help in diagnosis and assessment of learning, analysis, and synthesis of results, not as much work has been done. Implicative statistical analysis (Gras et al., 1996; Gras, Briand, Peter, & Philippe, 1997) has been developed to extract information from data gathered among students. It is supported by the C.H.I.C. software. C.H.I.C. accepts standard data where students are described in a homogeneous way. This is not exactly what happens with the data we get from the Logic Tutor, since students do not necessarily attempt the same exercises, nor the same number of exercises. Jean's (2000) PepiDiag and PepiProfil system, like the Logic-ITA, collects data from students' exercises, reports them to the teacher, and provides tools to analyse these results. One main difference is that it processes one student's data at a time, whereas the Logic-ITA combines data from all students. The tool OASIS (Smail & Hussmann, 2003) is similar to the Logic-ITA in that it stores complete answers of students, including wrong answers, and provides extensive statistics. However, it does not have a tutoring facility and provides only a yes/no answer to students when they enter a result for an exercise. No mistake diagnosis is provided.
In this article we describe the investigation that we have conducted on the data collected from such a system, how this data is used to gain further insight into the students' learning, and how it can impact the teaching. The article is organized as follows. First we will present the student data manipulated and stored in the Logic-ITA, both on the student's side, the Logic Tutor, which allows students to practice logical proofs, and on the teacher's side, which structures all answers entered by all students for the teacher. Then we will explain the impact of mining the data in the Logic-ITA from a learning perspective, looking at the correlation between exam performance and Logic Tutor activity as well as using symbolic data analysis. We then describe the impact of mining the data in the Logic-ITA from a teaching perspective, by extracting pedagogically relevant information through SQL queries and association of mistakes. Finally, we conclude the article.
STUDENT DATA IN THE LOGIC-ITA
From the Student's Side: The Logic Tutor
The Logic Tutor is an online Intelligent Tutoring System (Abraham et al., 2001) allowing students to build formal proofs in propositional logic while receiving step-by-step, contextualised feedback. It uses a conventional interface allowing forward and sequential construction of proofs, as opposed to other computerised educational systems for this domain such as Croy (1989, 1999) and Scheines and Sieg (1994). We do not hold any particular argument for the style of interface. We just kept the same interface as the one used in previous years without the Logic Tutor. Our aim was to design an Intelligent Teaching Assistant system and then evaluate its usefulness using the previous year as a control group. It would have been more difficult to interpret results if we had changed the style of the interface.
We will describe the relevant data that the Logic Tutor stores in each student model in the context of an exercise, since this is the input data to the mining methods that we will describe in the following sections.
Exercises start with a given set of premises, that is, a set of well-formed formulae (wff) of propositional logic, and exactly one wff, the conclusion. The task then consists of deriving the conclusion from the premises, step-by-step, using laws of equivalence and rules of inference (we will refer to both of these as rules for the rest of this article). Figure 1 shows a screen shot of the interface. Here the student was given the first two lines (lines 0 and 1) and the conclusion at the bottom left corner, that is, "C." For each step, the student must fill out a new line, entered at the bottom of the screen. The student needs to do the following:
1. enter a formula in the Formula section;
2. choose, from a pop-up menu, the rule used to derive this formula from one or more previous line(s) (Rules);
3. enter the references of those previous lines (Line References); and
4. enter the premises the formula relies on (Premises).
For example, in Figure 1, the student is currently deriving the formula "C" by applying the rule "Indirect Proof" to the formulae of lines 2 and 7. Because lines 2 and 7 rely respectively on premises {2} and {0,1,2} (as can be seen in the first column of the screen) and Indirect Proof removes the premise 2, the line entered relies on premises {0,1}. It is actually the last step of this exercise, deriving the conclusion.
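The premise bookkeeping in this example can be illustrated with a minimal sketch. It is not the Logic Tutor's implementation: the `ProofLine` class, the `expected_premises` function, and their field names are hypothetical, and the formulas and rules of lines 2 and 7 are placeholders because the text does not give them. The sketch assumes only what the example states, namely that a derived line relies on the union of the premises of the lines it references, minus any premise the rule removes.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class ProofLine:
    """One proof line, mirroring the four fields the student fills in (hypothetical)."""
    formula: str                 # Formula
    rule: str                    # Rules
    line_refs: Tuple[int, ...]   # Line References
    premises: FrozenSet[int]     # Premises


def expected_premises(
    proof: Dict[int, ProofLine],
    line_refs: Iterable[int],
    discharged: Iterable[int] = (),
) -> FrozenSet[int]:
    """Union of the premises of the referenced lines, minus removed premises."""
    combined: FrozenSet[int] = frozenset()
    for ref in line_refs:
        combined |= proof[ref].premises
    return combined - frozenset(discharged)


# Lines 2 and 7 of the Figure 1 example. Formulas and rules are placeholders;
# only the premise sets {2} and {0,1,2} come from the text.
proof = {
    2: ProofLine("<formula on line 2>", "<rule>", (), frozenset({2})),
    7: ProofLine("<formula on line 7>", "<rule>", (), frozenset({0, 1, 2})),
}

# Indirect Proof applied to lines 2 and 7 removes premise 2.
last_step_premises = expected_premises(proof, (2, 7), discharged={2})
print(sorted(last_step_premises))  # [0, 1]
```

Representing premises as sets keeps the union and removal steps explicit, which matches the way the example reasons about the last line of the proof.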
There are often many ways to prove an argument valid. The important aspect is that the reasoning must be sound. The actual path followed is not important, as long as each step is valid. In this regard, our approach is less sophisticated than one such as Model-Tracing (Anderson, Corbett, Koedinger, & Pelletier, 1995), which identifies the student's reasoning by checking his/her answers against predetermined solutions. The Logic Tutor instead assesses...