A Study of Fill-in-blank Style Problems for Java and C Programming Learning Assistant System

(1)

A Study of Fill-in-blank Style Problems for Java and C Programming Learning Assistant System

September, 2021

Htoo Htoo Sandi Kyaw

Graduate School of

Natural Science and Technology

(Doctor’s Course)

(2)

(3)

Dissertation submitted to

Graduate School of Natural Science and Technology of

Okayama University for

partial fulﬁllment of the requirements for the degree of

Doctor of Philosophy in Engineering.

Written under the supervision of Professor Nobuo Funabiki

and co-supervised by Professor Satoshi Denno

and

Professor Yasuyuki Nogami

(4)

(5)

T o W hom I t M ay C oncern

We hereby certify that this is a typical copy of the original doctor thesis of Ms. Htoo Htoo Sandi Kyaw

Signature of Seal of

the Supervisor

Graduate School of

(6)

(7)

Abstract

Nowadays, computers have been broadly used to control, manage, support a variety of services, infrastructures, institutions, and systems in our societies due to the flexibility of the offering functions. This flexibility comes from the programming codes that have been developed for computers.

Programming codes can realize a versatile functionality at a computer. The quality of the programming codes determines the eﬀectiveness of the computer. Therefore, the eﬃcient education of computer programming using programming languages has become the very important issue for a lot of universities and professional schools, particularly, in information technology (IT) or its related departments.

As the reliable and portable object-oriented programming language, Java programming has been extensively used to implement a variety of practical systems. A large number of universities and professional schools are offering Java programming courses to meet these needs. To assist Java programming educations in schools, we have developed the Web-basedJava Programming Learning Assistant System (JPLAS). JPLAS offers various types of exercise problems at different levels such that the correctness of any answer from a student is automatically verified.

As one exercise problem, JPLAS provides theelement fill-in-blank problem (EFP)to support self-studies of novice students to learn the Java grammar and basic programming through code reading. In the EFP instance, a source code with several blank elements is shown to a student, where he/she needs to fill in the blanks. The answer is marked through string matchingwith the correct one. Anelementis defined as the least unit of a code, such as a reserved word, an identifier, and a control symbol. To generate feasible EFP instances from source codes automatically, the graph-based blank element selection algorithm has been proposed to select the blank elements that offer only the unique correct answers.

However, application results of the EFP instances to students have found that students can solve them mechanically without reading source codes carefully and understanding their behaviors correctly, although they were useful for novice students. The correct answer choice is usually limited for each blank if it follows the grammar. Therefore, in this thesis, I present the study of ﬁll-in-blank style exercise problems with ﬁve contributions on improvements or advancements of EFP in JPLAS.

In the ﬁrst contribution of the thesis, I propose the code completion problem (CCP) as the harder exercise problem in JPLAS. To generate the CCP instance, the elements selected by the blank element selection algorithmare simply removed from the source code to be given to students.

Thus, in contrast to EFP, the CCP instance does not show where the missing elements exist in the source code, and asks students to complete each statement by filling in the missing elements at the proper positions. Whenstring matchingfinds that the whole statement is equal to the original one, it is regarded as the correct answer. For evaluations, I generated CCP instances and applied them to several university students. The results confirmed that CCP is harder than EFP.

In the second contribution, I propose the element ﬁxing problem (EXP) and the code ﬁxing

(8)

problem (CXP)in JPLAS. To generate the EXP or CXP instance, the elements selected by theblank element selection algorithmare replaced by the similar but incorrect elements that will be found by the error injection algorithm. In the answer interface of JPLAS, the input form is prepared for each incorrect element for EXP, whereas it is prepared for the whole statement containing an incorrect element for CXP. For evaluations, I generated EXP and CXP instances and applied them to university students. The results show that CXP is harder than EXP.

In the third contribution, I propose thecode amendment problem (CAP)in JPLAS as a mixture of CCP and CXP. To generate the CAP instance, either ofblankorerroris randomly selected with the given blank probability BP for each element that is selected by the blank element selection algorithm. For evaluations, I generated CAP instances with diﬀerent BP values and assigned them to students. The results show thatBP=50% gives the most diﬃcult CAP, and CAP is harder than CCP and CXP.

In the fourth contribution, I study the EFP in C Programming Learning Assistant System (CPLAS)by extending the works for JPLAS.C programminghas been considered to be the most fundamental programming language, and has been educated at early stages of learning programming courses in a lot of schools. To automatically generate an EFP instance for CPLAS, the graph-based blank element selection algorithm is newly designed and implemented for C pro- gramming. For evaluations, I generated EFP instances for CPLAS using source codes for basic grammar concepts and fundamental data structure and algorithms, and assigned them to students.

The results confirm the effectiveness of EFP in detecting the students who have difficulty in study- ingC programmingand the hard topics for them.

In the last contribution, I study the CCP in CPLAS by extending the works for CCP in JPLAS and for EFP in CPLAS. For evaluations, I generated CCP instances, and assigned them to students.

The results conﬁrm the eﬀectiveness of CCP in self-study ofC programmingby novice students.

To improve self-study environments ofC programmingby students, a lot of exercise problems should be oﬀered in CPLAS. Thus, in future studies, I will extend other types of exercise problems in JPLAS to those in CPLAS. Besides, other programming languages such as Python and JavaScript have become important for students to study. Thus, I will also extend theblank element selection algorithmfor these programming languages to help students learn the grammar and basic programming. Then, I will continue to assign generated problems to students in programming courses.

ii

(9)

(10)

Acknowledgements

It is my great pleasure to express my heartiest thankfulness to those who gave me the valuable time and supported me in making this dissertation possible. I believe that you are the greatest blessing in my life. Thanks all of you for supporting and guiding me.

I acknowledge my deepest sense of gratitude to my honorable supervisor, Professor Nobuo Funabiki for his excellent supervision, meaningful suggestions, persistent encouragements, and other fruitful help during each stage of my Ph.D. study. His thoughtful comments and guidance helped me to complete my research papers and present them in productive ways. Besides, he was always patient and helpful whenever his guidance and assistance were needed in both of my academic and daily life in Japan. I have really been lucky in working with a person like him.

Needless to say, it would not have been possible to complete this thesis without his guidance and active support.

I am indebted to my Ph.D. co-supervisors, Professor Satoshi Denno and Professor Yasuyuki Nogami, for taking the valuable time to give me advice, guidance, insightful comments, and proof- reading of this thesis.

I want to express my profound gratitude to Associate Professor Minoru Kuribayashi in Okayama University for his valuable discussions during my research. I would like to take this opportunity to thank and convey my respect to all of my course teachers during my Ph.D. study for sharing their great ideas and knowledge with me.

I would like to acknowledge the Ministry of Education, Culture, Sports, Science, and Technol- ogy of Japan (MEXT) for ﬁnancially supporting my Ph.D. study, and Dr. Aung Win and Professor Hnin Aye Thant from University of Technology (Yatanarpon Cyber City) for supporting documents for my Ph.D study.

I would like to thank for the fruitful discussions and cooperations with many people including Dr. Khin Khin Zaw and Dr. Nobuya Ishihara. I would like to convey my respect to all the members of FUNABIKI Lab for their support during the period of this study.

Last but not least, I am grateful to my beloved parents, sister and brother who always encourage and support me througout my life. For your unconditional love, supports, patience, and conﬁdence in me are the biggest rewards as well as the driving forces of my life.

Htoo Htoo Sandi Kyaw Okayama University, Japan September 2021

iv

(11)

(12)

List of Publications

Journal Paper

1. Htoo Htoo Sandi Kyaw, Su Sandy Wint, Nobuo Funabiki, and Wen-Chung Kao, “A code completion problem in Java programming learning assistant system,” IAENG International Journal of Computer Science (IJCS), vol. 47, no. 3, pp. 350-359, Sept 2020.

2. Htoo Htoo Sandi Kyaw, Nobuo Funabiki, and Wen-Chung Kao, “A proposal of code amendment problem in Java programming learning assistant system,” International Journal of Information and Education Technology (IJIET), vol. 10, No. 10, pp. 751-756, October 2020.

3. Htoo Htoo Sandi Kyaw, Nobuo Funabiki, Shune Lae Aung, Nem Khan Dim, and Wen- Chung Kao, “A study of element ﬁll-in-blank problems for C programming learning assistant system,” International Journal of Information and Education Technology (IJIET), vol. 11, no. 6, pp. 255-261, 2021.

International Conference Paper

4. Htoo Htoo Sandi Kyaw, Nobuo Funabiki, Nobuya Ishihara, Minoru Kuribayashi, and Wen- Chung Kao, “Web-server implementation of code completion problem for Java programming learning assistant system,” in Proceeding of the 2019 IEEE International Conference on Consumer Electronics - Taiwan (ICCE-TW 2019), May 20-22, 2019.

5. Htoo Htoo Sandi Kyaw, Nobuo Funabiki, and Minoru Kuribayashi, “An implementation of oﬄine answering function for code completion problem in C programming learning assistant system,” in proceeding of the IEEE 3rd Global Conference on Life Sciences and Technologies, pp. 162-165, March 2021.

6. Htoo Htoo Sandi Kyaw, Ei Ei Htet, Nobuo Funabiki, Minoru Kuribayashi, Thunder Myint, Phyu Phyu Tar, Nandar Win Min, Hnin Aye Thant, and Phyu Hnin Wai, “A code completion problem in C programming learning assistant system,” in proceeding of the 9th International Conference on Information and Education Technology (ICIET), pp. 34-40, March 2021.

7. Nobuo Funabiki, Htoo Htoo Sandi Kyaw, Khin Khin Zaw, “A Proposal of Element/Code Fixing Problem in Java Programming Learning Assistant System,” in proceeding of the In- ternational Conference of Science and Engineering (ICSE).

vi

(13)

Other Papers

8. Htoo Htoo Sandi Kyaw, Nobuo Funabiki, Minoru Kuribayashi, and Khin Khin Zaw, “Three improvements for code completion problem in Java programming learning assistant system,”

in proceeding of the IPSJ Technical Report, 2018-4-13, Jan. 2019.

9. Htoo Htoo Sandi Kyaw, Nobuo Funabiki, Nobuya Ishihara, Minoru Kuribayashi, and Khin Khin Zaw, “Implementation of code completion problem in Online Java programming learning assistant system,” in proceeding of the IEICE General Conference, pp. 93-94, Mar. 2019.

10. Htoo Htoo Sandi Kyaw, Nobuo Funabiki and Minoru Kuribayashi, “An Implementation of Hint Function for Code Completion Problem in Java Programming Learning Assistant System,” in proceeding of the FIT Conference, pp. 307-308, Sept. 2019.

11. Htoo Htoo Sandi Kyaw, Nobuo Funabiki, and Minoru Kuribayashi, “A study of element ﬁll-in-blank problem with blank element selection algorithm for C programming learning assistant system,” in proceeding of the IEICE Technical Report, ET2020-14, pp. 23-28, Sept 2020.

12. Htoo Htoo Sandi Kyaw, Nobuo Funabiki, Minoru Kuribayashi, Thandar Myint, and Hnin Aye Thant, “A Study of Code Completion Problem for C Programming Learning Assistant System,” in proceeding of the IEICE Technical Report, LOIS, pp. 69-74.

13. Htoo Htoo Sandi Kyaw, Nobuo Funabiki, Thandar Myint, Phyu Phyu Tar, Nandar Win Min, Hnin Aye Thant, Shune Lae Aung, and Nem Khan Dim, “Applications of element ﬁll-in-blank problems for C programing learning in three universities,” FIT. Conference.

[submitted]

(14)

(15)

List of Figures

2.1 Software platform for JPLAS. . . 5

2.2 Operation ﬂow for oﬄine answering function. . . 6

2.3 MyMathclass source code. . . 8

2.4 MyMathclass test code. . . 9

4.1 FibonacciCalculatorclass source code. . . 15

4.2 FibonacciCalculatorclass source code after blank elements selection and removal. 16 4.3 JavaScriptfunction to allow tab in web browser. . . 18

4.4 Interface of problem answering in oﬄine JPLAS. . . 19

4.5 Correlation between diﬃculty level and correct rate. . . 24

4.6 Correlation between diﬃculty level and correct rate. . . 26

5.1 PalindromeExampleclass source code. . . 28

5.2 PalindromeExampleclass source code after the application of blank element selection algorithm. . . 29

5.3 PalindromeExampleclass generated problem code. . . 29

5.4 Answer interface for EXP. . . 30

5.5 Answer interface for CXP. . . 31

6.1 PalindromeExampleclass source code. . . 34

6.2 PalindromeExampleclass source code after application of blank element selection algorithm. . . 35

6.3 PalindromeExampleclass generated problem code. . . 35

6.4 Answer Interface for CAP. . . 36

7.1 Example EFP instance. . . 39

7.2 Example source code. . . 39

7.3 Interaction between lexical analyzer and the parser. . . 40

7.4 Example simple source code. . . 41

7.5 Constraint graph forsample methodin Figure 7.4. . . 44

7.6 Compatibility graph forsample methodin Figure 7.4. . . 45

8.1 Example source code. . . 52

8.2 Example problem code. . . 53

8.3 CCP answer interface. . . 54

(16)

List of Tables

1.1 Student solution result of JPLAS. . . 2

2.1 Files for distribution. . . 7

4.1 Markers in problem data ﬁeld. . . 20

4.2 Instances for ﬁrst-step evaluation. . . 21

4.3 Summary of correct solution rates for ﬁrst-step evaluation. . . 21

4.4 T-testresult for ﬁrst-step evaluation. . . 22

4.5 Summary of correct solution rates for second-step evaluation. . . 23

4.6 Instances for third-step evaluation. . . 24

4.7 Summary of correct solution rates for third-step evaluation. . . 24

4.8 Instances for fourth-step evaluation. . . 25

4.9 Summary of correct solution rates for fourth-step evaluation. . . 25

5.1 Correct solution rates for EXP and CXP (%). . . 32

5.2 Correct solution rates for EFP and CCP (%). . . 32

6.1 Correct solution rates for CAP (%). . . 36

6.2 Correct solution rates for CCP and CXP (%). . . 37

7.1 Symbol information . . . 40

7.2 EFP instances for basic grammar concepts. . . 46

7.3 Correct answer rate distribution for basic grammar concepts. . . 46

7.4 Submission times distribution for basic grammar concepts. . . 47

7.5 Solving result in each EFP instance for basic grammar concepts. . . 48

7.6 EFP instances for data structures and algorithms. . . 49

7.7 Correct answer rate distribution for data structures and algorithms. . . 49

7.8 Submission times distribution for data structures and algorithms. . . 49

7.9 Solving result in each EFP instance for data structures and algorithms. . . 50

8.1 Generated CCP instances. . . 55

8.2 Correct answer rate distribution. . . 55

8.3 Submission times distribution. . . 56

8.4 Individual CCP instance results. . . 57

x

(17)

(18)

Chapter 1 Introduction

1.1 Backgrounds

At present, computers offer flexible functions and as a result, they have been broadly used to control, manage, support a variety of services, infrastructures, institutions, and systems in our societies. This flexibility comes from the programming codes that have been developed for computers.

Programming codes can realize a versatile functionality at a computer. The quality of programming codes effects the ability of the computer. As a result, the effectiveness of computer programming education has become very important issue for a lot of universities and professional schools. And the popularity of programming is increasing day by day, many information technology (IT) or its related departments offers high quality education of computer programming.

As a reliable and portable object-oriented programming language, Javahas been extensively used variety of practical systems. They involve Web application systems, mission critical systems for large enterprises, and small-sized embedded systems. Thus, the cultivation of Java programming engineers has been in high demands amongst industries. As well, a great number of universities and professional schools are oﬀering Java programming courses to meet these needs. A Java programming course usually combines grammar instructions by classroom lectures and programming exercises by computer operations. However, in programming exercises, a teacher can be overloaded in veriﬁcations of a lot of codes from students and in giving feedbacks with proper comments to them. If responses from the teacher becomes late, students may lose the learning motivations.

To help Java programming educations in schools, we have developed the Web-basedJava pro- gramming learning assistant system (JPLAS). JPLAS provides various types of exercise problems such that any student answer can be automatically marked in the system, to help self-studies of Java programming at various learning stages by students. The exercise problems include theelement ﬁll-in-blank problem (EFP), the value trace problem (VTP), the statement ﬁll-in-blank problem (SFP), and thecode writing problem (CWP)to support self-studies. It has been expected that this system be a great help for reducing the teacher loads and improving the learning motivations of the students.

JPLAS adopts theLinuxfor the operating system,Tomcatfor the Web application server,JSP/- Javafor the application programs withHTML, andMySQLfor the database for managing the data.

The user can access to JPLAS through a Web browser. JPLAS also oﬀers theoﬄine answering functionto allow students to answer the problems in JPLAS even if the Internet is unavailable [1].

To solve exercise problems in JPLAS using theoﬄine JPLAS, the problem assignment deliveries and answer submissions can be accomplished with USB memories.

(24)

Table 1.1: Student solution result of JPLAS.

problem # of assignments AVE SD

EFP 22 11.62 (53%) 5.68

VTP 19 8.04 (42%) 6.91

SFP 9 0.66 (7%) 0.14

CWP 1 0.02 (2%) 0.14

Among the exercise problems, the element ﬁll-in-blank problem (EFP)in JPLAS intends for a student to learn the Java grammar and basic programming through code reading. This problem asks a student to ﬁll the correct elements in the blank in a given Java code. The correctness of the answer is marked by comparing it with the original element in the code. Thus, the original element must be the unique correct answer for the blank. To help a teacher to generate the feasible EFP instance, thegraph-based blank element selection algorithmhas been proposed to select the proper blank elements that satisfy the unique correct answers [2].

In EFP, anelementis defined as the least unit of a code such as a reserved word, an identifier, and a control symbol. Areserved word is a fixed sequence of characters that has been defined in Java grammar to represent a specified function and should be mastered first by the students. An identifieris a sequence of characters defined in the code by the author to represent a variable, a class, or a method. A control symbol intends other grammar elements such as . (dot), : (colon), ; (semicolon), (, ) (bracket), {,} (curly bracket).

On the other hand, the code writing problem (CWP)in JPLAS intends for a student to learn writing a Java source code from scratch. This problem asks a student to write a source code that satisﬁes the speciﬁcations given in thetest code. In CWP, we adopted thetest-driven development (TDD) methodusingJUnit[3][4][5]. JUnitautomatically tests the answer code via thetest code on the server, to verify its correctness when submitted from a student. A student writes a source code through reading the given test code in JPLAS, and testing, modifying, and resubmitting the code if error occurs.

However, application results of the EFP instances to students have found that students can solve them mechanically without reading source codes carefully and understanding their behaviors correctly, although they were useful for novice students. The correct answer choice is usually limited for each blank, if it follows the grammar.

Table 1.1 shows the average (AVE) and the standard deviation (SD) of the number of correct answers in the four exercise problems in JPLAS by students in Okayama University. A big gap was observed in the solution performances between EFP and CWP. It indicates that EFP is not suﬃcient in improving the Java programming skill.

C programminghas been considered to be the most fundamental programming language, and has been educated at early stages of learning programming courses in a lot of schools. JPLAS provides exercise problems for only Java language. AsC programming is taught to students at their early stage of learning programming in a lot of schools, the works forJava programming should be extended toC programming.

2

(25)

1.2 Contributions

Motivated by the above-mentioned problems, in this thesis, I propose the ﬁve improvements or advancements of theelement ﬁll-in-blank problem (EFP)in JPLAS.

In the first contribution of the thesis, I propose thecode completion problem (CCP)[6] to over- come the drawback of the element fill-in-blank problem (EFP). In contrast to EFP, CCP does not show where the missing elements exist. Then, it will ask students to complete each statement by filling in the correct elements at the ideal positions. When the whole statement becomes equal to the original instring matching, it is regarded as the correct answer. For evaluations, I implemented the CCP instance generation program by modifying the program and the answering/marking interface for EFP in both online and offline JPLAS [7]. Then, I generated six CCP instances using highly readable source codes [8] that were obtained by applying thecoding rule check function[9]

to the original codes, and asked 20 university students to solve them with the answer interface. The results conﬁrmed that CCP is harder than EFP, and thetwo-level markingand the hint function are eﬀective in improving solution performances of students for CCP.

In the second contribution, I propose the two types of Java programming problems, namely, the element fixing problem (EXP)and thecode fixing problem (CXP)because the correction or fixing of incorrect or mistyped elements in a code is important for novice students in Java programming studies. To generate an EXP or CXP instance, the elements selected by theblank element selection algorithm[2] are replaced by the similar but incorrect ones that will be found by theerror injection algorithm. This algorithm will be extended from theerror generation algorithm[25]. In the answer interface of JPLAS, the input form is prepared for each incorrect element for EXP, whereas it is prepared for the whole statement containing an incorrect element for CXP. Thus, CXP is supposed to be harder than EXP.

In the third contribution, I propose the Code Amendment Problem (CAP) in JPLAS to offer feasible exercises on code debugging process. As a mixture of CCP and CXP instances, a CAP instance is generated by randomly selecting eitherblankorerrorfor each element that is selected by theblank element selection algorithm. For a generated CAP instance, thedifficulty levelcan be controlled by adjusting the probability of selecting blanks, which is specified as theblank proba- bility (BP). For evaluations, I generated 12 CAP instances using source codes in [11] and [12] with 30%, 50%, and 70% for BP, and assigned them to 21 students. The results show thatBP= 50%

oﬀers the highest diﬃculty level among them, and CAP is harder than CCP and CXP.

In the fourth contribution, I propose theelement fill-in-blank problem (EFP)as the first step of C Programming Learning Assistant System (CPLAS)by extending the works in JPLAS. To automatically generate an EFP instance for C, theblank element selection algorithmis newly designed and implemented forC programmingwhere the conditions for selecting blank elements are rede- fined. This algorithm consists of the four steps: 1) it analyzes the C source code using the original parser, 2) it generates aconstraint graphby selecting each candidate blank element in the code as avertexand connecting any pair of vertices by anedgesuch that their incident elements cannot be blanked simultaneously for unique correct answers, 3) it derives thecompatibility graphby taking the complement of the constraint graph, and 4) it seeks amaximal cliqueof the compatibility graph to find a maximal set of blank elements with unique answers. For evaluations, I verify the correctness of the algorithm through applications to various C source codes, generate EFP instances for C programs, and assign them to university students. The results confirm the effectiveness of EFP in detecting the students who have difficulty in studying C programmingand the hard topics for them.

In the last contribution, I study thecode completion problem (CCP)in CPLAS by extending

(26)

the works of CCP in JPLAS. For evaluations, I generated 10 CCP instances and assigned them to 54 undergraduate students. The results conﬁrmed the eﬀectiveness of CCP inC programming self-study by novice students. However, the CCP instances onpointerare not solved well by some students, which will be improved in future studies.

1.3 Contents of This Dissertation

The remaining part of this thesis is organized as follows.

In Chapter 2, I review the software architecture and two representative exercise problems in JPLAS.

In Chapter 3, I review the implemented algorithms for generating exercise problems in JPLAS.

In Chapter 4, I propose thecode completion problem (CCP)forJava programming.

In Chapter 5, I propose theelement/code ﬁxing problem (EXP/CXP)forJava programming.

In Chapter 6, I propose thecode amendment problem (CAP)forJava programming.

In Chapter 7, I propose theelement ﬁll-in-blank problem (EFP)forC programming.

In Chapter 8, I study thecode completion problem (CCP)forC programming.

In Chapter 9, I introduce related works in literature.

Finally, in Chapter 10, I conclude this thesis with some future works.

4

(27)

Chapter 2 Review of Java Programming Learning Assistant System

This chapter reviews outlines of the Web-based Java Programming Learning Assistant System (JPLAS)and two representative exercise problems in JPLAS.

2.1 Software Platform for Online JPLAS

Figure 2.1 illustrates the software platform for JPLAS. In JPLAS,Linuxis adopted for OS.Tomcat is used as the Web application server forJSP.JSPis a script language that can embed Java codes on HTML codes. Tomcatreturns the dynamically generated Web page to the client or Web browser.

MySQLis adopted as the database for managing the data in JPLAS.

JPLAS (JSP/Java)

Tomcat (Web server)

MySQL (Database) Linux (OS)

Figure 2.1: Software platform for JPLAS.

2.2 O ﬄ ine Answering Function in JPLAS

In addition to the online platform, students can solve the problem instances in the assignments using the oﬄine answering function on Web browsers.

2.2.1 Operation Flow

Figure 2.2 illustrates the operation ﬂow of the oﬄine answering function in JPLAS.

(28)

1. Problem ﬁles download: a teacher accesses to the JPLAS server, selects the problem in- stances for the assignment at the corresponding page, and downloads the required ﬁles into the own PC.

2. Assignment distribution: the teacher distributes the assignment ﬁles to the students by using a ﬁle server or USB memories.

3. Assignment answering: students receive the files, install them on their PCs, and answer the problem instances in the assignments using Web browsers on online or offline. For offline, the correctness of each answer is verified instantly at the browsers using the JavaScript program.

4. Answering result submission: students submit their ﬁnal answers to the teacher using the ﬁle server or USB memories.

5. Answering result upload: the teacher uploads the answers from the students to the JPLAS server to manage them.

JPLAS server Student PC

1) Assignments download

Teacher PC

2) Assignments distribution

3) Assignments answering 4) Answering

results submission 5) Answering

results upload

Online Offline

Figure 2.2: Operation ﬂow for oﬄine answering function.

2.2.2 Distributed Files

Table 2.1 shows the files to be distributed to the students for the offline answering function in JPLAS. These files are necessary for the problem view, the answer marking, and the answer storage.

2.2.3 Cheating Prevention

In the offline answering function, the correct answers need to be distributed to the students because the answers need to be marked on the offline browser. To prevent disclosing the correct answers to the students, their hash values bySHA256 [13] are actually distributed. In addition, to avoid generating the same hash values for the same correct answers, the assignment ID and the problem ID are concatenated with each correct answer before hashing. Then, the same correct answers for the different blanks are converted to the different hash values, which ensures the independence between the blanks.

6

(29)

Table 2.1: Files for distribution.

File name Outline

css CSS ﬁles for Web browser

index.html HTML file for Web browser page.html HTML file for correct answers jplas2015.js js file for reading the problem list distinction.js js file for checking the correctness of answer

jquery.js js file for use of jQuery sha256 js file for use of SHA256 storage.js js file for Web storage

2.3 Element Fill-in-blank Problem (EFP)

Theelement ﬁll-in-blank problem (EFP)in JPLAS intends for a student to learn the Java grammar and basic programming through code reading.

2.3.1 EFP Instance

In the EFP instance, a Java source code that has several blank elements is given to the students.

Then, they are asked to ﬁll in the blanks of the source code with the correct elements. The correctness of the answer is marked by comparing the student answer element with the original element in the code through string matching. Thus, the original element must be the unique answer for the blank. To help a teacher to generate the feasible EFP instance, theblank element selection algorithmhas been proposed [2]-[14]. Then, using this algorithm, a teacher can generate a new EFP instance by preparing a source code, which should be high quality for code reading.

2.3.2 Element Deﬁnition

An element in EFP is defined as the least unit of the source code, such as a reserved word, an identifier, and a control symbol. Areserved wordsignifies a fixed sequence of characters that has been defined in Java grammar to represent a specific function, which should be mastered first by the students. Anidentifieris a sequence of characters defined in the code by the author to represent a variable, a class, or a method. Acontrol symbolin this paper indicates other grammar elements such as ”.” (dot), ”:” (colon), ”;” (semicolon) , ”(, )”(bracket), ”{,}” (curly bracket).

2.3.3 Lexical Analyzer

For the blank element selection algorithm, the open source softwareJFlex [15] and jay [16] are adopted.JFlexis a lexical analyzer generator for a Java code, which is also coded by Java. It trans- forms a source code into a sequence of lexical units that represent the least meaningful elements to compose the code. It can classify each element in a code into either a reserved word, an identiﬁer, a symbol, or an immediate data.

For example, a statement int value = 123 + 456; is divided into int, value, =, 123, +, 456, and ;. Unfortunately, JFlex cannot identify an identiﬁer among a class, a method, or a

(30)

variable. Thus,jayis applied as well. Since,jayis a syntactic parsing program based on the LALR method, it can identify an identiﬁer.

2.4 Code Writing Problem (CWP)

The code writing problem (CWP) in JPLAS intends for a student to learn writing a source code from scratch.

2.4.1 CWP Instance

In the CWP instance, a student is asked to write the whole source code from scratch that satisfies the specifications described in the test code [3]. CWP in JPLAS is designed and implemented based on thetest-driven development (TDD) methodusing the open source frameworkJUnit[4]- [5]. By running the test code onJUnit, JPLAS automatically tests the correctness of the answer code that is submitted from a student. A teacher needs to prepare the test code to describe the specifications of a new CWP instance in JPLAS.

2.4.2 TDD Method

Thetest-driven development (TDD) methodis reviewed [4].

2.4.2.1 JUint

JUnit assists the automatic unit testing of a Java source code or a class. Since JUnit has been designed with the Java-user friendly style, including the use of the test code programming, is rather simple for Java programmers. InJUnit, one test can be performed by running one method in the library whose name starts with assert . For example, assertEquals method is used to compare the execution result of the source code with its expected value.

2.4.3 Test Code

A test code should be written by using libraries inJUnit. Figure 2.3 shows theMyMathclass source code that will be used to introduce how to write a test code. MyMathclass returns the summation of two integer arguments.

p u b l i c c l a s s Math {

p u b l i c i n t p l u s ( i n t a , i n t b ) { r e t u r n ( a + b ) ;

} }

Figure 2.3:MyMathclass source code.

Then, Figure 2.4 shows the test code to testplusmethod inMyMathclass.

8

(31)

i m p o r t s t a t i c o r g . j u n i t . A s s e r t .∗; i m p o r t o r g . j u n i t . T e s t ;

p u b l i c c l a s s MyMathTest {

@Test

p u b l i c v o i d t e s t P l u s ( ){

MyMath ma = new MyMath ( ) ; i n t r e s u l t = ma . p l u s ( 1 , 4 ) ; a s s e r t E q u a l s ( 5 , r e s u l t ) ; } }

Figure 2.4:MyMathclass test code.

The test code in Figure 2.4 importsJUnitpackages containing required test methods at lines 1 and 2, and declaresMyMathTestat line 3. @Testat line 4 indicates that the succeeding method represents the test method. Then, it describes the procedure for testing the output ofplusmethod.

This test is performed as follows:

1. An instancemaforMyMathclass is generated.

2. plusmethod for this instancema.plusis called with given arguments.

3. The resultresultis compared with its expected value usingassertEqualsmethod.

2.4.3.1 Features in TDD Method

In the TDD method, the following features can be observed.

1. The test code represents the speciﬁcations of the source code, because it must describe every function which will be tested in the source code.

2. The testing process of a source code becomes eﬃcient, because each function can be tested individually.

3. The refactoring process of a source code becomes easy, because the modiﬁed code can be tested instantly.

2.5 Summary

This chapter reviewsJava programming Learning Assistant System (JPLAS)and its two exercise problems. The software architecture of the JPLAS implementation, the oﬄine answering function in JPLAS, theelement ﬁll-in-blank problemand thecode writing problemare discussed.

(32)

Chapter 3 Review of Implemented Algorithms

This chapter reviews the four algorithms that will be used in the studies of this thesis. These algorithms have been proposed in previous studies to automatically generate various types of exercise problems for programming self-studies in JPLAS.

3.1 Review of Blank Element Selection Algorithm

Theblank element selection algorithmselects a maximal number of feasible blank elements from a given source code using a graph theory [2].

The first part of the algorithm is the generation of thecompatibility graphby selecting a candidate element for blank from the source code as avertexand then, by connecting a pair of vertices with anedgeif they can be blanked together. To fulfill this purpose, the conditions that a pair of elements cannot be blanked simultaneously have been defined.

The second part of the algorithm is the extraction of amaximal clique[17] of the compatibility graph, in order to locate a maximal set of feasible blank elements. Empirically, it has been observed that EFP becomes more difficult as a larger number of elements are blanked [2]. By blanking a subset of the selected elements, a variety of fill-in-blank problems can be generated with different levels.

The details of the algorithm procedure are described as follows:

1. Vertex generation for constraint graph: each possible element for being blank is selected from the source code and is regarded as avertexof aconstraint graph.

2. Edge generation for constraint graph: an edge is generated between any pair of two vertices or elements that should not be blanked at the same time to satisfy the uniqueness.

3. Compatibility graph generation: by taking the complement of the constraint graph, the compatibility graph is generated to represent the pairs of elements that can be blanked simultaneously.

4. Clique extraction: a maximal clique of the compatibility graph is generated by a simple greedy algorithm to detect the maximal number of blank elements with unique answers from the given Java code. This greedy algorithm repeats to: 1) select the vertex that has thelargest degree in the compatibility graph for the clique, 2) remove this vertex and its non-adjacent vertices of the graph, until the graph becomes null.

10

(33)

5. Fill-in-blank problem generation: the ratio between the number of blanks for control sym- bols and that for other elements is controlled.

3.2 Review of Coding Rule Check Algorithm

The coding rules [9] represent a set of the rules or conventions for producing source codes of high quality. By following the coding rules, the uniformity of the code will be maintained, which enhances the readability, maintainability, and scalability. Thecoding rulesconsist ofnaming rules, coding styles, and potential problems. The coding rule check algorithm automatically veriﬁes whether the source code follows thecoding rules, and suggests the code parts that do not follow them if available.

1. Naming Rules : naming rules describe the rules for identifying the naming errors in the source code. Here, theCamel case[18] is adopted as the common Java naming rule. For an identiﬁer representing a variable, a method, or a method argument, the top character should be a lower case, where the delimiter character between two words should be an upper case.

For an identifier representing a class, both of them should be an upper case. For an identifier representing a constant, any character should be an upper case. An English word should be used as an identifier name, whereas Japanese or Roman Japanese should not be used.

2. Coding Styles : coding stylesindicate the rules for detecting the layout errors in the source code. They include the position of an indent or a bracket, and the existence of a blank space.

By following the coding styles, the layout of a source code will become more consistent and readable.

3. Potential Problems : potential problems illustrate the rules for discovering the portions of the source code that can pass the compilation but may induce functional errors or bugs with high possibility. They include adead codeandoverlapping codes. Adead coderepresents the portion of the source code that is not executed at all, and overlapping codes signify the multiple portions in the source code that have similar structure and functions to each other. By solving potential problems, the code can not only improve the maintainability and scalability but speed up the execution.

3.3 Review of Error Injection Algorithm

Theerror injection algorithminjects artificial programming errors into the source code by changing elements with similar but incorrect ones. To be specific, this algorithm injects the source code with errors by changing the access modifier of a class or a method, the data type of a method or a variable, and the name of a class, a method, or a variable. Besides, it injects errors into the behavior of the source code by changing the operator, the constant number, the constant variable name in an equation or a conditional expression, or Java keywords defined in thekeyword list. The procedure of the algorithm is as below:

1. Access modiﬁer

An access modiﬁer,public,protected, andprivate, of a class or a method is randomly changed to another one among them.

(34)

2. Data type

For the data type of a method or a variable, any ofbyte,short, int, andlongis changed to double or float randomly, and vice versa. Besides, String is changed to char, and vice versa. Here,voidis not changed.

3. Loop

For the loop statement, any offorandwhileis changed toif.

4. Control

For the control statement,ifis changed towhile, andswitchis changed todo.

5. Java keywords in list

The keywords in the keyword list are changed as follows:

• importis changed toimplements, and vice versa.

• extendsis changed toinstanceof, and vice versa.

• Scanneris changed toSystem, and vice versa.

• staticis changed tosuper, and vice versa.

• continueis changed tobreak, and vice versa.

6. Behavior

An operator, a constant number, or a constant name in an equation or a conditional expression is changed:

• an arithmetic operator, such as+,*,-,or/, is randomly changed to another one.

• a conditional operator, such as>,<,&&,==, or!=, is randomly changed to another one.

• a constant number is randomly changed to the similar number.

• a constant name is changed to another name by applying the error name generation method.

7. Name

A name of a class, a method, or a variable will be switched to another name by applying the error name generation algorithmin the next subsection.

3.4 Review of Error Name Generation Algorithm

Theerror name generation algorithmgenerates the similar name to the original name, which may cause the confusion to the students. The following procedure illustrates the procedure.

1. Error name generation using dictionary

A set of candidates for error names that have similar meanings as the original name, are extracted from the dictionary, WordNet, using the similar word estimating function. Then, one candidate will be randomly selected from this set for the error name.

2. Error name generation using word list

If a proper error name is not found by 1), the word whose spelling is most similar to the original name among the word in theword listis selected for the error name. Theword list needs to be prepared by the user of the algorithm.

12

(35)

3. Error name random generation

If a proper error name is still not found in the second step, the error name will be generated by randomly adding or removing one character, or changing to another character, in the original name.

It is noted that a combined name usingCamel caseis ﬁrst divided into a set of individual names, and then, the above procedure is applied to each individual name. Subsequently, the individual generated error names will be combined into one name.

3.5 Summary

This chapter reviews the four algorithms that will be used in the studies of this thesis to generate new exercise problems in JPLAS. They include the blank element selection algorithm, the coding rule check algorithm, the error injection algorithm and the error name generation algorithm.

(36)

Chapter 4 Code Completion Problem for Java Programming

This chapter presents the code completion problem (CCP) in Java programming learning assistant system (JPLAS).

4.1 Overview of Code Completion Problem

In the code completion problem (CCP), a source code with several missing elements is shown to the students without specifying their existences. Then, a student needs to locate the missing elements in the code and ﬁll in the correct ones there. The correctness of the answer from a student is veriﬁed by applyingstring matchingto each statement in the answer to the corresponding original statement in the code. Only if the whole statement is matched, the answer for the statement will become correct. Furthermore, merely one incorrect element will result in the incorrect answer.

4.2 Generation of Code Completion Problem

In this section, the generation of the new code completion problem (CCP) is presented.

4.2.1 Generation Procedure Overview

An instance of CCP can be generated through the following ﬁve steps:

1. Select a source code from a Website or a textbook that is worth of reading to study the current topic.

2. Apply the coding rule check function to the source code and ﬁx the errors in the code if found.

3. Register each statement in the source code as the correct answer unit for string matching.

4. Apply the blank element selection algorithm to select the blank elements from the source code.

5. Remove the selected blank elements from the source code to generate theproblem codein a new CCP instance.

14

(37)

By using the Java programs and the Bash script, this procedure can be executed automatically.

4.2.2 Source Code Selection

As a sample code, the classFibonacciCalculatoris selected here. This class generatesFibonacci series, 0,1,1,2,3,5,8,13,21,…, recursively, such that each subsequent number becomes the sum of the previous two numbers [11]. There are two base cases: Fibonacci(0) and Fibonacci(1).

4.2.3 Application of Coding Rule Check Function

The coding rule check function is applied to the source code for classFibonacciCalculator. Fig- ure 4.1 shows the corrected source code after the application.

i m p o r t j a v a . math . B i g I n t e g e r ; /∗ ∗

∗ F i b o n a c c i C a l c u l a t o r

∗ @ a u t h o r s t u d e n t

∗/

p u b l i c c l a s s F i b o n a c c i C a l c u l a t o r {

p r i v a t e s t a t i c f i n a l B i g I n t e g e r TWO = B i g I n t e g e r . v a l u e O f ( 2 ) ; /∗ ∗

∗ r e c u r s i v e f i b o n a c c i method

∗ @param number : t o c a l c u l a t e f i b o n a c c i

∗ @ r e t u r n B i g I n t e g e r : r e t u r n s t h e f i b o n a c c i r e s u l t

∗/

p u b l i c s t a t i c B i g I n t e g e r c a l c u l a t e F i b o n a c c i ( B i g I n t e g e r number ) {

i f ( number . e q u a l s ( B i g I n t e g e r . ZERO ) | | number . e q u a l s ( B i g I n t e g e r . ONE ) ) r e t u r n number ;

e l s e

r e t u r n c a l c u l a t e F i b o n a c c i ( number . s u b t r a c t ( B i g I n t e g e r . ONE ) ) . a dd ( c a l c u l a t e F i b o n a c c i ( number . s u b t r a c t (TWO ) ) ) ;

}/∗ ∗

∗ d i s p l a y s t h e f i b o n a c c i v a l u e s f r o m 0−40

∗ @param a r g s u s e d

∗ @ r e t u r n N o t h i n g

∗/

p u b l i c s t a t i c v o i d main ( f i n a l S t r i n g [ ] a r g s ) {

f o r ( i n t c o u n t e r = 0 ; c o u n t e r <= 4 0 ; c o u n t e r++)

S y s t e m . o u t . p r i n t l n ( ” F i b o n a c c i o f ” + c o u n t e r + ” i s : ” + c a l c u l a t e F i b o n a c c i ( B i g I n t e g e r . v a l u e O f ( c o u n t e r ) ) ) ; } }

Figure 4.1:FibonacciCalculatorclass source code.

(38)

4.2.4 Application of Blank Element Selection and Removal

Then, theblank element selection algorithmis applied to the corrected source code to select the blank elements. Finally, the selected blank elements are removed from the code as shown in Figure 4.2.

i m p o r t j a v a . math . B i g I n t e g e r ; /∗ ∗

∗ F i b o n a c c i C a l c u l a t o r

∗ @ a u t h o r s t u d e n t

∗/

p u b l i c F i b o n a c c i C a l c u l a t o r {

p r i v a t e B i g I n t e g e r TWO = B i g I n t e g e r . v a l u e O f ( 2 ) ; /∗ ∗

∗ r e c u r s i v e f i b o n a c c i method

∗ @param number : t o c a l c u l a t e f i b o n a c c i

∗ @ r e t u r n B i g I n t e g e r : r e t u r n s t h e f i b o n a c c i r e s u l t

∗/

p u b l i c B i g I n t e g e r c a l c u l a t e F i b o n a c c i ( B i g I n t e g e r number ) { i f ( number . e q u a l s ( B i g I n t e g e r . ZERO ) | | . e q u a l s ( . ONE ) )

r e t u r n number ;

c a l c u l a t e F i b o n a c c i ( n u m b e r s u b t r a c t ( B i g I n t e g e r . ONE ) ) . a dd ( c a l c u l a t e F i b o n a c c i ( number . s u b t r a c t ( ) ) ) ;

}/∗ ∗

∗ d i s p l a y s t h e f i b o n a c c i v a l u e s f r o m 0−40

∗ @param a r g s u s e d

∗ @ r e t u r n N o t h i n g

∗/

p u b l i c v o i d main ( f i n a l [ ] a r g s ) {

( i n t c o u n t e r = 0 ; c o u n t e r <= 4 0 ;++) . o u t . ( ” F i b o n a c c i o f ” + + ” i s : ” + ( B i g I n t e g e r . v a l u e O f ( c o u n t e r ) ) ) ; } }

Figure 4.2:FibonacciCalculatorclass source code after blank elements selection and removal.

4.3 Solution of Code Completion Problem

In this section, the solution of the generated code completion problem is presented.

16

(39)

4.3.1 Problem Solution by Student

A student can solve CCP instances using the answering interface on a Web browser either oﬄine or online, according to the availability of the Internet connection. In the interface, each input form corresponds to one statement in the source code. Then, the student needs to complete each statement by ﬁlling in all the missing elements at each input form.

4.3.2 Two-Level Answer Marking

The answer is marked throughstring matchingof the whole statement on the JPLAS server using the Java program for online, or at the student browser using the JavaScript program for oﬄine. The statement in the student answer and the corresponding statement in the original code are compared.

This marking is executed in two levels in our implementation. The first level marking is to compare the statements after removing thespacesandtabsfrom them [19]. In Java grammar, any number of spaces or tabs can be inserted in the statement. Besides, it is difficult to distinguish between multiple spaces and one tab. To avoid confusions of novice students, the first level marking does not consider the spaces and tabs in string matching.

However, to encourage a student to be aware ofreadable codewhich follows the coding rules, the locations of tabs and spaces are important. Thus, thesecond level markingis to compare the statements including the spaces and tabs. If the answer code contains a missing tab or space, or extra one, the warning feedback will be returned to the student in the answering interface.

4.4 Implementation of Code Completion Problem in O ﬄ ine JPLAS

In this section, I present the implementation of code completion problem in oﬄine JPLAS.

4.4.1 Problem Answer Interface

Figure 4.4 illustrates the answering interface for CCP on a Web browser in oﬄine JPLAS.

Problem Code shows theproblem codeof the CCP instance. Answer Code shows the answer forms where each line corresponds to one statement in the problem code that may involve missing elements. Then, a student is requested to complete every statement by ﬁlling in all the missing elements with each form.

With the default browser function, tab cannot be input to insert a tab in the statement, since it is used to move to another input form. However, the second-level marking for CCP checks the correctness of the tabs. Thus, in our implementation, the input of tab is made possible in the answering interface by the adoptingJavaScriptfunction shown in Figure 4.3 [19]:

4.4.2 Answer Marking

When a student clicks the answer button in the interface, the two-level marking is applied.

4.4.2.1 First-level Marking

First, the ﬁrst-level marking applies to the answer. Here, after any space or tab is removed from the answer code, each statement is compared with the correct one without a space or tab. If they are diﬀerent, the corresponding input form of the statement is highlighted bypinkto suggest that

(40)

at least one character in the answer statement is diﬀerent from the correct one, and the marking is aborted. Otherwise, the following second-level marking is applied.

4.4.2.2 Second-level Marking

In the second-level marking, the whole statement, including spaces and tabs, will be compared between the answer code and the original code. If they are diﬀerent, it is highlighted in yellow.

Otherwise, the form is not highlighted at all.

f u n c t i o n e n a b l e T a b ( ){

v a r t e x t a r e a s = d o c u m e n t . g e t E l e m e n t s B y N a m e ( ’ i n t e x t ’ ) ; v a r c o u n t = t e x t a r e a s . l e n g t h ;

f o r ( v a r i = 0 ; i < c o u n t ; i++) {

t e x t a r e a s [ i ] . onkeydown = f u n c t i o n ( e ) { i f ( e . keyCode == 9 | | e . w h i c h == 9 ) {

e . p r e v e n t D e f a u l t ( ) ;

v a r s = t h i s . s e l e c t i o n S t a r t ;

t h i s . v a l u e = t h i s . v a l u e . s u b s t r i n g ( 0 , t h i s . s e l e c t i o n S t a r t ) + ”

” + t h i s . v a l u e . s u b s t r i n g ( t h i s . s e l e c t i o n E n d ) ; t h i s . s e l e c t i o n E n d = s+1 ; e . p r e v e n t D e f a u l t ( ) ;

r e t u r n f a l s e ; } }

} }

Figure 4.3:JavaScriptfunction to allow tab in web browser.

4.4.3 Hint Function

To help a student who submits the answer for a plenty of times and still cannot solve a CCP instance, the hint function will be implemented. This function can be initiated after a certain number of incorrect submissions and be used by requesting the incorrect statement number [20].

The bottom two rows in Figure 4.4 show the input and output of the hint function. Here, the student inputs 3 as the incorrect statement number to request the hint. Then, the hints on the corresponding statement appear, such that the location of the missing element,int, is highlighted withblue, and the mistyped element,tmp, is highlighted withpurple. It is noted thattmpshould betemp. By referring to the highlighted elements, it is expected that the student can solve them.

4.4.4 Answer Result Submission

All the answering results of a student are kept in the browser usinglocalStorageforWeb Storage [21]. It can store up to 5MB data and keep it even after the shutdown of the PC. With this tool,

18

(41)

students are allowed to keep an eye on their progress of studies for the assignments.

When a student submits the results, all the answering results in the Web storage are written into a text ﬁle. Then, the student submits this text ﬁle by using a USB memory or an E-mail to the teacher. To prevent the student from falsifying or plagiarizing the results of other students at submissions, the message authentication technique is adopted. Using SHA256 [13], the hash value of the answering result with the student ID is calculated before submission. Then, the coincidence between this hash value and that of the submitted data is evaluated.

Figure 4.4: Interface of problem answering in oﬄine JPLAS.

(42)

4.5 Implementation of Code Completion Problem in Online JPLAS

In this section, I present the implementation of the code completion problem for online JPLAS.

4.5.1 Problem Data in Database

In online JPLAS, each instance in any problem type is managed by a single text ﬁeld in the corresponding table in theMySQLdatabase. This problem data ﬁeld contains the markers shown in Table 4.1 to manage various data that are necessary for each problem type.

Table 4.1: Markers in problem data ﬁeld.

Marker description

//@JPLAS answer following paragraph contains correct answers

//@JPLAS statement following paragraph contains problem statement

//@JPLAS output following paragraph contains html format data for interface

null problem code

4.5.2 Problem Answer Interface

The answer interface for oﬄine JPLAS including thehint functionis used for online JPLAS. Thus, a student can use both the oﬄine and online JPLAS easily. The submitted answer is stored in the database directly.

4.5.3 Answer Marking

The two-level marking for offline JPLAS is also applied for online JPLAS. The implementation in the server adopts theresponsibility chaindesign pattern. Here, first, thestring matching testis performed at the two-level marking for CCP. Then, if this test is passed, thecompiling testwill be automatically initiated to examine the correctness of the answer code, because the first-level test excludes the spaces and tabs from it.

4.6 First-Step Evaluation

In the first-step evaluation, I excluded thefirst-level markingand thehint functionfrom the answer interface for CCP in both offline/online JPLAS, because they were not implemented for EFP.

4.6.1 Problem Instances

To compare the solution performances by students between CCP and EFP, we generated six pairs of CCP and EFP instances such that the two instances in each pair appear to have the similar level of diﬃculty. Table 4.2 shows their overviews. The generated instances are divided into two groups by selecting one instance from each of the six pairs, so that each group consists of three CCP and EFP instances, respectively.

20

(43)

Table 4.2: Instances for ﬁrst-step evaluation.

pair grammar LOC #of missing elements

ID topic CCP EFP CCP EFP

1 variable 23 14 15 16

2 array 17 30 18 19

3 collection 30 17 19 18

4 recursive 14 23 16 15

5 method overloading 20 25 24 6

6 polymorphism 25 20 6 24

4.6.2 Solution Results by Students

Then, we asked 20 university students from Myanmar, Japan, China, Indonesia, and Kenya who have studied Java programming for more than one year to solve the generated instances in either group. That is, each student is randomly assigned one group, such that the equal number of students will solve one group.

Table 4.3 demonstrates the average and the standard deviation (SD) of the correct solution rates (%) per student for CCP and EFP. Clearly, CCP exhibits the worse result than EFP, although the original source codes have similar difficulties and the same number of blanks was generated from the same source code. This result has confirmed that CCP is more difficult than EFP, which requires more careful code reading.

Table 4.3: Summary of correct solution rates for ﬁrst-step evaluation.

correct rate (%) CCP without improvements EFP

ave. 82.46 93.92

SD 16.95 9.36

4.6.3 T-test Veriﬁcation

To confirm the above-mentioned result statistically,T-testwas applied here, which can determine if the two sets of data are significantly different from one another. T-testis one of the statistical tests used for hypothesis testing.

Thenull hypothesisis assumed to show no diﬀerence between the solution results for CCP and those for EFP. Then, it is decided to accept or reject this null hypothesis. According to [22], there are two approaches to determine whether thenull hypothesisis accepted or rejected. In thecritical value approach, iftest statisticsis greater thancritical value, thenull hypothesiswill be rejected in favor of the alternative hypothesis. In theP-value approach, ifP-valueis less than or equal to Alpha-level, thenull hypothesisis rejected. Here, I adopt both approaches.

Table 4.4 indicates the T-test result. In Table 4.4, test statistics is greater than test critical, which means the rejection of the null hypothesis. Also,P-valueis smaller thanAlpha-level, which means the rejection of it. Therefore, it is concluded that the solution result statistics of students are significantly different between the two problems, and the code completion problem is more challenging than the element fill-in-blank problem.