Requirements in Preparing and Using Foreign Language Tests
                          By miftachudin arjuna - March 27, 2015
I. Preface  
It is commonly felt that the word “testing” frightens both students and teachers. Students fear being tested because of the possibility of failure. Teachers fear statistics; they think they will be caught in a statistical trap. Teachers also fear that they might be unable to make fair judgments and prepare appropriate tests in the available time. 
One crucial way to reduce these fears is for teachers to set the goals of teaching and describe them to the students at the very beginning of the teaching and learning activity.  
II. Analysis 
There are four requirements to be considered when developing and judging formal foreign language tests. 
1. Validity 
Test validity is the most crucial factor in foreign language testing. A test is valid when it effectively measures what it is intended to measure, whether that is achievement, proficiency, or aptitude in the language. A test may be designed to measure integrative abilities or discrete items within the subsystems of a language. If, for example, a test is designed to measure reading comprehension, it must do exactly this and not attempt to measure another skill such as aural-oral comprehension. If a test is intended to measure a student’s speaking skill, it is valid only if it tests his or her speaking ability. 
To achieve validity, the test components and items must be closely related to what is being tested. Two questions must be considered when determining the validity of a foreign language test: what aspects of the language is the test designed to measure, and how well does it measure them? 
The validity of a language test is determined by comparing the results it gives with some outside or independent criterion. Four validity concepts are discussed here: 
a. Content validity 
Content validity is examined by checking all items in the test to make certain that they correspond to the instructional objectives of the course, whether these concern discrete or integrative language skills. 
b. Concurrent validity 
A procedure for examining concurrent validity might follow this pattern: a new language test is administered to students in the course, and a score is recorded for each student. These scores are then compared with an outside or independent criterion for the same students. 
c. Predictive validity 
Predictive validity is used extensively in the validation of language aptitude tests. It is determined by (a) administering the test to a group of prospective students, (b) following their progress in their language courses, (c) obtaining data about each of them, then (d) computing the correlation coefficient between the aptitude test scores and the criterion grades or data. 
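Step (d) above can be illustrated with a short sketch. It assumes that the correlation coefficient meant here is the Pearson coefficient, which the text does not state explicitly. The function name and the sample scores are hypothetical and serve only as an illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: aptitude test scores and later course grades
# for the same six students, listed in the same order.
aptitude_scores = [52, 61, 70, 45, 88, 67]
course_grades = [60, 65, 78, 50, 90, 70]

r = pearson_r(aptitude_scores, course_grades)
print(f"Correlation between aptitude scores and grades: {r:.2f}")
```

A coefficient close to 1 would suggest that the aptitude test predicts later performance well, while a coefficient close to 0 would suggest little predictive value.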
d. Construct validity 
Construct validity is less important in achievement tests, but it should be used extensively in validating aptitude and proficiency tests.  
2. Reliability 
The reliability, or consistency, of a test is the degree to which the test can be trusted to produce the same results when it is administered to the same individual at different times. It differs from validity in that it concerns the dependability of a test rather than whether it measures what it is intended to measure. Thus, to be reliable, a test must give consistent results over a period of time. If a test gives almost identical results or scores each time it is administered to the same group, we can say that the test is highly reliable. When it gives significantly different results, the test is unreliable. 
Several external factors affect the reliability of a test: variations in testing conditions (lighting, temperature, noise, and other distractions); test compromise (i.e., the students knew the questions or answers beforehand); inaccuracy in scoring; inadequate sampling of test items; lack of motivation; and fatigue or illness. These factors decrease the reliability of the test. 
On the other hand, the following measures can improve the reliability of a test: keeping the test conditions constant; using the same procedure each time the test is administered; increasing the number of test items; sampling the test items adequately; and reducing subjective scoring. 
A test which is reliable is not necessarily valid. A test intended to measure native English speakers’ reading comprehension of Indonesian may give the same results when given to native Indonesians, but it will show low validity when used to measure the writing ability of English speakers. A language test may thus have high reliability and low validity at the same time. 
Increasing the length of a test will increase its reliability. For a multiple-choice test, the number of choices will influence its reliability. For a better examination of reliability, two methods should be used, one being a check against the other.  
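The effect of test length on reliability can be estimated with the Spearman-Brown formula. This is a standard psychometric formula and is not named in the text above. The sketch below assumes that the added items are similar in quality and difficulty to the existing ones, and the function name is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Estimate reliability after changing test length.

    reliability   -- reliability of the current test, between 0 and 1
    length_factor -- new length divided by current length (2 = twice as long)
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a test with reliability 0.60 is doubled in length.
print(spearman_brown(0.60, 2))
```

The same function can be used with a factor below 1 to estimate how much reliability might be lost when a test is shortened.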
3. Comprehensiveness 
A test is comprehensive when it contains items from all aspects of the material to be tested. An achievement test, for example, should cover all the objectives which should have been achieved by the time of testing. It is recommended that the number of items be in proportion to the importance of the language area being tested, or to the proportion of the students’ problem areas. For a phase test or end-of-semester test, approximately sixty questions can be considered the minimum number required to provide valid, reliable scores. For a proficiency test used for screening purposes or for obtaining a comprehensive measure of the examinee’s mastery of the language, the number of questions should be about twice as great, approximately 100 to 120 items. 
Experiments have shown that when a test is too long, problems of administration, economy, and fatigue destroy the test’s utility. On the other hand, a test which is too short raises problems of compromise, doubtful reliability, and lack of comprehensiveness. 
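The recommendation to keep the number of items in proportion to importance can be sketched as a simple allocation. The area names and weights below are hypothetical, and the rounding rule (largest remainder) is only one reasonable choice, not something prescribed by the text.

```python
def allocate_items(weights, total_items):
    """Split total_items across areas in proportion to their weights."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must not be negative")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    # Exact (possibly fractional) share of items for each area.
    exact = {area: total_items * w / total_weight for area, w in weights.items()}
    # Start from the whole-number part of each share.
    counts = {area: int(share) for area, share in exact.items()}
    # Hand out the remaining items to the areas with the largest fractions.
    leftover = total_items - sum(counts.values())
    by_remainder = sorted(exact, key=lambda a: exact[a] - counts[a], reverse=True)
    for area in by_remainder[:leftover]:
        counts[area] += 1
    return counts


# Hypothetical weights for a sixty-item end-of-semester test.
weights = {"grammar": 3, "vocabulary": 2, "reading": 4, "listening": 1}
plan = allocate_items(weights, 60)
print(plan)
```

The weights could equally be derived from the students' problem areas rather than from the importance of each area, as the paragraph above suggests.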
In preparing a test, the examiner should make sure that the test items are representative, comprehensive, and related to the objectives of the course or to the purpose for which the test is intended.  
4. Practicality 
Practicality is also an important factor in the design of a test, in addition to validity, reliability, and comprehensiveness. It is especially important when the test is used in large quantities for placement, selection, certification, or classroom use. The criteria for practicality are based on such factors as economy, scorability, and administrability.  
III. Conclusion 
The fears of students and teachers will be removed when they are open to a wider context of testing. The reason for making goals clear to the students is that all teaching and learning activities are directed to and focused on those goals. 
To make fair judgments and prepare appropriate tests, teachers should consider four major qualities when developing a test: validity, reliability, comprehensiveness, and practicality. 
A test is valid when it effectively measures what it is intended to measure, whether that is achievement, proficiency, or aptitude in the language. A test may be designed to measure integrative abilities or discrete items within the subsystems of a language. To achieve validity, the test components and items must be closely related to what is being tested. To be reliable, a test must give consistent results over a period of time. 
A test is comprehensive when it contains items from all aspects of the material to be tested. An achievement test, for example, should cover all the objectives which should have been achieved by the time of testing. It is recommended that the number of items be in proportion to the importance of the language area being tested, or to the proportion of the students’ problem areas. 
In preparing a test, the examiner should make sure that the test items are representative, comprehensive, and related to the objectives of the course or to the purpose for which the test is intended.

