Language Assessment (Showa, Spring 2022)

Welcome to Language Assessment, which is, as the name suggests, a course about advanced assessment of foreign language learners.

In order to examine the basic principles of test construction and testing procedures, this course will examine topics such as measurement constructs and models of language knowledge, test reliability, the design of tests and assessments, item and task construction, scoring and rating tests and assessments, the training of raters, and issues of fairness and standards. There is a special focus on the area of validity, and different perspectives on validity are also introduced, including the use of arguments and evidence in support of test validation. Students will participate in group discussions, take a midterm and final examination, and produce a course validation project in which they (a) experience the process of conceptualizing the theoretical bases of an assessment instrument, (b) produce an instrument designed to measure a particular language skill (e.g., reading, grammar) or affective variable (e.g., motivation, self-efficacy), (c) gather and analyze data using that instrument, and (d) write a report on the strengths and weaknesses of the instrument. Finally, there will also be a short final paper.

If you would like (for some inane or perhaps insane reason), here is the course syllabus for your reading pleasure. Of course, I reserve the right to amend it, so please treat this as a guideline.

Those of you who have taken courses from me will recall that we have had no textbooks, and in the present course we will also have no textbook. Class material will be available on both Dropbox and Google Drive, so feel free to download, save, print, or otherwise use it. I seldom make paper handouts, but if you prefer paper, please take care of printing files yourself.

You will find course requirements near the bottom of this page.

Hereafter you will find a reasonably detailed synopsis of the various class meetings ("sessions") that we will enjoy this term.

Sure, you were about to ask, right? That, Good People, is a sextant, a very necessary tool for navigation in the days of sailing ships (i.e., in the pre-GPS age). There is, in a most appropriate nod to our course, a strong mathematical background to this beautiful tool.

Before Session #1 on April 12, 2022 — Flipped Preparation

Good day, everyone, and welcome! I thank you for choosing to take this course, which I sincerely hope you'll find beneficial. As you'll see below, I am asking you to prepare the following homework for next week's class. We will use this format throughout the course: any homework assigned is for the following class.

Homework:

✔ Be prepared to describe your own research or research plan.
✔ Kane & Wools (2020, pp. 17-25); please read this and be prepared to discuss (a) the major points, (b) issues you found interesting or perhaps difficult, and (c) ___________.

Session #1 (April 12, 2022) — Introduction

As often happens on the first day of class, we'll be speaking in somewhat general terms about our course and some of the topics therein. First, note that this course includes all four skills that appear in academic work: speaking, listening, reading, and writing. There are, of course, certain requirements for this course, as is true for every course; please scroll down to the specific course requirements at the bottom of this page.

Homework:

✔ Title page example
✔ Fulcher & Davidson, Unit A1, Introducing validity (pp. 3-22)
✔ Watch this explanation of Bloom's taxonomy, courtesy of the U of Illinois' Center for Innovation in Teaching & Learning.
✔ Journal review homework

Session #2 (April 19) — Validity

Validity in assessment is a deceptively simple concept that refers to the degree to which a test measures what it is intended to measure. Moreover, because tests do not exist in isolation but are put to some use (e.g., streaming or grading), validity includes the purpose for which test results are used.

Homework:

✔ Fulcher & Davidson, Unit A2, Classroom assessment (pp. 23-35)
✔ Fulcher & Davidson, Unit B1, Construct validity (pp. 181-191); prepare Task B1-2 (pp. 182-183)
✔ (optional) Moss (2003) Reconceptualizing validity for classroom assessment [This is the article upon which Unit B1 is based; this is just for reference.]
✔ An excellent page by Pritha Bhandari about construct validity

Session #3 (April 26) — Classroom Assessment; Construct Validity

Today's class will consist of a discussion of construct validity. As you will have read on Pritha Bhandari's Scribbr page, this is ...

In my years ...

Class Material:

✔ Fulcher & Davidson, Unit B2, Pedagogic assessment (pp. 192-202)
✔ Beglar (2013) A broad conceptualization of extensive reading

Note: We will have no class next week on May 3. Please enjoy your Golden Week!

Session #4 (May 10) — Pedagogic Assessment

Today we will ...

Let's continue with a look at ...

Class Material:

✔ Fulcher & Davidson, Unit C2, Assessment in school systems (pp. 298-303)
✔ Nunes (2004) Portfolios in the ESL classroom: Disclosing an informed practice
✔ Evaluating Alignment in Large-Scale Standards-Based Assessment
✔ Computer-Based Assessments and Technology from CCSSO

Session #5 (May 17) — Assessment in School Systems

This final discussion of classroom assessment focuses on the necessity of balancing external guidelines with assessment conducted in the classroom. As we noted in earlier discussions, the variability in teachers, instruction, and pedagogy poses considerable challenges to creating and conducting assessments that are valid.

As noted on page 303 of the C2 Unit of Fulcher and Davidson, the webpage of the Council of Chief State School Officers (CCSSO) includes a host of information on assessment with the express goal of facilitating coordination between states and the federal government in the United States.

Next week we will be talking about some of the theoretical models of communicative competence and language ability that have been advanced over the last few decades. Please become familiar with the following.

Models:

✔ Canale & Swain (1980)
✔ Canale (1983, 1984)
✔ Bachman (1990) Communicative Language Ability
✔ Bachman & Palmer (1996) Refinement of the CLA Model
✔ Celce-Murcia, Dornyei, & Thurrell (1995) Communicative Competence
✔ Markee (2000) Interactional Competence

Class Material for next week:

✔ Fulcher & Davidson, Unit A3, Constructs and Models (pp. 36-51)
✔ Schaefer (2008 or 2012)
✔ (optional) Piggin (2012) What are our tools really made out of?

Session #6 (May 24) — Constructs and Models

In this discussion we will be examining the triad of theoretical models, frameworks for assessment, and test specifications. These narrow the focus from a grand, overarching theory down to the 'nitty-gritty' that can be actually utilized for constructing test items.

On page 46 we find the interesting assertion that a checklist based on Bachman and Palmer's model can inform the development of test items.

Class Material:

✔ Fulcher & Davidson, Unit A4 Test specifications and designs (pp. 52-61)
✔ Brown & Abeywickrama, Chapter 7, Listening (pp. 156-182)
✔ Entrance exam test booklet
✔ Entrance exam comments [to be added]
✔ Rust (2002) Purposes and principles of assessment

Note: We will have no class next week on May 31.

Class Material:

✔ Rasch.org
✔ jMetrik
✔ Data set 1
✔ Data set 2
✔ Excel Spreadsheets for CTT Analysis (The actual files are in our class readings folder.)

Session #7 (June 7) — Test Specifications and Designs

This chapter deals with the specs and actual creation of test items. As noted, test items can follow existing formats or can be original, but there should be some set of guidelines upon which to base the items.

The Brown and Abeywickrama chapter provides a useful example of various types of listening items. I'd like to direct your attention to the micro- and macro-skills list on page 163.

In our next class we will be looking at some statistics related to testing. Please look over the following links to become slightly familiar with these terms, but you do not need to study them intensively.

Class Material for next week:

✔ Point-biserial correlation coefficients
✔ JALT Shiken topic index
✔ Item facility and item discrimination
✔ Distractor efficiency analysis on a spreadsheet
✔ Boone & Noltemeyer (2017) A Rasch primer
✔ Knoch & McNamara (2015) Rasch analysis
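
For those who like to see the arithmetic behind these terms, here is a minimal sketch in Python of item facility, item discrimination, and the point-biserial correlation. It is only an illustration and is not taken from any of the readings: the response data are invented, the function names are my own, the discrimination index assumes upper and lower groups of one third each, and the point-biserial is the uncorrected version (the item is included in the total score).

```python
from statistics import mean, pstdev

# Hypothetical scored responses: rows are test takers, columns are items
# (1 = correct, 0 = incorrect). These numbers are invented for illustration.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]

def item_facility(responses, item):
    """Proportion of test takers who answered the item correctly."""
    return mean(row[item] for row in responses)

def item_discrimination(responses, item, fraction=1/3):
    """Facility in the top-scoring group minus facility in the bottom group.

    The group size (one third of the test takers by default) is an assumption.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    return item_facility(upper, item) - item_facility(lower, item)

def point_biserial(responses, item):
    """Correlation between a 0/1 item score and the total test score.

    Uncorrected: the item itself is counted in the total score.
    """
    totals = [sum(row) for row in responses]
    scores = [row[item] for row in responses]
    p = mean(scores)
    sd = pstdev(totals)
    if p in (0, 1) or sd == 0:
        return float("nan")  # undefined when the item or the totals do not vary
    mean_correct = mean(t for t, s in zip(totals, scores) if s == 1)
    mean_incorrect = mean(t for t, s in zip(totals, scores) if s == 0)
    return (mean_correct - mean_incorrect) / sd * (p * (1 - p)) ** 0.5

for item in range(len(responses[0])):
    print(item, item_facility(responses, item), point_biserial(responses, item))
```

With real data you would normally let a spreadsheet or jMetrik do this work; the sketch is only meant to show that each statistic is a short calculation on a table of 0s and 1s.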

Session #8 (June 14) — Item Response Theory; Winsteps Introduction

Good morning, everyone. Today our class will be devoted to classical test theory (CTT) and, to a small extent, item response theory. The links provided last week included some very useful information from the JALT Shiken SIG.

Homework:

✔ Boone & Noltemeyer (2017) A Rasch primer
✔ A Conceptual Intro to Item Response Theory

Session #9 (June 21) — Writing Items & Tasks; Winsteps, Part 2

Today we will devote our time to some practice with Winsteps, which is software (paid, unfortunately) for conducting Rasch analysis. An alternative is the open-source (and thus free) software called jMetrik. As you'll see, jMetrik accommodates item analysis, differential item functioning, item response analysis, and Rasch analysis (just check the Analyze tab).
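
To make the idea behind Rasch analysis a little more concrete, here is a minimal sketch in Python of the dichotomous Rasch model itself: the probability of a correct answer depends only on the difference between person ability and item difficulty, both expressed in logits. This shows the model formula only; it is not how Winsteps or jMetrik estimate abilities and difficulties from data, and the function name and example values are my own.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical item with a difficulty of 0.0 logits, for illustration only.
item_difficulty = 0.0
for person_ability in (-2.0, -1.0, 0.0, 1.0, 2.0):
    p = rasch_probability(person_ability, item_difficulty)
    print(f"ability {person_ability:+.1f} logits -> P(correct) = {p:.2f}")
```

Note that when ability equals difficulty the probability is exactly 0.5, which is why item difficulties and person abilities can be placed on the same scale.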

This class will be devoted to ...

Homework:

✔ Fulcher & Davidson, Unit A6 Prototyping and Field Tests (pp. 76-89)
✔ ...

Session #10 (June 28) — Prototyping and Field Tests; Facets Introduction

We will also delve some into ...

Class Material:

✔ Fulcher & Davidson, Unit A7 Scoring language tests and assessment (pp. 91-114)
✔ Schaefer (2012)
✔ FACETS output file (5-factor model)
✔ FACETS introduction

Session #11 (July 5) — Scoring Language Tests & Assessments; Facets, Part 2

Good day, everyone. Today we will wander through Unit A7, which deals with scoring language tests. We will also look at some of the basics of FACETS, which is just pumped-up Rasch analysis.

Class Material:

✔ Journal review due
✔ Shohamy (2001) Democratic assessment as an alternative
✔ Fulcher & Davidson, Unit A9 Fairness, ethics, and standards (pp. 138-158)
✔ How ETS Approaches Testing: Quality and Fairness

Session #12 (July 12) — Fairness, Ethics, and Standards

In today's class we will be considering the very nature of testing and where it falls in terms of its role in society, for institutions, and for individuals. This discussion is at points very philosophical, yet it is of the utmost significance because of the uses to which testing can be put.

Class Material:

✔ Fulcher & Davidson, Unit A10 Arguments and evidence (pp. 138-158)
✔ Presentation basics

Session #13 (July 19) — Arguments and Evidence in Test Validation and Use

Good morning, everyone. Today we will walk through ...

Class Material:

✔ Fulcher & Davidson, Unit A10 Arguments and evidence (pp. 138-158)
✔ Presentation basics
✔ Luke Harding on "Language Assessment: Current Trends, Future challenges"

Session #14 (July 26) — Project Presentation(s); Current Topics in Testing

Our class today will begin with presentations about student projects. Following those, we will have a look at recent topics in the field of language assessment.

Class Material:

✔ Annotated bibliography due
✔ ...
✔ ...

Session #15 (August 2) — Course summary

Homework:

✔ Course paper due
✔ ...

Course Requirements

In principle, three absences will be allowed, beyond which the student's overall grade will be reduced.

Regular and active participation in class is expected. (10%)

Required writing will include the following (70%):

✔ Research paper presentation (10%)
✔ Journal review (15%)
✔ Annotated bibliography (25%)
✔ Course paper on instrument development and results (50%)