In our August blog post, we took a closer look at the role of test content specifications in developing the next generation of the bar exam. This month, we’re ready to take the next step and tackle the ins and outs of the test design, which specifies the materials and procedures necessary to develop, administer, and score a high-stakes assessment. Test designs include the test content specifications, but that’s just part of the picture; they also specify other logistical details of a testing program, such as test administration frequency, number of pretest items, and target reliability coefficient.
It is helpful to think of a test design as comprising five interrelated components: test administration, item formats, test content scope and weighting, organization and timing of the exam, and scoring and psychometric methods. For new or substantially retooled exam programs like the next generation of the bar exam, the test design is necessarily a work in progress during the early stages of development, because early in the process there is insufficient information to make certain design decisions. Accordingly, NCBE has developed a preliminary or target test design for the new exam to focus our research and development efforts. The target test design is a dynamic document that will be updated periodically as our work on the new exam progresses. As an example, we anticipate that the new bar exam will include short-answer items; however, additional research is needed to better understand how this item format performs (e.g., examinee response time, scoring procedures, and item difficulty parameters). Only then can the test design specify how many short-answer items to include on a test form.
This blog post provides an overview of each component of the target test design for the next generation of the bar exam as it currently stands, with the understanding that many details of the new exam’s design are still in progress and are therefore listed as “TBD” (to be determined).
Test Administration
The test administration component of the test design encompasses issues such as delivery method, exam length, administration frequency, and exam structure. The current details of this component of our test design are provided below.
| Design Element | Details |
| --- | --- |
| Delivery Method | Proctored, secure, computer-based delivery administered at computer testing centers or on candidate laptops at jurisdiction-managed sites |
| Exam Length | Twelve hours of testing; testing time may be reduced if reliability and validity targets can be met |
| Administration Dates | Last Tuesday/Wednesday of July and February; some flexibility in frequency and dates may be possible if the exam is administered at computer testing centers |
| Exam Structure | Block structure with specifics TBD (e.g., twelve 60-minute sessions, eight 90-minute sessions, or six 120-minute sessions) |
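The note that testing time may be reduced if reliability targets can still be met reflects a standard psychometric trade-off: shortening a test generally lowers its reliability, and the Spearman-Brown prophecy formula predicts by how much. The sketch below uses hypothetical numbers for illustration only; the reliability values are not NCBE figures.

```python
# Spearman-Brown prophecy formula: predicts the reliability of a test
# whose length is multiplied by a factor k, given its current reliability.
# Illustrative sketch only -- the numbers below are hypothetical.

def spearman_brown(reliability, length_ratio):
    """Predicted reliability when test length is scaled by length_ratio.

    length_ratio > 1 lengthens the test; length_ratio < 1 shortens it.
    """
    k = length_ratio
    return k * reliability / (1 + (k - 1) * reliability)

# Hypothetical example: a 12-hour exam with reliability 0.92,
# shortened to 9 hours (length_ratio = 9/12 = 0.75).
predicted = spearman_brown(0.92, 0.75)
print(round(predicted, 3))  # falls just below a 0.90 target
```

Under these made-up numbers, the shortened exam would dip below a 0.90 reliability target, which is exactly the kind of check that would inform a decision about reducing testing time.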
Item Formats
The next generation of the bar exam will contain selected-response and constructed-response item formats. Each item will be presented either as a stand-alone item or as part of an item set. In addition to traditional multiple-choice items, NCBE is exploring the use of other selected-response formats such as rank/prioritize and multiple-correct. The constructed-response items will include short-answer, medium-length, and extended-length formats. The number of items and allocated response time for each item format remains to be determined.
Content Scope and Emphasis
Test material will be drawn from the seven Foundational Skills (FS) domains and eight Foundational Concepts & Principles (FC&P) subjects identified by the Testing Task Force’s study. While the FS domains and FC&P subjects are distinct, they can be interdependent for assessment purposes; that is, skills and subjects can be crossed such that a test item could be classified into both an FS domain and an FC&P subject. The Testing Task Force’s Phase 3 report noted preliminary weights for the FC&P subjects and FS domains, but those weights will be reviewed and finalized as part of developing the test content specifications.
Organization and Timing
As noted above, we tentatively plan to structure the exam in blocks, or testing sessions of set duration. Possible frameworks include organizing blocks by item format or by practice-related themes. For example, in a theme-based framework, test items in the first block could address interviewing a new client and issue spotting, items in the second block could center on advising a client, the third block might require examinees to draft a transactional document, and so on. It also remains to be determined how scored items, pretest items, and equating items will be distributed across testing blocks.
Scoring and Psychometrics
Best practices in assessment require that test scores be sufficiently reliable to allow for accurate pass-fail decisions, and that test form content and score scales for an exam remain comparable over time. Some of the key psychometric targets are noted below.
| Element | Details |
| --- | --- |
| Scoring: Selected-Response Items | Centralized scoring of selected-response items by NCBE |
| Grading: Constructed-Response Items | TBD, but ways to increase standardization of grading are being considered; NCBE also plans to investigate some level of computer-assisted grading for short-answer items |
| Test Fairness | Item development and pretesting with diverse groups; items independently reviewed for cultural sensitivity and potential bias |
| Pass-Fail Decisions | Compensatory model with pass-fail decisions based on total score; NCBE will conduct a standard-setting study to recommend a range of passing scores for jurisdictions to consider |
| Reliability | Target of 0.90 for total test score |
| Equating | TBD, but a common-item non-equivalent groups design is likely |
| Pretesting | New items embedded in live test forms for selected-response and short-answer items; longer constructed-response items pretested outside the live test administration setting |
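To make two of the targets above concrete, the sketch below illustrates (with made-up numbers, not NCBE’s actual scoring procedures or cut scores) what a compensatory pass-fail decision looks like, along with Cronbach’s alpha, one common way to estimate the internal-consistency reliability referenced by the 0.90 target.

```python
# Illustrative sketch only: hypothetical scores and cut score, not NCBE's
# actual method. In a compensatory model, component scores are summed into
# a total, so strength in one area can offset weakness in another.

def compensatory_decision(component_scores, cut_score):
    """Return (total, 'pass'/'fail') under a compensatory model."""
    total = sum(component_scores)
    return total, ("pass" if total >= cut_score else "fail")

def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix of item scores (examinees x items)."""
    n_items = len(scores[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = sum(var([row[i] for row in scores]) for i in range(n_items))
    total_var = var([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - item_vars / total_var)

# Hypothetical examinee with three component scores and a cut score of 380:
print(compensatory_decision([120, 140, 135], 380))  # (395, 'pass')
```

The compensatory model contrasts with a conjunctive one, where an examinee would need to clear a separate cut score on each component; basing the decision on the total score is what the table above specifies.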
The target test design will remain a work in progress for at least the next two years, with both item formats and organizational themes subject to change based on research, development, committee work, and input from stakeholders. We look forward to sharing updates with you in future blog posts as this work proceeds.