Online assessment in higher education: a mapping review and narrative synthesis

Online assessment takes many forms. While there have been reviews on a particular online assessment approach (e


Introduction
Technologies are pervasive in assessment practices in higher education. They have been used in various ways to support exam delivery, manage assignment submission, improve assessment feedback, and, not least, store and communicate assessment results (Bearman et al., 2023). In the contemporary world where work and lives are increasingly mediated by technologies, they are also essential to assessment tasks that seek to prepare university graduates for future careers and lives (Geertshuis et al., 2022). The recent shift to remote delivery of teaching further rendered technologies high-profile, as university educators had to leverage the affordances of technologies to enable assessment in online environments (Lee & Fanguy, 2022).
Within the research literature, there has been ongoing effort dedicated to investigating the intersection between technology and assessment. This research was initially termed computer-assisted assessment (Conole & Warburton, 2005) and more recently e-assessment or online assessment (St-Onge et al., 2022). The change in the use of terms reflects the progressive development and application of online technologies in assessment.
Early explorations relied on the use of tailor-made computer programs to assess learning within specific disciplines or institutions (Fiddes et al., 2002). Recent developments involved the application of a breadth of online tools, platforms and resources to assessment practices, often across disciplines and institutions (Lee & Fanguy, 2022).
Findings generally confirmed the potential of online technologies in enhancing or transforming assessment practices but at the same time identified challenges, including but not limited to the need to upskill staff in assessment design (Snodgrass et al., 2014), ensure technology accessibility and scalability (Avila et al., 2016), prepare students for online assessment (Brink & Lautenbach, 2011), authenticate learners (Levy et al., 2011) and prevent academic misconduct (Buckley & Cowap, 2013).
While empirical research abounds internationally, there have been limited knowledge syntheses on online assessment in higher education. Among them, Gikandi et al. (2011), through a review of 18 studies, concluded that online formative assessment provided additional learning opportunities and improved learner engagement, but highlighted the need for designing authentic and engaging assessment tasks. Boitshwarelo et al. (2017), through a review of 50 studies, suggested that online tests can be appropriate for assessing twenty-first century learning when they are used formatively and in combination with other types of assessment. A recent review of 36 studies further suggested that while online examinations could support student learning and performance and reduce staff grading workload, they need to address issues relating to accessibility, validity, reliability and potential cheating during implementation (Butler-Henderson & Crawford, 2020).
The above reviews allowed us to gauge the efficacies of and challenges associated with a particular online assessment approach. They are, however, focused on one assessment function (e.g., formative assessment) or task (e.g., online tests or examinations), which is restricted in scope and may overlook the differences and commonalities among different online assessment approaches. A comprehensive review would capture this diverse body of research and establish a more holistic understanding of current research into online assessment. Such a review would also help educators understand the current landscape of online assessment research and become informed and versatile as they seek to redesign and transform assessment through online technologies.
To this end, we conducted a comprehensive review of online assessment in higher education. The review aimed to establish the current understanding of online assessment. For this review, we define online technologies as digital devices, tools and platforms that connect users to the internet. We regard assessment broadly as the process of making a judgement about student work or performance, which can be derived from the student, peers, teachers or machine (Bearman et al., 2023). We accept the notion of assessment of learning, which emphasizes the summative purpose of assessment as a tool for measurement and accreditation, and the notion of assessment for learning, which emphasizes the formative role of assessment in promoting learning (Wiliam, 2011).
However, we also align with Boud et al. (2018), acknowledging that a clear boundary between formative and summative assessment functions may be hard to observe in contemporary assessment practices. Thus, our orientation towards online assessment is inherently broad: online assessment refers to assessment that takes place in the online environment or uses online technologies during the assessment process. Importantly, we do not restrict online assessment to online or distance learning modalities. While fully online or distance courses necessitate the use of online assessment, other modalities such as web-enhanced and blended learning have also adopted online assessment practices.
Our review therefore considers online assessment across different learning modalities.

Method

Article search and screening
The review procedure (Figure 1) followed the PRISMA guideline (Page et al., 2021). Two reviewers (QL and AH) conducted title and abstract screening and full-text screening independently. Articles were included if they were (1) empirical studies, (2) in the higher education context, and (3) focused on online assessment. Articles were excluded if they were (1) reviews, theoretical or opinion pieces, (2) outside of higher education, (3) not on assessment (e.g., online courses or learning tools), or (4) not on online technologies (e.g., assessment in general). Disagreements between the reviewers were discussed to reach a consensus. Given that the included articles adopted diverse research designs and our review, unlike systematic reviews, did not serve to inform interventions in practice, we conducted quality appraisal by ensuring that included articles reported research aims, questions or methodological details. This is in line with the recent quality appraisal guide that recommends using quality appraisal measures to understand the quality of research findings rather than to screen studies (Hong et al., 2018). In total, 382 articles were included after title and abstract screening, and 235 were included after full-text screening and quality appraisal.
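The inclusion and exclusion criteria above can be read as a simple screening rule. The following is a minimal sketch only: the `Article` record, its field names and the helper functions are hypothetical illustrations, and the actual screening was carried out by two human reviewers rather than by code.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical screening record; field names are illustrative assumptions."""
    title: str
    is_empirical: bool            # False for reviews, theoretical or opinion pieces
    in_higher_education: bool     # False for studies outside of higher education
    on_assessment: bool           # False for e.g. online courses or learning tools
    uses_online_technology: bool  # False for e.g. assessment in general


def exclusion_reasons(article: Article) -> list[str]:
    """Return every exclusion criterion the article meets (empty if none)."""
    reasons = []
    if not article.is_empirical:
        reasons.append("review, theoretical or opinion piece")
    if not article.in_higher_education:
        reasons.append("outside of higher education")
    if not article.on_assessment:
        reasons.append("not on assessment")
    if not article.uses_online_technology:
        reasons.append("not on online technologies")
    return reasons


def is_included(article: Article) -> bool:
    """An article is retained only if no exclusion criterion applies."""
    return not exclusion_reasons(article)


# Example with made-up records, not articles from the review sample.
candidates = [
    Article("Online quizzes in a first-year course", True, True, True, True),
    Article("A review of e-assessment", False, True, True, True),
]
retained = [a for a in candidates if is_included(a)]
```

Recording the reasons, rather than a bare yes/no decision, would make it easier for two reviewers to compare decisions and resolve disagreements.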

Analysis and synthesis
We combined mapping review and narrative synthesis methods to answer the stated RQs.
Mapping review focuses on categorizing studies on a given topic to enable contextualized knowledge synthesis (Grant & Booth, 2009). We used this method to first inductively identify major online assessment approaches reported in the literature. For example, studies on the use of formative quizzes with multiple-choice questions were coded as 'formative quizzes' and grouped together with studies using quizzes with drop-down menu questions and short answer questions. Studies coded as 'formative quizzes', 'diagnostic tests', and 'examinations' were further grouped together as 'tests', which was identified as an assessment approach.
We then drew on the SAMR (Substitution, Augmentation, Modification and Redefinition) model to further map out variations in technology use across assessment approaches. The SAMR model provides a useful understanding of technology use in education, and has been widely applied to examine the role of technology in pedagogical activities (Crompton & Burke, 2020). The model specifies four levels of technology use, from the lowest level of substitution to augmentation, modification and eventually redefinition (Hamilton et al., 2016). Substitution refers to using technology as a direct tool to substitute existing practices without functional improvement. Augmentation refers to using technology to substitute existing practices with functional improvement. Modification refers to the use of technology for significant task redesign, and redefinition refers to the use of technology to create new tasks or practices previously inconceivable.
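To make the ordering of the four levels concrete, the definitions above can be sketched as an ordered enumeration with a simple decision rule. This is an illustrative assumption: the three yes/no questions and the `classify` function are hypothetical, and the coding in this review relied on reviewer judgement rather than an automated rule.

```python
from enum import IntEnum


class SAMR(IntEnum):
    """The four SAMR levels, ordered from lowest to highest."""
    SUBSTITUTION = 1
    AUGMENTATION = 2
    MODIFICATION = 3
    REDEFINITION = 4


def classify(functional_improvement: bool,
             significant_redesign: bool,
             creates_new_task: bool) -> SAMR:
    """Assign the highest level whose defining feature is present.

    The three flags are hypothetical coding questions derived from the
    level definitions; they are not the authors' coding instrument.
    """
    if creates_new_task:
        return SAMR.REDEFINITION
    if significant_redesign:
        return SAMR.MODIFICATION
    if functional_improvement:
        return SAMR.AUGMENTATION
    return SAMR.SUBSTITUTION


# A paper-based test delivered online unchanged, with no added function.
level = classify(functional_improvement=False,
                 significant_redesign=False,
                 creates_new_task=False)
```

Using an ordered enumeration allows comparisons such as `level <= SAMR.AUGMENTATION`, which matches how the review later contrasts lower and higher levels of technology use.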
In this review, we adapted the definitions of the four levels of technology use so that they are relevant to online assessment (see Table 1) and used the model to map out the role of technologies in online assessment.

Table 1 (excerpt)
Substitution: Digital technology reproduces the existing assessment practice that can be achieved without the digital technology.

Once we categorized online assessment approaches and the variations in technology use, we used the narrative synthesis method (Popay et al., 2006) to synthesize research findings within an assessment approach and at a particular SAMR level. This was an iterative process involving textual analysis for synthesizing and comparing research findings within and across different online assessment approaches and ways of technology use.

Result

Mapping of current research
There has been a gradual increase in research (Figure 2). The first half of the review period (2011-2015) contributed an average of 17 articles per year, and the second half contributed an average of 24 per year.
Three major online assessment approaches were identified in this literature: (1) tests, (2) assignments, and (3) skills assessments. Despite the overall increase in research, tests remained the major online assessment approach (65%), followed by assignments (20%) and skills assessments (15%) throughout the review period (Figure 3).
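As a quick check, the reported shares can be reproduced from the article counts. The sketch below uses the total of 235 included articles and the counts given for tests (n=152) and skills assessments (n=36) elsewhere in this section; the assignment count is not stated in the text available here and is derived by subtraction, so it should be treated as an assumption.

```python
# Counts reported in the review.
total = 235
tests = 152
skills_assessments = 36

# Assumption: the remaining articles are all assignment studies.
assignments = total - tests - skills_assessments

counts = {
    "tests": tests,
    "assignments": assignments,
    "skills assessments": skills_assessments,
}

# Share of each approach, rounded to the nearest whole percent.
shares = {name: round(100 * n / total) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {counts[name]} articles, {share}%")
```

Under this assumption, the rounded shares agree with the 65%, 20% and 15% reported above.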

Online tests
Tests (n=152) were the most common online assessment approach, which focused on assessing learners' lower level of cognition, such as recall or comprehension (Krathwohl, 2002), through the use of online technologies, often in the form of 'quizzes', 'tests' or 'examinations'. This approach typically involved students responding to structured, objective or semi-objective questions. Multiple-choice questions were widely adopted within this approach (Wilson et al., 2011), although there was reported use of more sophisticated question types such as long drop-down menu questions (Huwendiek et al., 2017) or open-ended questions (Zlatovic et al., 2015).
There were considerable variations in the purpose the assessment served and the way it was implemented. Quizzes were implemented regularly during learning to help students self-assess their understanding, serving formative assessment purposes (Thomas et al., 2017). Tests were used for both summative and formative purposes during the course of learning (e.g., pre-learning tests, or mid-term tests): they served to diagnose learners' existing knowledge prior to learning (Carr et al., 2017), were implemented during learning to provide feedback (Balta et al., 2018), or contributed to summative assessment (Bausili, 2018). Examinations, on the other hand, often served summative purposes that directly contributed to learners' final grades and performance evaluation (Daffin & Jones, 2018).
Task outcomes included written pieces (i.e., essays and reports) or digital artefacts (i.e., websites or videos), and were used for self-assessment (Hwang et al., 2015), peer assessment (Lin, 2019) or teacher-led assessment (Gray et al., 2012). Unlike tests, assignments usually took an extended period of time and were fully embedded in the process of student learning: they had a clear developmental emphasis, encouraging students to develop higher level cognitive skills and capabilities (e.g., application, analysis, evaluation and creation) through ongoing engagement with the assessment task, learning resources and, in many cases, their peers.
Although various online technologies were used to facilitate assignment tasks, students were likely to switch between online and offline. For instance, some studies (Fernando, 2018; Noroozi et al., 2016) reported having students write essays, which only involved using a few databases and a word-processing package, but subsequently introduced additional online platforms for peer assessment.

Online skills assessments
A further approach focused directly on assessing student performance in practice, which involved the application of multiple skills at different levels (n=36). Given the diversity of skills being assessed, the assessment practices varied considerably, as did the online technologies used to support the assessment. In the most straightforward form, online technologies were used to develop electronic rubrics to help assess, for instance, second language skills or professional attitudes (Haack et al., 2017). More complex technology use involved assessing student performance with a higher level of authenticity through simulation (Craft & Ainscough, 2015), using video assessments to capture student performance (Hay et al., 2013), and enabling portfolio-based assessments to integrate aspects of student development, performance and reflection (Tinoca & Oliveira, 2013).

Variations in the role of online technologies
Regarding the differences in the way online technologies were used, around 34% of studies were assigned to the 'substitution' level, 34% to the 'augmentation' level, 17% to the 'modification' level and 16% to the 'redefinition' level. Throughout the review period, studies assigned to the substitution and augmentation levels dominated the sample (Figure 4).
Our mapping of the studies against the SAMR model further showed variations in technology use across assessment approaches (Figure 5). For tests, 47% of studies used online technologies to substitute existing assessments, 39% used online technologies to augment existing assessments, and 11% used online technologies for modification purposes. The mapping analysis showed that while most studies focused on using online technologies to substitute or augment existing assessments, the functions that online technologies served depended on the assessment approach. While tests predominantly integrated online technologies at lower levels (i.e., substitution or augmentation), assignments and skills assessments focused more on online technologies' modification and redefinition potential.
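The distribution for tests can be recomputed from the per-level study counts given in the synthesis of online tests (n=71, 60, 16 and 5 for the four levels). The sketch below is only an arithmetic check; the variable names are illustrative.

```python
# Per-level counts for online tests, as reported in the synthesis section.
test_counts = {
    "substitution": 71,
    "augmentation": 60,
    "modification": 16,
    "redefinition": 5,
}

n_tests = sum(test_counts.values())  # should equal the 152 test studies

# Percentage of test studies at each SAMR level, to the nearest whole percent.
percentages = {level: round(100 * n / n_tests) for level, n in test_counts.items()}

# Combined share of the two lower levels (substitution and augmentation).
lower_level_share = round(
    100 * (test_counts["substitution"] + test_counts["augmentation"]) / n_tests
)

print(percentages)
print(f"substitution or augmentation: {lower_level_share}%")
```

The counts sum to the 152 test studies, and the combined lower-level share is consistent with the later statement that more than 85% of studies on online tests used technologies for substitution or augmentation.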

Synthesis of current research
This section describes the results of a narrative synthesis of current research into each of the three online assessment approaches. Within each approach, studies were organized according to their assigned SAMR level to expose similarities and differences in technology use.

Current research into online tests
Tests at the substitution level. Most studies on using online technologies for tests were categorized at the substitution level (n=71). Technologies were used to deliver existing paper-based tests online. Students were mostly positive towards online tests because of the efficiency in assessment delivery (Deutsch et al., 2012). However, there were concerns regarding data security and calls for measures in data management and user authentication (Brink & Lautenbach, 2011). Studies also captured issues relating to fairness in online tests (Liu et al., 2015), suggesting that online tests may disadvantage learners who had low self-efficacy in using technologies, especially if the learners had never sat online tests before (Daniels & Gierl, 2017).
Although none of the studies focused on measuring the effectiveness of institutional interventions that better prepare learners for online tests, there was a consensus that effective delivery of online tests that substitute existing paper-based tests required training for teachers and for students (Brink & Lautenbach, 2011). Planning, piloting, and providing ongoing technical support were recommended as important means for successful substitution (Khalaf et al., 2020). Researchers also called for assessment policies to be sensitive to online modalities (Liu et al., 2015).
Many studies sought to compare the effectiveness of online and paper-based tests in assessing student performance and explored a related issue, cheating. The findings, however, were less consistent. Rivera-Mata (2021) reported that cheating during a test was related to the type of class and students rather than the face-to-face or online modalities. Others, however, reported differences in test scores between online and paper-based tests, especially when the online test was not proctored (Alessio et al., 2018), suggesting that cheating was a concern for online tests. Researchers have also identified interventions that reduced cheating during online tests, which included implementing browser lockdown and time controls during tests, designing complex questions with multiple variants, and introducing remote proctoring or artificial intelligence-based proctoring (Paredes et al., 2021).
Tests at the augmentation level. Studies at the augmentation level (n=60) employed online tests to gain a better understanding of learners' current knowledge and provide additional learning opportunities. There was an emphasis on using online technologies to enable timely and often automated feedback (Herbert et al., 2019) or design tests that incorporated multimedia resources or sophisticated types of questions (Cerutti et al., 2019). These studies unanimously reported student appreciation of formative feedback from online tests (Chen & Chuang, 2012). Overall, online tests that served to augment learning allowed learners to self-assess their knowledge (Mina et al., 2011) and monitor their progress (Bayrak, 2021), which led to better academic performance later on (Swart & Meda, 2021).
Teachers were similarly appreciative of online tests primarily because they allowed frequent formative feedback to large groups of students (Balter et al., 2013). Online tests were reported to reduce marking workload through automated grading of objective (Figueroa-Canas & Sancho-Vinuesa, 2021) and open-ended questions (Nehm & Haertig, 2012).
Tests at the modification level. Studies at the modification level (n=16) used online technologies to fully embed tests in the learning process, creating responsive (Yasuda et al., 2021) or dynamic (Ebadi & Rahimi, 2019) learning environments where learning progress was based on student responses to previous test items. These tests were interactive and individually customized (Kamrood et al., 2019). They were popular among learners (Yang & Qian, 2017), associated with better academic outcomes (Pezzino, 2018), and were able to guide students' further learning (Ebadi et al., 2018).
Tests at the redefinition level. Redefining tests through online technologies comprised a small group of studies (n=5). These studies used online technologies to create highly innovative test experiences for students. There were instances of having students create test questions rather than simply sitting tests (Yu & Wu, 2016), using scrambled text to assess reading comprehension rather than mastery of vocabulary (Thompson & Braude, 2016), using knowledge mapping to promote holistic understanding (Ho et al., 2018), and using social network analysis to influence peer feedback-seeking (Lin & Lai, 2013).
Although the studies designed online tests differently, they had in common an emphasis on using technologies to enable active and interactive learning that led to better learner engagement. The studies reported students' attitudinal acceptance of the tests and improved academic outcomes. However, given that each study took a unique approach to assessment design, it was premature to synthesize more nuanced research findings across the studies.

Current research into online assignments
Assignments at the substitution level. Studies categorized as using online assignments for substitution (n=4) used online technologies to deliver existing assignments and focused on comparing the efficacy of online and traditional delivery. The assignments were all in written formats, and online technologies were used to assist with writing and composition (Cheung, 2016), marking and feedback (Grieve et al., 2016), or peer assessment (Lin, 2019). One study reported that writing assignments online improved the quality of student work (Cheung, 2016). Other studies, however, showed that students might not be attitudinally more positive towards online marking and feedback (Grieve et al., 2016), nor would they achieve better outcomes through online peer assessment of assignments (Lin, 2019). It seemed that the modality effect (efficacy of online versus traditional delivery) was associated with issues such as assessment design or perceived social presence rather than the online technology. Another study that investigated student choice of online assignments further reported that while students had opportunities to be more innovative, the majority chose traditional written assignments (Flavin, 2021).
Assignments at the augmentation level. Studies categorized as using online assignments for augmentation (n=12) also employed written assignments, or in some cases, laboratory or project work where written reports were part of the assessed project outcome. Online technologies were used to improve the marking and feedback process, creating formative learning during assignment completion.
There were differences in marks between technology-enabled marking (e.g., statement banks or automated marking) and teacher-led marking, and technology-enabled marking was mainly praised for efficiency and timely feedback (Reilly et al., 2014). Integrating online technologies also enabled students' self-assessment and peer sharing, helping students develop generic cognitive and writing skills (Hwang et al., 2015). However, the efficacy of online assignments varied by individual: students behaved differently during assignments, and their task interpretation, goal setting, perceived autonomy, interactivity and interests were similarly different, which were associated with their assignment performance (Beckman et al., 2021).
Assignments at the modification level. Studies that sought to modify assignment tasks (n=13) used online technologies to enable peer feedback for written assignments (Chew et al., 2016) or peer assessment of student contribution in group work (Alden, 2011).
These assignments involved significant redesign of the assessment procedures. Findings suggested that enabling peer assessment improved assignment quality (Mostert & Snowball, 2013) and course completion (Formanek et al., 2017), and developed writing, critical thinking and reflection skills (Zheng et al., 2018). There were, however, different views regarding peer assessment's reliability and validity, suggesting that training before peer assessment was important for feedback quality (Chew et al., 2016).
Assignments at the redefinition level. Studies that sought to redefine assignments through online technologies (n=18) took two different approaches. The first used online technologies to help students prepare for assignments rather than using them to assess written outputs (Fernando, 2018). Students engaged in ongoing collaborative writing activities in online spaces (Alvarez et al., 2012), received regular feedback from peers and used a range of multimedia resources to prepare for components of an assignment (Fernando, 2018). There were also opportunities for students to use text-matching tools or discussion forums to improve their work and avoid academic misconduct (Buckley & Cowap, 2013). This way of technology use contributed to knowledge construction, enabled learner autonomy, facilitated reflection and critical thinking, and reduced the plagiarism ratio (Alvarez et al., 2012).
The second approach focused on digital production, where students produced digital artefacts rather than compiling written reports as the assignment output. Assignments following this approach engaged students in creating, for instance, films (Cheng & Hou, 2015), videos (Fang et al., 2021), programming scripts (Wang, 2019), and online wikis (Davies et al., 2011). They engaged students in active learning (Fang et al., 2021), facilitated deep approaches to learning (Cheng & Tsai, 2012), and developed digital skills that cannot be learnt or assessed without the use of online technologies (Nielsen et al., 2020). However, researchers highlighted the importance of pedagogy-informed design and implementation in preparing students to learn from the assignment projects (Cheng & Hou, 2015; Wang, 2019).

Current research into online skills assessments
Skills assessments at the substitution level. Studies that used online technologies to substitute existing skills assessments (n=4) were typically focused on digitalizing the existing assessment rubrics (Haack et al., 2017) or the assessment delivery (Snodgrass et al., 2014) in clinical or healthcare settings. Findings suggested that the digitalization of existing skills assessment was not inferior to traditional delivery (Kaliyadan et al., 2014) and enabled immediate and customized feedback (Haack et al., 2017). However, training was necessary for the implementation of digitalized skills assessment (Snodgrass et al., 2014).
Skills assessments at the augmentation level. Skills assessments at the augmentation level (n=8) differed from the previous group in that they explicitly sought to provide more formative learning opportunities through digitalization of existing skills assessment.
These assessments provided additional opportunities for automated, self and peer assessments of a range of work-related skills, thus allowing students to learn from the assessment and demonstrate performance improvements (Ros et al., 2021). However, studies that focused on assessing technical skills in the clinical setting reported differences between automated assessment and clinician assessment, suggesting that automated assessment was best used for formative rather than summative purposes (Abdalla et al., 2020). In addition, one study noted the difference between students' self-perceived developments and actual performance developments, highlighting again the importance of feedback and guidance during the assessment activity (Lim et al., 2020).

Skills assessments at the modification level. Studies that modified skills assessments (n=11) were characterized as using online technologies to implement video assessment or to develop simulation-based assessment. Video assessment involved having students create videos to capture their performance in authentic learning or work environments at a distance and often asynchronously (Hay et al., 2013). Students were often allowed to practise multiple times, critique each other's performance and present their best performance (Lai et al., 2020), thus effectively transforming the assessment activity from focusing on measurement of achievement to a demonstration and exhibition of achievement (Geertshuis et al., 2022). Video assessment was an effective assessment solution for distance programs or distributed learning environments (Hay et al., 2013). In addition, it was reported as an appropriate means of developing and assessing skills-based performances, especially when it was accompanied by peer assessment and feedback (Hensiek et al., 2016).
Simulation-based assessment, on the other hand, used online technologies to create assessment that simulated authentic workplace scenarios or problems. The studies reported different simulation solutions with varying levels of fidelity but had in common the use of simulation or simulated cases to assess complex work-related skills (e.g., decision-making and problem-solving) that are difficult to assess in a traditional university setting (Way et al., 2021). The findings suggested that simulation-based assessment allowed students to apply what they learnt and, to some extent, allowed educators to assess work-related skills in a university setting (Craft & Ainscough, 2015).

Skills assessments at the redefinition level. Studies at the redefinition level (n=18) focused on portfolio-based assessment through which learner progression was captured and achievement evaluated. Some studies asked students to create portfolios themselves (Bleasel et al., 2016); others used workplace performance data (e.g., students' clinical performance data in a healthcare system) as the performance portfolio (Sebok-Syer et al., 2019). These studies could all be regarded as taking a programmatic approach to assessment, measuring and supporting the ongoing development of knowledge, skills and competence over an extended period of learning. Additionally, many portfolio-based assessments had clear milestones and performance or task expectations, which created multiple formative feedback opportunities (Marinho et al., 2021). Studies reported that this assessment method created learner autonomy and led to meaningful reflections (Marinho et al., 2021), which motivated learners, facilitated self-directed learning and promoted professional development (Tinoca & Oliveira, 2013). However, researchers also noted that online tools or platforms used for portfolio-based assessment could be relatively complex, which might be a barrier to adoption if technical training and support were inadequate (Avila et al., 2016).

Discussion
To establish an overall understanding of online assessment research, one that takes into consideration the different assessment approaches and the different roles online technologies serve, we conducted a knowledge synthesis. The 235 articles included ensured that our review was grounded in an evidence base that captured the heterogeneous body of research. In the paragraphs below, we discuss the review results in light of the current research patterns, gaps for further research, and implications for online assessment practice.

Tests as the dominant online assessment approach
Our analysis identified three major online assessment approaches. Among them, tests have been the dominant one, with nearly two thirds of articles focused on this assessment approach. Studies on online assignments and skills assessments have been comparatively limited. This result suggests that, despite the gradual increase in research over time (Figure 2), tests are still the most common assessment practice in contemporary higher education (Boitshwarelo et al., 2017) and researchers have been using online technologies mainly to support this assessment. Noticeably, the move towards assessment for learning in higher education was reflected in the use of online tests. The tests served not only to evaluate student outcomes at the end of a semester but also to capture students' current understanding and allow students to practise learning and receive feedback on their developments. However, if we accept there is an association between assessment formats and assessed learning outcomes (Biggs, 1996), then the reliance on online tests would suggest that online assessment in higher education remains likely to be focused on the cognitive domain (Krathwohl, 2002) and restricted to lower-order cognition (Liu et al., 2023). Online assignments and skills assessments that focused on higher-order cognition or learners' competence and performance beyond cognitive understanding contributed, by contrast, consistently around 30% of included articles with no evidence of increase over the review period. This suggests that, while more and more online assessment tools and platforms have been made available, a paradigm shift in terms of what intended outcomes are assessed has yet to occur.

Technology use across online assessment approaches
Relatedly, results from the SAMR categorisation showed that, regardless of the assessment approach, the vast majority of research reported using online technologies at levels of substitution and augmentation (Figure 5), and this pattern remained unchanged throughout the review period (Figure 4).They suggest that there has not been a greater use of more sophisticated online assessment across the higher education sector although online technologies have become more and more prevalent and accessible over the years.
This finding is consistent with a recent analysis of the role of technology in transforming disciplinary education (Grainger et al., 2024). It points to the difference between using technology to enhance existing pedagogical or assessment design and using technology to drive new, or transform existing, pedagogy or assessment. There have been individual studies that explored learning experiences and outcomes that are unique to a specific online assessment task (Calderon & Sood, 2020). However, such thinking has not become widespread in the research literature.
There were differences in technology use across the three identified approaches. More than 85% of studies on online tests used technologies to substitute or augment existing assessment. This is in contrast to research into online assignments and skills assessment, each of which had around 60% of studies using technologies for substitution or augmentation. One possible explanation for the difference in technology use might be that tests have traditionally been used for high-stakes assessment and have been widely used for assessment in large classes. In either case, innovative technology use (e.g., modification or redefinition) in an online test would be high-risk for educators and would be regulated by strict assessment policies. By contrast, using technology for replication or augmentation draws on established practices, is less effortful and is likely to comply with existing policy prescriptions and curriculum arrangements, and is hence more likely to be adopted widely.
Studies reported additional advantages of technology use at the levels of modification and redefinition. While technology improved the efficiency of online assessment delivery (e.g., less time and workload) across the four SAMR levels, studies at the modification and redefinition levels further reported technology as enabling learner engagement with each other and with the assessment task itself. This seems to suggest that the advantage of technology lies in the way it is used pedagogically rather than being inherent to the technology itself.

Areas for future research
Our review and analysis above further identified areas of disparity in current research, which we discuss here with the aim of informing future research. One apparent disparity across the three online assessment approaches relates to academic misconduct. Academic misconduct refers to a range of unethical behavior (e.g., cheating or plagiarism) that some students engage in while completing academic work (Christensen Hughes & McCabe, 2006). Many reviewed studies explored academic misconduct by comparing student performance between online and offline modalities. This reflects an assumption that academic misconduct is mainly related to students and triggered by specific technological affordances. Institutional interventions therefore focused on regulating student behavior through policies and technical procedures (Paredes et al., 2021). None of the studies referred to the appropriate use by educators and institutions of student work produced during online assessment, despite increased instances of misconduct in this regard (Binder et al., 2016). In addition, student academic misconduct was explored in online tests and occasionally in online assignments (Buckley & Cowap, 2013), but not in online skills assessments. It is unlikely that certain assessment approaches are completely immune to academic misconduct, given that academic misconduct has become prevalent (Christensen Hughes & McCabe, 2006) and takes many forms (e.g., impersonation). More research is perhaps needed to investigate academic misconduct beyond cheating during online tests.
Assessment validity refers to the extent to which an assessment measures what it is designed to measure, and reliability refers to the extent to which a measurement is consistent and accurate (Darr, 2023). Across the three approaches, validity and reliability issues were investigated primarily at the substitution level. This suggests that researchers have spent considerable effort examining how replicating an existing assessment in the online environment influences assessment validity and reliability. Significantly less research has been conducted into understanding whether innovative online assessment practices at the modification and redefinition levels are valid and reliable. Admittedly, innovative assessment practices may inherently align more with the notion of assessment for learning (Wiliam, 2011), therefore emphasizing the developmental function over the measurement function of assessment. However, establishing the validity and reliability of online assessment at the modification and redefinition levels serves to ensure the worth of these innovative practices, which in turn facilitates further adoption and dissemination. More evidence on the efficacy of online assessment at higher levels of SAMR is therefore needed.
Methodologically, the studies reviewed often included measures of attitudinal responses to an online assessment. The results have consistently shown that, regardless of the online assessment and its SAMR level, students and teachers generally find it satisfactory and acceptable, especially when training and ongoing support are provided. Therefore, research that describes the overall acceptance of and satisfaction with online assessment may have reached maturity. Future research should include measures other than attitudinal responses, or seek to establish whether certain ways of designing and delivering a particular online assessment are likely to lead to more satisfactory experiences than the alternatives.
In terms of the impact of online assessment, ample research has included self-reported measures of the development of knowledge and of generic and discipline-specific skills. In many cases, these measures were obtained right after the assessment activity and thus could be regarded as immediate impact measures. With a few exceptions (e.g., portfolio-based assessments or formative quizzes in relation to final exam scores), the impact of an online assessment on students' long-term academic performance has been infrequently reported. Future research could strengthen the research base on the impact of online assessment by including objective measures collected at the end of learning or after the learning.
Finally, although many studies at the substitution level compared learner performance in online assessment with that in traditional assessment, there was a dearth of studies comparing different online assessment practices with one another. Current findings on satisfaction and immediate self-reported learning gains have portrayed online assessment as encompassing multiple promising practices, each of which has been effective. Comparative studies would be valuable in exposing essential assessment design elements that could lead to better student learning and performance across different educational contexts.

Limitations
The SAMR model has been criticized for a lack of clarity between levels of technology use. We sought to mitigate this by adapting definitions of the four SAMR levels to the context of online assessment ahead of analysis. We further reviewed and revised the mapping result to ensure our mapping was based on collective agreement.
The database search underpinning this review was conducted in mid-2021, which means that articles published since the second half of 2021 were not captured. We are nonetheless confident that the review provides an overall understanding of different assessment approaches and different ways of using technology in online assessment.
However, research in online assessment is heterogeneous and fast-growing. The application of artificial intelligence (e.g., ChatGPT) in higher education, for instance, has already attracted much attention. This development was not captured in our review but is becoming an important research topic and is influencing higher education assessment practices.
We were unable to synthesize findings according to student demographics (e.g., first-year undergraduates or non-traditional learners). In reality, students differ in their digital skills and are likely to experience online assessment differently. Additionally, various technologies have been used for online assessment. While technology accessibility was not identified as a common challenge, there were occasional references to technology being a barrier to assessment. These issues fall outside the scope of our review but are worthy of further analysis by future reviews.

Implications for universities and educators
Our review suggests that successful implementation of online assessment requires supportive institutional policies and procedures, effective training, and time and opportunities for educators and students to understand the assessment and familiarize themselves with the technological artefacts. Universities that seek to adopt online assessment should therefore carefully plan the move to online assessment as an institution-wide change initiative rather than as the implementation of a digital tool added on to existing assessment processes. This certainly has significant implications for resourcing, staff development, student onboarding, and curriculum change.
For educators, our review identifies avenues for transforming assessment practices through online technologies. Looking across the three approaches, we note that substitution was characterised as using technologies to digitalize assessment; augmentation was characterised as using technologies for additional feedback; modification was characterised as using technologies to mediate the process by which students prepare for, participate in and receive feedback from the assessment in a more interactive and effective manner; and finally, redefinition was characterised as using technologies for assessment that was often student-led and collaborative, aiming to develop holistic understanding and digital skills and to document the development of performance and achievement over time. Thus, the progressive transformation of assessment led by online technologies could be interpreted as the process by which technologies move from being tools for assessment delivery to becoming fully embedded in assessment design, mediating assessment tasks, feedback processes and the measurement of achievement. At the same time, students are afforded more opportunities to engage with the whole assessment process, moving from being assessed and receiving feedback to becoming involved in responding to feedback, assessing themselves and others, designing assessment activities, and documenting and exhibiting individual or collaborative performance during learning. This provides many opportunities for educators to change their assessment practices through online technologies.

Conclusions
This review captured current understanding of online assessment, identifying major online assessment approaches and the different functions online technologies served. While tests have been the dominant approach, in which technologies substituted for existing assessment, online assignments and skills assessments involved more innovative assessment practices. There were disparities in current understanding relating to academic misconduct and to assessment validity and reliability across the three approaches, and future research should adopt a comparative design and include impact measures that go beyond user satisfaction, self-report and the short term. Universities and educators should implement online assessment with careful planning and design more sophisticated ways of using technology that enhance and transform assessment practices.
and expose knowledge gaps. Three research questions (RQs) guided the review: what online assessment approaches have been reported in the research literature (RQ1); what is the role of technologies in online assessment (RQ2); and what is the current understanding concerning different assessment approaches and the different roles technologies serve (RQ3).
During the database search phase, we formulated the following search string through consultations with an experienced librarian: ((internet* OR online* OR web* OR electronic* OR computer* OR digital* OR eportfolio) AND (assess* OR evaluat* OR exam* OR test* OR assignment*)). We applied the search string to five databases in June 2021: Web of Science, ERIC, ProQuest Education, Academic Search Complete and Education Research Complete. We further limited the search to full-text, peer-reviewed articles published since 2011, written in English and in the higher education subject area. This identified 1764 records, of which 1353 remained after duplicates were removed.
The reviewers then extracted data to a spreadsheet. Information extracted included first author, year of publication, journal, research aim or question, methods, findings, and conclusion. Overall, the reviewed articles (N=235) covered 101 peer-reviewed journals, with Computers & Education (n=20), Assessment & Evaluation in Higher Education (n=11) and BMC Medical Education (n=10) contributing the most. The articles captured research from 39 unique countries and regions, with the USA (n=47), Australia (n=28), the UK (n=24), Taiwan (n=20), Mainland China (n=12), Spain (n=11) and Germany (n=10) dominating the sample. Various research designs have been employed, including,

Fig. 1 PRISMA flow diagram for database search, records screening, and inclusion

Fig. 2 Fig. 3
Fig. 2 Number of reviewed articles by year and assessment approaches

Fig. 4 Distribution of articles across the four SAMR levels between 2011 and 2021

Fig. 5 Distribution of articles across the four SAMR levels between tests, assignments and skills assessment

Table 1 Adapted definition of substitution, augmentation, modification and redefinition in relation