education · 4 papers · avg year 2026 · quality 7/5 · weak evidence


Research gap analysis derived from 4 education papers in our local library.

The gap

Regarding competence measurement more broadly, the heterogeneity of outcome instruments across the 26 included studies constitutes a fundamental challenge for synthesis. The majority of studies assess professional competence through non-validated self-report scales, with only a minority employing psychometrically validated instruments.

Consensus across the literature

Clustered from 4 gap mentions across 4 papers via embedding cosine ≥ 0.62.
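
The clustering line above can be read as a similarity threshold on embedding vectors. The sketch below is one plausible reading, not the confirmed pipeline behind this page: it assumes single-link grouping, where any two gap mentions whose embeddings have cosine similarity of at least 0.62 end up in the same cluster. The function names and the toy three-dimensional vectors are hypothetical; the page does not state the embedding model or the clustering algorithm.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_by_threshold(vectors, threshold=0.62):
    """Group vector indices by single-link clustering (assumed method).

    Two items share a cluster if a chain of pairs with
    cosine similarity >= threshold connects them.
    """
    parent = list(range(len(vectors)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(vectors)), 2):
        if cosine(vectors[i], vectors[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy embeddings (hypothetical): three similar gap mentions and one outlier.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.3, 0.0],
    [0.8, 0.5, 0.1],
    [0.0, 0.1, 1.0],
]
clusters = cluster_by_threshold(toy_embeddings)
print(clusters)
```

With a threshold this permissive, single-link grouping can chain loosely related mentions together, which is worth keeping in mind when a cluster mixes psychometric and design-related gaps.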

Research trend

Established — well-defined area with open sub-problems.

Supporting evidence — 4 representative gaps

  • Parent–School Perception Scale: An Examination of Its Psychometric Properties (2026)

    Based on the findings of this study, it is recommended that the developed scale be applied in different sociocultural regions and various types of schools, and that measurement invariance analyses be conducted. This would allow testing the consistency of the factor structure across diverse samples. In addition, the high internal consistency coefficients (above .90) observed in some sub-dimensions raise the possibility of item redundancy. Therefore, future studies may focus on item reduction analyses and the development of a short-form version of the scale. To further examine the structural relationships among the dimensions of the scale, studies based on structural equation modeling are also recommended. Moreover, investigating the relationships between scale scores and variables such as academic achievement, school engagement, student well-being, and parental involvement would strengthen evidence for criterion-related validity. Finally, developing teacher and student forms of the scale and conducting comparative studies based on multiple data sources would contribute to a more holistic evaluation of school climate and stakeholder perceptions. In addition, test–retest reliability studies are recommended to assess the temporal stability of the scale. This would provide evidence not only for internal consistency but also for stability over time. Conducting analyses based on Item Response Theory (IRT) would also make a significant contribution by allowing a more detailed examination of item functioning; in particular, analyzing item discrimination and difficulty parameters would reveal how measurement precision varies across different score levels. Furthermore, performing Differential Item Functioning (DIF) analyses across groups with different socioeconomic backgrounds would provide important evidence regarding the fairness of the scale. To examine predictive validity, the direct and indirect effects of dimensions such as guidance services and teacher perceptions on students’ academic performance or school engagement can be tested through structural equation modeling. Lastly, to enable a deeper interpretation of quantitative findings, collecting qualitative data through mixed-methods designs and conducting parent interviews would strengthen the contextual meaning of the factors and make the cultural foundations of the measured construct more visible.

    Keywords: scale item based analyses recommended different consistency across dimensions structural school evidence conducting measurement addition
  • The relationship between patterns of artificial intelligence use, academic resilience, and burnout among graduate students in special education departments at Saudi universities (2026)

    Despite the scientific and practical contributions of this study, it is subject to a number of methodological limitations that should be considered when interpreting and generalizing its findings. The cross-sectional design limits the ability to draw causal conclusions; the observed correlational relationships do not prove direct causality but may reflect mediating variables that were not measured. The reliance on self-report instruments exposes the results to socially desirable response biases, while the convenience sample limits the representativeness of the findings and their generalizability. Furthermore, the sample is strictly confined to graduate students in special education departments, which restricts the generalizability of the findings to other humanities or applied science disciplines. For the “AI application usage patterns Scale,” face validity and internal consistency were verified. Given the novelty of the scale and the sample size, it is recommended to conduct factor analyses (EFA/CFA) in future studies to enhance the evidence of its construct validity. The study did not address important contextual variables such as family support, economic status, and the nature of the relationship with the academic advisor, which may influence the studied relationships. The measurement of burnout as a general construct without detailed differentiation may hide important variations. The developed scale needs further verification across diverse contexts. The study’s focus on the Saudi context limits its generalizability to other cultural settings. These limitations call for conducting longitudinal studies to examine causal relationships across program stages, and experimental studies to test the effectiveness of specific interventions. Mixed-methods approaches should be used to provide a deeper understanding of the lived experiences, and samples should be expanded to include diverse specializations and regions. It is also recommended to examine mediating and moderating variables, conduct cross-cultural comparative studies, and develop more precise measurement tools that distinguish between different types of AI use. Future research should also examine the long-term effects on research and critical thinking skills, explore factors that facilitate or hinder the effective use of these technologies, and use advanced analytical methods such as structural equation modeling to uncover more nuanced patterns of relationships between variables. Future research should employ longitudinal designs to track within-individual changes in AI usage, resilience, and burnout trajectories across the graduate program lifecycle. Additionally, randomized controlled trials examining the efficacy of structured AI training interventions on burnout reduction would provide causal evidence that the present cross-sectional design cannot supply.

    Keywords: relationships cross limits causal variables sample generalizability scale future burnout across examine limitations sectional design
  • Creative Self-Regulation Scale: Adaptation, Validity, and Reliability Study (2026)

    Source: Bartın University Journal of Faculty of Education, 2026(3), Research Article, 1139.

    • Sample and Cross-Validation: In this study, both Exploratory Factor Analysis and Confirmatory Factor Analysis were conducted on the same sample group to determine construct validity. Because the sample size does not allow the data to be split, this situation, which was necessarily preferred, may increase the risk that the obtained factor structure is specific to the dataset. To minimize the potential impact of this situation on the generalizability of the results, cross-validation analyses with different sample groups can be conducted in future studies.
    • Criterion Validity: Although the study presents strong evidence regarding the internal structure of the scale, criterion validity analyses examining the relationship between scale scores and other tools measuring similar constructs or external criteria such as academic success have not been conducted. This limitation is inherent to the study, and correlation analyses across different scales are planned for future research to test the scale's predictive power.
    • Measurement Invariance: In future studies, we recommend conducting Item Functioning Variability analyses to understand the gender-based differences of the scale or to include qualitative interviews aimed at measuring the semantic perception of items between groups.
    • Diversity of Data Collection Tools: Research data was collected solely using a self-reported Likert scale. In future studies, supporting quantitative findings with open-ended questions, qualitative interviews, or document analyses will contribute to a deeper and more multidimensional understanding of students' creative self-regulation skills.
    • Contextual and Demographic Limitations: Although the adapted scale meets the assumption of normal distribution, in its current form, it reflects the characteristics of a specific sample group. A comparative examination of the scale's psychometric properties across students from different socio-economic levels, geographical regions, and educational stages (e.g., middle school, university) in Türkiye could enhance the tool's generalizability and test its cultural universality.
    • Domain-Specific Applications: The scale measures a general creative self-regulation structure. However, the creativity literature argues that this skill may be domain-specific. Therefore, testing the scale by tailoring it to specific disciplines, such as science education, art, or mathematics, could yield more targeted contributions to the literature.
    • Contextual and Demographic Limitations: Although the present sample was adequate for an initial scale adaptation study, it consisted only of ninth-grade students from a single high school in the Central Anatolian region and was selected through purposive sampling. Therefore, the findings should be regarded as preliminary evidence rather than as results directly generalizable to all student populations in Türkiye. The psychometric properties of the adapted scale may vary across different geographical regions, socioeconomic strata, school types (e.g., vocational, science, Anatolian, private schools), and age groups. For this reason, future studies should examine the scale with more diverse and representative samples in order to test the stability and generalizability of the factor structure across different educational and demographic contexts.

    Keywords: scale sample specific analyses different future factor structure across conducted validity generalizability groups test self
  • STEM–TVET integration for developing pre-service teachers’ professional competence: a systematic literature review (2026)

    Regarding competence measurement more broadly, the heterogeneity of outcome instruments across the 26 included studies constitutes a fundamental challenge for synthesis. The majority of studies assess professional competence through non-validated self-report scales, with only a minority employing psychometrically validated instruments (Kelley et al., 2020; Mahanal et al., 2022; Vieira et al., 2023). This measurement heterogeneity has three specific consequences for interpreting the evidence base: it prevents quantitative aggregation of effect sizes across studies, it makes it impossible to determine whether similar constructs (such as “digital competence” or “pedagogical self-efficacy”) are measuring the same underlying dimensions, and it creates a systematic risk of social desirability inflation in self-report responses, particularly in studies where participants are aware that the intervention is being evaluated. Future research in this domain should prioritize the development and cross-study adoption of validated, multidimensional instruments capable of capturing the interconnected competence outcomes identified in this review.

    4.3 Implications for theory, policy, and practice

    The conceptual model proposed in this review extends existing integrated STEM frameworks (Roehrig et al., 2021) by explicitly incorporating vocational alignment and institutional moderation as structural determinants of competence formation. By articulating enabling and constraining conditions as moderating variables, the model offers a more comprehensive explanation of why integration succeeds in some contexts and remains fragmented in others.

    From a practical perspective, teacher education programs must move from single-course interventions toward sustained, program-level frameworks that align learning outcomes, assessment practices, and professional development. […] and interdisciplinary coordination are not peripheral conditions but central determinants of integration effectiveness. Without such systemic alignment, STEM–TVET integration risks remaining episodic and dependent on individual initiative rather than becoming […] teacher education systems.

    Keywords: competence instruments validated self integration measurement heterogeneity across professional report development outcomes review determinants model
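
Several excerpts above turn on internal consistency, including the coefficients above .90 that the first paper links to possible item redundancy. The excerpts do not name the coefficient used, so the sketch below assumes Cronbach's alpha, the most common choice, and computes it from a small made-up response matrix. The function name and the data are hypothetical illustrations and are not taken from any of the four papers.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items score matrix.

    responses: list of rows, one per respondent, each with k item scores.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical Likert responses: 5 respondents, 3 items.
toy_responses = [
    [4, 5, 4],
    [3, 3, 3],
    [5, 5, 4],
    [2, 2, 3],
    [4, 4, 5],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 3))  # about 0.897 for this toy matrix
```

A value this close to .90 on only three items is the kind of result that would prompt the item-reduction and short-form work recommended in the first excerpt, although alpha alone cannot show which items are redundant.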
