Methodology gaps in Computer Science
655 open methodology research questions in Computer Science — gaps in how studies are designed, measured, or analysed — extracted from 392 papers in our local library. Below are representative open questions, each linked to the paper that raised it.
Representative open questions
Showing 30 of 655 — one per source paper, highest-quality first.
- Federated learning for privacy-preserving skin cancer classification using deep neural networks (2026) · doi
The ring-topology federated learning approach outperforms centralized training on the ISIC dataset, but alternative federated configurations, including aggregation algorithms such as FedAvg and FedProx, are not fully explored.
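For readers unfamiliar with the baselines named in this entry, FedAvg-style aggregation is a sample-size-weighted average of client parameters. The sketch below is a minimal illustration under that assumption; the function name `fedavg`, the flat-list parameter format, and the client values are hypothetical and not taken from the paper.

```python
def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of client parameter vectors (FedAvg-style)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Three hypothetical clients with different local dataset sizes.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [100, 100, 200]
global_weights = fedavg(clients, sizes)
```

A comparison study would swap this aggregation step for the ring-style sequential update or a FedProx-style objective while holding data partitions fixed.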
- Causal K-means clustering (2026) · doi
The causal K-means clustering method requires separate independent samples for constructing the nuisance parameter estimator μ̂ to satisfy Assumption A2; the paper does not explore whether the sample-splitting requirement can be relaxed or how performance degrades when a single sample is used for both clustering and nuisance estimation.
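The sample-splitting requirement in this entry can be pictured as follows: one half of the data fits the nuisance estimator, and only the other half is clustered. This is a rough sketch, assuming a per-arm mean as a stand-in for μ̂ (the paper's estimator is not specified here); `split_sample`, `fit_group_means`, and the toy data are all hypothetical.

```python
import random

def split_sample(n, seed=0):
    """Randomly split indices 0..n-1 into two disjoint halves."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    half = n // 2
    return idx[:half], idx[half:]

def fit_group_means(outcomes, treatments, idx):
    """Stand-in nuisance estimator: mean outcome per treatment arm on `idx`."""
    sums, counts = {}, {}
    for i in idx:
        a = treatments[i]
        sums[a] = sums.get(a, 0.0) + outcomes[i]
        counts[a] = counts.get(a, 0) + 1
    return {a: sums[a] / counts[a] for a in sums}

# Hypothetical toy data: the outcome depends on the treatment arm.
rng = random.Random(1)
treatments = [rng.randrange(2) for _ in range(200)]
outcomes = [2.0 * a + rng.gauss(0.0, 0.1) for a in treatments]

nuisance_idx, cluster_idx = split_sample(len(outcomes))
mu_hat = fit_group_means(outcomes, treatments, nuisance_idx)
# Only the units in `cluster_idx` would be passed to the clustering step,
# using mu_hat fitted on the held-out half.
```

The single-sample variant raised in the gap would correspond to fitting and clustering on the same indices, which is what a degradation study would compare against.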
- Hybrid Deep Model for Pain Intensity Classification Using Fused ECG, EMG, and GSR Signals (2026) · doi
The study demonstrates that signal sequence ordering (GSR–ECG–EMG vs. ECG–GSR–EMG) significantly impacts hybrid BiLSTM-MHAT-CNN model performance, but lacks systematic investigation of all possible permutations and theoretical justification for optimal ordering. Future work should enumerate and evaluate all six permutations of the three physiological signals to establish evidence-based signal sequencing guidelines for multimodal pain intensity classification.
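Enumerating the six orderings mentioned in this entry is mechanical. The sketch below assumes a caller-supplied scoring function; `evaluate_all_orderings` and `dummy_evaluate` are hypothetical names, and the placeholder scorer stands in for training and validating the model on each ordering.

```python
from itertools import permutations

SIGNALS = ("ECG", "EMG", "GSR")

def evaluate_all_orderings(evaluate):
    """Score every ordering of the three signals with a caller-supplied function."""
    return {order: evaluate(order) for order in permutations(SIGNALS)}

# Placeholder scorer so the sketch runs; a real study would train and
# validate the model on inputs fused in the given order.
def dummy_evaluate(order):
    return 0.0

results = evaluate_all_orderings(dummy_evaluate)
```

The resulting dictionary is keyed by ordering tuples such as `("GSR", "ECG", "EMG")`, so both orderings compared in the paper appear alongside the four untested ones.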
- Turbulence closure in Reynolds-averaged Navier–Stokes and flow inference around a cylinder using physics-informed neural networks and sparse experimental data (2026) · doi
The wall-damping function for the Reynolds-force neural network is currently defined using a cubic polynomial with a fixed threshold d* = 0.1D; the sensitivity of turbulence closure predictions to alternative damping function formulations and threshold values across the Reynolds number range Re ≈ 300−300,000 has not been systematically investigated.
- Risk, Data, Alignment: Making Credit Scoring Work in Kenya (2026) · doi
The paper shows that data science teams reduce hundreds or thousands of engineered features to 10-20 final features using correlation analysis, but the authors note that proprietary and secret feature selection techniques emerge from non-scalable manual checks where correlation assumptions fail. There is no systematic methodology documented for replicating or validating these manual corrections across different fintech startups or geographic contexts in Kenya.
- An Iterative GLMM–XGBoost Algorithm with Group-Aware Conditional Permutation Importance for Explaining Multilevel Item Response Data (2026) · doi
The permutation importance analysis (Panel C, Table 2) demonstrates that standalone XGBoost's rank correlation with true feature importance degrades sharply as ICC increases (from 0.798 to 0.472 at large sample size), but the paper does not investigate the specific mechanisms causing this degradation or propose modifications to XGBoost's importance calculation for clustered data structures.
- FPGA-Enabled Machine Learning Applications in Earth Observation: A Systematic Review (2026) · doi
Vision Transformers (ViTs) and recurrent neural networks (RNNs) lack mature FPGA implementations despite their superior performance in remote sensing tasks like temporal change monitoring and flight-path analysis. Current FPGA toolchains (FINN, Vitis AI) do not support transformer and recurrent architectures, including their quantized and pruned variants, creating a barrier to deploying these modern architectures on edge platforms.
- Blockchain-integrated machine learning framework for transparent smart contract vulnerability detection (2026) · doi
XGBoost and LightGBM showed lower recall (~0.82) compared to Random Forest and CatBoost (~0.77-0.88) on smart contract vulnerability detection due to overfitting on dominant features and sensitivity to hyperparameter tuning. The specific hyperparameter configurations that minimize this performance gap for gradient boosting methods on blockchain vulnerability datasets have not been systematically explored.
- A novel deep learning approach for accurate and efficient design of LNOI power splitters (2026) · doi
The tolerance analysis for LNOI power splitter designs only evaluated a worst-case scenario with simultaneous +0.05 μm perturbations across all six geometric parameters; independent statistical analysis of individual parameter deviations and their combined effects on output power ratios across asymmetric tolerance distributions has not been performed.
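One way to picture the missing analysis is to contrast the single worst-case point with independent random draws and one-at-a-time shifts. The sketch below is only illustrative: `toy_ratio` and its sensitivities are made-up placeholders, not a physical model of an LNOI splitter, and the uniform tolerance distribution is an assumption.

```python
import random
import statistics

TOL_UM = 0.05   # tolerance magnitude taken from the entry above
N_PARAMS = 6    # six geometric parameters

def toy_ratio(deltas):
    """Placeholder response: NOT a physical model of an LNOI splitter."""
    sensitivities = [0.8, -0.5, 0.3, 0.2, -0.1, 0.05]  # made-up values
    return 0.5 + sum(s * d for s, d in zip(sensitivities, deltas))

def monte_carlo(n_draws=2000, seed=0):
    """Mean and standard deviation of the ratio under independent uniform draws."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_draws):
        deltas = [rng.uniform(-TOL_UM, TOL_UM) for _ in range(N_PARAMS)]
        ratios.append(toy_ratio(deltas))
    return statistics.mean(ratios), statistics.stdev(ratios)

def one_at_a_time():
    """Shift each parameter alone by +/-TOL_UM and record the ratio change."""
    nominal = toy_ratio([0.0] * N_PARAMS)
    effects = []
    for i in range(N_PARAMS):
        for sign in (+1, -1):
            deltas = [0.0] * N_PARAMS
            deltas[i] = sign * TOL_UM
            effects.append((i, sign, toy_ratio(deltas) - nominal))
    return effects

mean_ratio, std_ratio = monte_carlo()
worst_case = toy_ratio([TOL_UM] * N_PARAMS)
oat = one_at_a_time()
```

In practice the placeholder would be replaced by the trained surrogate or a full simulation, and asymmetric tolerances would use a different sampling distribution per parameter.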
- Can deep learning-based segmentation and classification improve the detection of renal cortical abnormalities? (2026) · doi
The paper does not investigate how CLAHE preprocessing parameters (clip limit, tile grid size) were optimized or whether these parameters are transferable across different DMSA imaging equipment and acquisition protocols. Future work should systematically evaluate CLAHE preprocessing robustness for DenseNet205 classification across DMSA systems with varying imaging parameters and reconstruction algorithms.
- Integration of biomedical imaging and sensing technologies with AI-IoT for Materiovigilance and predictive modeling of stillbirth risk in maternal-fetal health monitoring (2026) · doi
AI prediction models using ensemble methods (XGBoost, gradient boosting, random forest) demonstrate variable sensitivity and specificity across studies, with one XGBoost model showing only 25% sensitivity. Feature selection strategies and model optimization techniques specific to stillbirth prediction datasets remain inadequately developed.
- Proteogenomic decoding of chemotherapy resistance in patients with triple-negative breast cancer (2026) · doi
Logistic regression was used to develop a predictive model for non-pathological complete response (non-pCR) using selected proteogenomic biomarkers, but the paper does not report cross-validation performance metrics, feature importance rankings, or comparison with machine learning approaches (random forest, gradient boosting, neural networks) that might improve predictive accuracy for chemotherapy resistance in TNBC.
- Umjetna inteligencija: od kritičkih promišljanja do njezine primjene u kaznenom pravosuđu [Artificial intelligence: from critical reflections to its application in criminal justice] (2026) · doi
The Italian Delia algorithm processes over 1.5 million variables combining geographic crime data with witness interviews and CCTV footage, but no comparative analysis exists documenting which specific variables contribute most to prediction accuracy or how the system performs across different geographic regions and crime categories.
- Prediction of sedimentation concentration profiles in inclined suspension systems: A data-driven neural network framework (2026) · doi
The study employed separate neural networks trained for each inclination angle to preserve regime-specific sedimentation characteristics; development of a unified ANN architecture that integrates inclination angle as an input parameter across multiple angles in inclined suspension systems has not been explored.
- Ento-Linguistics: Language, Ambiguity, and Scientific Communication in Entomology: How Terminology Networks Shape Understanding of Insect Biology (And Vice-Versa) (2026) · doi
The temporal network evolution Δ𝐺(t) framework is defined mathematically but no diachronic analysis is presented showing how entomological terminology networks have evolved over specific time periods (e.g., pre-1990 versus post-2010) or how conceptual mapping strength changes following major taxonomic revisions.
- Large language model based machine translation for universal multilingual understanding and translation quality enhancement (2026) · doi
The evaluation of LLM-based machine translation systems relies on varying preprocessing pipelines, tokenization schemes, and evaluation protocols across studies, making it impossible to establish directly comparable quantitative benchmarks. A standardized evaluation framework needs to be developed that harmonizes these methodological differences for consistent cross-model comparison in large language model machine translation.
- Environmental, social, and governance sustainability: an AI-centric approach driving data standardization and automation (2026) · doi
The study demonstrates model performance at the 10th training epoch but does not investigate whether continued training beyond 10 epochs improves the social model's accuracy or establish convergence criteria and early stopping mechanisms for the environmental, social, and governance category models.
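A patience-based rule is one common form an early stopping mechanism could take. The sketch below assumes validation loss is tracked per epoch; `early_stopping_epoch`, its parameters, and the loss curve are hypothetical and not drawn from the paper.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None

# Hypothetical validation-loss curve, not taken from the paper.
losses = [0.90, 0.70, 0.60, 0.55, 0.56, 0.57, 0.58, 0.50]
stop = early_stopping_epoch(losses, patience=3)
```

With this curve the rule halts before the late improvement at epoch 8, which shows why the choice of patience itself needs to be examined when training is extended beyond a fixed epoch count.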
- Deep learning-based multimodal prediction of chronic kidney disease stage (2026) · doi
The improved BP neural network model achieves 93.2% accuracy and BERT achieves 95.1% accuracy on their own, but the paper does not provide an ablation study showing which specific architectural improvements to the BP network or which BERT fine-tuning strategies contributed most to individual model performance before multimodal fusion.
- Leveraging machine learning to enhance aerosol classification using Single-Particle Mass Spectrometry (2026) · doi
The study identifies feldspar discrimination as a key challenge due to chemical overlap between Na-feldspar, K-feldspar, feldspar cSA, and feldspar cSOA subtypes. Advanced feature engineering or architectural modifications to single-particle mass spectrometry models must be developed to capture subtle spectral differences that distinguish these compositionally similar mineral species.
- High-Dimensional Perception with the Double Machine Learning Lens Model (2026) · doi
The paper identifies significant performance variability across different language embedding models (miniLM achieving ~80% of NV-Embed's accuracy) but does not systematically investigate how embedding architectures, parameter counts, training objectives, and latent attention mechanisms influence their capacity to identify psychologically relevant constructs in high-dimensional perception tasks.
- Quantum Information Framework for Neural Network Generalization: A Comprehensive Experimental Analysis (2026) · doi
The quantum analyzer computes density matrices from activation patterns using a single method (likely outer product construction); alternative density matrix formulations and their impact on von Neumann entropy and purity measurements for neural network activations remain unexplored, creating ambiguity about metric robustness.
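To see why the construction matters, the sketch below compares purity, Tr(ρ²), under two outer-product variants: one built from raw activation vectors and one built after scaling each vector to unit norm. Both variants, the function names, and the toy activations are assumptions for illustration; the paper's actual construction is not confirmed here.

```python
import math

def outer_sum(vectors):
    """Sum of outer products v v^T over all vectors."""
    d = len(vectors[0])
    m = [[0.0] * d for _ in range(d)]
    for v in vectors:
        for i in range(d):
            for j in range(d):
                m[i][j] += v[i] * v[j]
    return m

def trace_normalize(m):
    """Divide by the trace so the result has unit trace."""
    tr = sum(m[i][i] for i in range(len(m)))
    return [[x / tr for x in row] for row in m]

def rho_raw(acts):
    """Trace-normalized sum of outer products of raw activation vectors."""
    return trace_normalize(outer_sum(acts))

def rho_unit(acts):
    """Same construction after scaling each activation vector to unit norm."""
    unit = [[x / math.sqrt(sum(y * y for y in v)) for x in v] for v in acts]
    return trace_normalize(outer_sum(unit))

def purity(rho):
    """Tr(rho^2); equals the sum of squared entries for a symmetric matrix."""
    return sum(x * x for row in rho for x in row)

# Hypothetical activations: two orthogonal directions with unequal norms.
acts = [[3.0, 0.0], [0.0, 1.0]]
p_raw, p_unit = purity(rho_raw(acts)), purity(rho_unit(acts))
```

The same activations give different purity values under the two constructions, so the metric depends on the formulation. Von Neumann entropy would additionally need the eigenvalues of ρ, which is omitted here to keep the sketch dependency-free.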
- Federated learning for fair autism spectrum disorder screening across age-heterogeneous populations (2026) · doi
The study demonstrates performance saturation when advancing from personalized FedAvg to more complex personalization strategies (Advanced personalized FL), with diminishing returns (AUC improvement of +0.004, p=0.62). The optimal balance between personalization complexity and performance gains in federated learning for autism spectrum disorder screening across age-heterogeneous populations requires investigation to guide practical deployment decisions.
- Large Language Models for Combinatorial Optimization: A Systematic Review (2026) · doi
The integration of Large Language Models with automated planning and scheduling (APS) systems requires investigation of how LLMs can effectively contribute to constraint specification, plan generation, and solution validation within APS frameworks, as noted in reference [166] on prospects of incorporating LLMs in APS.
- Remote sensing image enhancement and water eutrophication prediction based on atmospheric-water multimodal information fusion (2026) · doi
The temporal window for atmospheric feature fusion is set to 30 days (Mean-30 and LinearTrend-30 baselines); the sensitivity of the cross-attention mechanism to different temporal window lengths (e.g., 7, 14, 60 days) for atmospheric-water multimodal fusion and its effect on long-term eutrophication prediction accuracy has not been systematically investigated.
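A window-length sweep could start from features like the following, assuming Mean-30 and LinearTrend-30 denote the mean and least-squares slope over the trailing 30 days (an interpretation of the baseline names, not confirmed by the source). `window_features` and the synthetic series are hypothetical.

```python
def window_features(series, window):
    """Mean and least-squares slope over the last `window` daily values."""
    recent = series[-window:]
    n = len(recent)
    mean_y = sum(recent) / n
    mean_t = (n - 1) / 2
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(recent))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var if var else 0.0
    return mean_y, slope

# Hypothetical daily atmospheric series: a steady upward ramp.
series = [0.5 * day for day in range(90)]
features = {w: window_features(series, w) for w in (7, 14, 30, 60)}
```

Each window length yields its own feature pair, which could then be fed to the fusion model to measure how prediction accuracy varies with the window.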
- Artificial intelligence in acute and critical care: current challenges and strategic solutions (2026) · doi
Fine-tuning combined with retrieval-augmented generation for error-detection and self-correction capabilities in AI systems for acute and critical care requires systematic evaluation and validation. The paper identifies this as a needed approach but does not specify implementation protocols, benchmark datasets, or performance metrics for testing these combined techniques in complex clinical scenarios.
- Explainable machine learning for tracking spatial variation in leaf chlorophyll fluorescence within temperate deciduous forest canopies (2026) · doi
While SHAP was employed to improve interpretability, the mechanistic understanding of how spectral reflectance drives chlorophyll fluorescence predictions remains limited. Future work should integrate the Random Forest model with process-based models (e.g., radiative transfer models) following the framework of Wolanin et al. (2019) to bridge predictive accuracy with photosynthetic mechanism understanding.
- Low-sample supervised fault diagnosis for fixed-wing UAVs based on multi-scale adaptive state-aware sequence learning (2026) · doi
The model exhibits systematic confusion between mild faults (d2 = 0.4 to 0.9) and normal operational states in multi-classification tasks, with F1-scores as low as 0.4538 for d2 = 0.4 on July 21 flight data. Future improvements should specifically implement contrastive learning or fine-grained feature extraction techniques to amplify interclass differences and resolve misclassifications between normal and mild fault states in UAV fault diagnosis.
- Research on a strongly generalizable fault diagnosis method based on adversarial transfer learning (2026) · doi
Conditions 12 and 13 in the target reactor exhibit highly similar multivariate trend characteristics across parameters 1, 2, and 3, making them difficult to differentiate. The HDAL adversarial transfer learning model achieved only 92.556% accuracy on these conditions due to data quality issues, requiring investigation into feature engineering or data preprocessing techniques specifically designed to enhance discriminability between similar fault conditions in cross-reactor transfer diagnosis.
- Bayesian optimization for uncertainty-aware prediction of rainfall-induced deformation in embankment dams (2026) · doi
The multiplicative uncertainty-calibrated score (UCS) formulation achieves near-nominal coverage but yields comparatively inferior final UCS values relative to weighted formulations (e.g., 0.7EC + 0.3CRPS). Future work should investigate why this trade-off exists and whether hybrid or adaptive UCS formulations can simultaneously optimize convergence speed and asymptotic uncertainty sharpness in Bayesian optimization of prediction intervals for dam deformation.
- Simulation-based inference captures non-Markovian effects as exemplified in protein production kinetics through cell division (2026) · doi
The interplay between Gillespie algorithm simulation accuracy for non-Markovian processes and SBI inference fidelity has not been explicitly characterized. Specifically, how simulation discretization errors and rejection sampling in non-Markovian Gillespie variants (Masuda & Rocha, 2018) propagate through neural network-based inference requires systematic investigation.
Science AI Journal reviews manuscripts in under 15 minutes with 8 specialised AI reviewers calibrated on 23,000+ real peer reviews. Open access, CC BY 4.0.