Computer Science · 115 papers

Theory gaps in Computer Science

132 open theory research questions in Computer Science: gaps in the underlying theory, mechanisms, or explanations, extracted from 115 papers in our local library. Below are representative open questions, each linked to the paper that raised it.

Representative open questions

Showing 30 of 132 — one per source paper, highest-quality first.

  • Causal K-means clustering (2026) · doi

    The analysis assumes bounded support (∥μ∥∞ ≤ B < ∞ a.s.) for the high-probability bounds in Theorem 3.1, but real causal data with heavy-tailed covariate distributions may violate this; the paper does not extend results to unbounded or heavy-tailed settings.

  • AN ITERATIVE GLMM–XGBOOST ALGORITHM WITH GROUP-AWARE CONDITIONAL PERMUTATION IMPORTANCE FOR EXPLAINING MULTILEVEL ITEM RESPONSE DATA (2026) · doi

    The paper introduces group-aware conditional permutation importance for explaining GLMM–XGBoost predictions but does not provide guidance on interpreting interaction effects identified through this approach when predictors operate at different hierarchical levels (person-, cluster-, item-level), particularly in cross-level interactions.

  • Numerical Method for Nonlinear Kolmogorov PDEs via Sensitivity Analysis (2026) · doi

    The paper extends Proposition 5.8 from [16] by relaxing the volatility-only setting (b = 0) to include drift and strengthening convergence from weak convergence to τp-topology for polynomially growing test functions. However, the specific convergence rates and stability properties under joint drift-volatility perturbations in the τp-topology have not been quantitatively characterized, particularly for uncertainty sets with ε approaching λmin(σo).

  • Ento-Linguistics: Language, Ambiguity, and Scientific Communication in Entomology: How Terminology Networks Shape Understanding of Insect Biology (And Vice-Versa) (2026) · doi

    Cross-domain concept mapping (Conceptual Mapping module) identifies relationships between the six ento-linguistic domains but does not quantify how terminology ambiguity in one domain (e.g., 'caste' in social organization) propagates and constrains interpretation in another domain (e.g., 'caste' in evolutionary genetics).

  • Large language model based machine translation for universal multilingual understanding and translation quality enhancement (2026) · doi

    The comparative analysis shows GPT-4 outperforms other models consistently across language pairs (En-De, En-Cs, En-Zh, En-Ru, De-En, Cs-En, Zh-En, Ru-En), but does not investigate the specific linguistic phenomena or grammatical structures where GPT-4 advantages emerge, particularly for morphologically complex languages like Russian and Czech.

  • High-Dimensional Perception with the Double Machine Learning Lens Model (2026) · doi

    The paper confirms that language model embeddings outperform LIWC but does not systematically investigate the specific linguistic features (word co-occurrence patterns, contextual semantics, misspellings, grammatical errors) that embeddings capture better than dictionary-based methods when applied to perception prediction tasks in psychological datasets.

  • Quantum Information Framework for Neural Network Generalization: A Comprehensive Experimental Analysis (2026) · doi

    The framework evaluates generalization on test sets but does not investigate the relationship between quantum information metrics (von Neumann entropy, effective rank) and overfitting detection, or whether these metrics can predict generalization gaps before evaluation on held-out test data.

  • Federated learning for fair autism spectrum disorder screening across age-heterogeneous populations (2026) · doi

    Cross-age generalization evaluation revealed accuracy drops when transferring models between children and adult populations (Figure 8a), but the underlying mechanisms causing this performance degradation in federated autism screening models remain unexplored. Specifically, whether degradation stems from feature distribution shifts, age-specific behavioral patterns in ASD presentation, or model architecture limitations needs systematic investigation.

  • Low-sample supervised fault diagnosis for fixed-wing UAVs based on multi-scale adaptive state-aware sequence learning (2026) · doi

    The ablation study demonstrates that MHSA module importance varies dramatically with data scarcity (6.56% performance drop with 90 samples on binary classification, versus 30.63% drop on multi-classification with 90 samples), but the interaction mechanisms between MHSA, Mamba, and MSTFE components across different sample sizes and fault severity levels remain unexplored for fixed-wing UAV fault detection.

  • Comparative analysis of deep learning algorithms for rolling element bearing fault classification under variable loads and speeds (2026) · doi

    Vision Transformer (ViT) demonstrated unpredictable and non-monotonic performance degradation across noise levels (0.830 baseline, 0.690 at 5 dB, 0.744 at 3 dB, 0.733 at 1 dB SNR) compared to CNN-based architectures, with significant accuracy fluctuation at intermediate SNR conditions. Research is needed to identify why transformer-based architectures exhibit this erratic behavior in rolling element bearing fault detection under noisy conditions and how to stabilize their performance.

  • El marco regulatorio europeo de la inteligencia artificial y su impacto en el sistema judicial español. (2026) · doi

    The paper identifies that Spain's modernization phase has not yet addressed fundamental organizational and substantive challenges in judicial administration through AI integration. A critical gap exists in defining which categories of simple mechanical judicial acts and routine procedural tasks should be automated versus which require human judicial interpretation, requiring empirical analysis of judicial workflows across different jurisdictional orders.

  • Probabilistic analysis of scalogram ridges in signal processing (2026) · doi

    The proof of upper hemicontinuity for the set-valued function sY(t; ω) relies on Berge's maximum theorem applied to a compact interval [0, S(ω)], but the paper does not establish conditions under which this compactness assumption holds for arbitrary noisy signals or provide explicit bounds on S(ω) in terms of signal characteristics, noise covariance, or wavelet properties.

  • LLM-Powered Silent Bug Fuzzing in Deep Learning Libraries via Versatile and Controlled Bug Transfer (2026) · doi

    The failure case analysis reveals that TransFuzz cannot distinguish compiler-level optimization artifacts from genuine functional bugs in gradient computation, as demonstrated by the torch.compile flex_attention case where numerical differences from compiler optimizations were misclassified. Developing domain-specific oracles that incorporate expert-level reasoning about compiler internals and floating-point precision behavior is needed for silent bug fuzzing in compiled deep learning operations.

  • Quantum-SpinalNet: a hybrid deep learning approach for mammographic breast cancer detection (2026) · doi

    The paper demonstrates that integrating biologically-inspired SpinalNet with quantum-inspired DQNN achieved synergistic performance improvements (93.8% accuracy, Dice = 0.89), but does not experimentally isolate whether these gains derive from reduced overfitting via layer-wise modular processing or from improved feature abstraction through quantum probabilistic reasoning.

  • Automated design of heuristics for resource-constrained project scheduling problem via regression algorithms (2026) · doi

    The regression-based heuristics demonstrated superior generalization from small to large project instances, but the mechanisms enabling this cross-scale knowledge transfer in resource-constrained project scheduling remain unexplored. Future work should investigate which regression algorithm features (feature engineering, model architecture, training data characteristics) are responsible for the generalization advantage over genetic programming approaches across Multi-projects, Large-projects/P, and Large-projects/SP datasets.

  • From unstructured text to structured reasoning: a hybrid knowledge graph for Indonesian sentencing analysis (2026) · doi

    While the paper demonstrates that objective entities (F1 > 90%) correlate with standardized formats and interpretive entities (F1 < 80%) with legal reasoning variation, it does not propose or validate methods to explicitly model this epistemological distinction in the hybrid knowledge graph structure for improved entity disambiguation.

  • Phishing in the age of distributed intelligence: taxonomies, detection strategies, and the emerging role of federated learning (2026) · doi

    Federated learning approaches for phishing detection introduce accuracy-computational cost trade-offs when privacy-preserving techniques (differential privacy, homomorphic encryption) are applied, but the quantitative relationship between privacy budgets and detection performance degradation in federated phishing detection systems remains inadequately characterized.

  • On the interface between linguistics, computer science and psychiatry: analyzing textual key-factors affecting BERT-based classification of schizophrenia in social media texts (2026) · doi

    Topicality effects on semantic coherence salience in schizophrenia detection remain theoretical; the paper cannot yet determine whether BERT captures deeper grammatical or coherence disruption patterns related to topic versus simply capturing differences in linguistic information density and discourse register across genres. Cross-topic coherence analysis with controlled semantic and syntactic complexity is required.

  • Predicting Employee Attrition: A Machine Learning Approach in Human Resource Analytics (2026) · doi

    The feature importance analysis reveals divergent rankings between Gradient Boosting and Random Forest models for secondary attrition predictors, with Gradient Boosting prioritizing workload variables (Overtime, Stock Option Level) while Random Forest emphasizes demographic factors (Distance from Home, Years at Company). The paper does not investigate whether these differences stem from model architecture bias or represent genuine contextual variations in attrition mechanisms across employee subgroups.

  • Cluster Pattern Analysis of Students Stress using Machine Learning Algorithms with Feature Engineering (2026) · doi

    The study notes that each educational institution's dataset will be distinctive for counseling purposes, but does not specify which stressor factors (depression, career concerns, blood pressure, self-esteem, mental health history, living conditions, study load) are institution-specific and which are universally predictive of stress clusters. Domain adaptation techniques for transferring stress prediction models across institutional contexts remain to be explored.

  • Latency-efficient edge intelligence in IoT networks using knowledge distillation (2026) · doi

    The model accuracy improvement formula (Equation 15) assumes logarithmic convergence behavior, but the paper provides no theoretical justification for this functional form or empirical validation of its applicability across different knowledge distillation architectures, teacher-student model size ratios, or diverse IoT sensor modalities.

  • Interpretable machine learning-based modelling of minimum miscibility pressure in hydrocarbon gas injection processes (2026) · doi

    While the paper identifies MW_C5+ and Tc,ave_gas as the most influential variables through sensitivity analysis and SHAP analysis, the mechanistic relationship between these molecular properties and MMP behavior in the context of mass transfer and phase equilibrium during gas injection is not theoretically explained or validated experimentally.

  • An Ontology Driven Machine Learning Framework for Early Prediction in Children with Cerebral Palsy (2026) · doi

    The paper identifies that KNN and Naive Bayes performed inconsistently with particularly poor discrimination for Level 2 GMFCS cases (Naive Bayes AUC=0.50), but provides no investigation into whether ontological semantic enrichment fails to generate discriminative features for intermediate severity levels or whether these algorithms fundamentally cannot capture the encoded clinical relationships.

  • Understanding the Dynamics of Trust and Engagement in E-Commerce Recommender Systems: Trends and Influences (2026) · doi

    Current theoretical frameworks for trust and engagement in e-commerce recommender systems are predominantly based on research from China and Western markets, with African informal and community-driven recommendation networks critically underrepresented. Future work must empirically investigate how trust conceptualizations differ across African, South Asian, and Latin American e-commerce contexts to validate whether existing engagement models capture culturally diverse user interactions with recommender systems.

  • A Review on Daylighting Prediction by Using Artificial Neural Network Techniques (2026) · doi

    Current methods for determining optimal numbers of hidden layers in ANN daylighting models rely on time-consuming trial-and-error approaches; the balance between prediction accuracy and computational complexity when using two or more hidden layers versus single hidden layer architectures needs systematic investigation.

  • Misspellings in natural language processing: A survey of recent literature (2026) · doi

    Few studies explicitly investigate the interaction between misspelling robustness and pre-trained language model size or architecture type (e.g., BERT vs. RoBERTa vs. GPT variants); the relationship between model capacity and resilience to spelling perturbations requires systematic investigation.

  • Deep Learning Based Fish Species and Freshness Detection Using Convolutional Neural Networks (2026) · doi

    The three freshness categories (Fresh, Medium, Spoiled) used for CNN classification lack correlation with objective freshness metrics such as bacterial load, pH levels, or volatile organic compound concentrations; validation against biochemical freshness indicators is absent.

  • Women and medicine (2010) · doi

    The paper notes that women are 'under-represented in academic medicine', which leads to under-representation in medical leadership, but it neither establishes a quantitative threshold nor provides a pipeline analysis identifying the career stage at which women disproportionately exit academic medicine pathways.

  • Workforce Analytics for Manufacturing: Predicting Employee Job Satisfaction via Explainable Machine Learning and SHAP (2026) · doi

    The finding that demographic factors (Age, Experience, Type of Job) show minimal SHAP relevance (values near zero) contradicts established HR literature, yet the paper does not investigate why these traditionally important variables failed to influence the job satisfaction model or whether this reflects dataset composition bias, feature scaling issues, or genuine low predictive power in this manufacturing context.

  • Unified URL and QR Based Phishing Detection Framework (2026) · doi

    While ethical considerations address data protection and transparency, the paper does not evaluate adversarial robustness: how the model performs against evasion attacks (e.g., homograph attacks, obfuscated URLs, dynamically generated QR codes with steganographic encoding).
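
One of the questions above asks whether quantum information metrics (von Neumann entropy, effective rank) could predict generalization gaps before held-out evaluation. The listing does not give that paper's definitions, so the following is only a minimal sketch under stated assumptions: the density matrix is taken to be the trace-normalized Gram matrix of a layer's feature matrix, and effective rank is taken to be the exponential of the Shannon entropy of the normalized singular values. The function names and the `1e-12` cutoff are illustrative choices, not taken from any of the listed papers.

```python
import numpy as np


def von_neumann_entropy(features: np.ndarray) -> float:
    """Entropy -Tr(rho log rho) of an assumed density matrix.

    Assumption: rho is the Gram matrix features.T @ features,
    normalized to unit trace (rows = samples, columns = features).
    """
    gram = features.T @ features
    rho = gram / np.trace(gram)
    eigvals = np.linalg.eigvalsh(rho)      # rho is symmetric PSD
    eigvals = eigvals[eigvals > 1e-12]     # drop numerical zeros
    return float(-np.sum(eigvals * np.log(eigvals)))


def effective_rank(features: np.ndarray) -> float:
    """exp(Shannon entropy) of the normalized singular values.

    Assumption: this is the entropy-based effective rank; the
    listed paper may use a different definition.
    """
    singular_values = np.linalg.svd(features, compute_uv=False)
    p = singular_values / singular_values.sum()
    p = p[p > 1e-12]
    return float(np.exp(-np.sum(p * np.log(p))))


# Two extreme cases: a full-rank isotropic matrix and a rank-one matrix.
isotropic = np.eye(4)
rank_one = np.outer(np.arange(1.0, 5.0), np.ones(4))
```

Under these assumptions the isotropic case gives the maximum entropy log(4) and an effective rank of 4, while the rank-one case gives an entropy near 0 and an effective rank near 1. Whether either quantity tracks the train-test gap is exactly the open question raised above.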

