General Information

ISSN: 1793-8201 (Print), 2972-4511 (Online)
Abbreviated Title: Int. J. Comput. Theory Eng.
Frequency: Quarterly
DOI: 10.7763/IJCTE
Editor-in-Chief: Prof. Mehmet Sahinoglu
Associate Editor-in-Chief: Assoc. Prof. Alberto Arteta, Assoc. Prof. Engin Maşazade
Managing Editor: Ms. Cecilia Xie
Abstracting/Indexing: Scopus (Since 2022), INSPEC (IET), CNKI, Google Scholar, EBSCO, etc.
Average Days from Submission to Acceptance: 192 days
APC: 800 USD
E-mail: editor@ijcte.org
Journal Metrics: CiteScore 2023: 3.1 (53rd percentile)

IJCTE 2025 Vol.17(1): 28-35
DOI: 10.7763/IJCTE.2025.V17.1366

Enhancing Tag Recommendation Precision on Stack Overflow Data Warehouse: An Integrated Approach Combining Numeric Attributes, Feature Extraction Techniques, and Multiple Machine Learning Algorithms

Seyede Sanaz Jedari Jafari* and Yousef Emdadi

School of Computer Science, University of Windsor, Windsor, Canada
Email: jedarij@uwindsor.ca (S.S.J.J.); yousef.emdadi@gmail.com (Y.E.)
*Corresponding author

Manuscript received September 11, 2024; revised October 18, 2024; accepted December 4, 2024; published March 6, 2025

Abstract—Building upon our previous work on extracting and analyzing Stack Overflow data to uncover trends in programming languages, community contributions, and talent availability, this research investigates the impact of numeric attributes on tag recommendation. Utilizing the Stack Overflow Data Warehouse System developed in our prior study, we conduct a comprehensive analysis of multiple Machine Learning (ML) algorithms to evaluate their effectiveness in recommending tags based on an integration of specific numeric attributes with feature extraction techniques. The methodology involves extracting relevant data, preprocessing it, and applying Term Frequency-Inverse Document Frequency (TF-IDF) as a feature extraction technique alongside diverse ML algorithms, including Support Vector Machines (SVM), Gradient Boosting, Random Forest, and Decision Tree, to assess their performance. Our results indicate that this combination improves evaluation metrics, including F1 Score, Recall, and Precision, with a particularly significant influence on the Precision of tag recommendations, providing insights into the optimization of tagging systems on Q&A platforms. Future research will focus on integrating advanced models and refining data preprocessing techniques to further enhance tag prediction accuracy. This study extends the application of the Stack Overflow Data Warehouse System and contributes to the improvement of tag recommendation mechanisms in online technical communities.
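The feature construction and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which numeric attributes were used or how they were scaled, so the attribute columns (a question score and a view count), the min-max scaling, the smoothed IDF formula, and the micro-averaged metrics below are all assumptions chosen for illustration, and the sample questions and tags are made up. The sketch uses only the Python standard library; the classifiers named in the abstract (SVM, Gradient Boosting, Random Forest, Decision Tree) would be trained on the combined feature rows it produces.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumed variant), so common terms keep a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def min_max_scale(rows):
    """Scale each numeric column to [0, 1] so it is comparable to TF-IDF weights."""
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        scaled_cols.append([(v - lo) / span for v in col])
    return [list(r) for r in zip(*scaled_cols)]


def combine(text_vectors, numeric_rows):
    """Append the scaled numeric attributes to each TF-IDF vector."""
    return [t + n for t, n in zip(text_vectors, numeric_rows)]


def precision_recall_f1(true_tags, predicted_tags):
    """Micro-averaged precision, recall, and F1 over per-question tag sets."""
    tp = sum(len(t & p) for t, p in zip(true_tags, predicted_tags))
    n_pred = sum(len(p) for p in predicted_tags)
    n_true = sum(len(t) for t in true_tags)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical questions: tokenized text plus [score, view_count].
questions = [["python", "list", "sort"], ["java", "list"]]
numeric = [[10, 100], [0, 300]]

vocab, text_vecs = tfidf_vectors(questions)
features = combine(text_vecs, min_max_scale(numeric))
print(len(vocab), len(features[0]))  # vocabulary size, combined feature width

# Hypothetical ground-truth and predicted tag sets for the two questions.
print(precision_recall_f1([{"python", "sorting"}, {"java"}],
                          [{"python"}, {"java", "list"}]))
```

Scaling the numeric columns before concatenation is a design choice in this sketch: raw view counts can be orders of magnitude larger than TF-IDF weights and would otherwise dominate distance-based or margin-based classifiers such as SVM.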

Keywords—Stack Overflow Data Warehouse, tag recommendation, numeric attributes, TF-IDF, Machine Learning (ML), Support Vector Machines (SVM), gradient boosting

Cite: Seyede Sanaz Jedari Jafari and Yousef Emdadi, "Enhancing Tag Recommendation Precision on Stack Overflow Data Warehouse: An Integrated Approach Combining Numeric Attributes, Feature Extraction Techniques, and Multiple Machine Learning Algorithms," International Journal of Computer Theory and Engineering, vol. 17, no. 1, pp. 28-35, 2025.

Copyright © 2025 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
