Issuex

Issuex is a lightweight tool for extracting comprehensive information from Bugzilla repositories. It automates querying Bugzilla's REST API, which keeps data collection consistent and thorough across different Bugzilla instances. The repository provides a Command Line Interface (CLI) that makes extracting issue reports simple and easy to understand. It also supports the research community with a dataset containing the issue reports of key Eclipse and Mozilla projects.

Features

  • Automated Data Extraction: Automatically fetches detailed bug report information, comments, attachments, and historical changes.
  • Customizable Queries: Specify repository, classification, product, and component of interest. Filter issues based on status and resolution.
  • Ease of Use: Simple commands to run and configure the tool.
  • Error Handling: Handles API rate limits and retries failed requests up to 3 times to ensure data consistency.
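The retry behaviour described in the last item can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function name `fetch_with_retries`, the exponential back-off, and the reading of "up to 3 times" as three attempts in total are all assumptions.

```python
import time


def fetch_with_retries(fetch, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached.

    fetch is any zero-argument callable that raises an exception on failure
    (hypothetical helper; Issuex may structure this differently).
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception as error:  # a real tool would catch narrower errors
            last_error = error
            if attempt < max_attempts:
                # Wait longer after each failure: base_delay, then 2x, 4x, ...
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"request failed after {max_attempts} attempts") from last_error
```

Passing the request as a callable keeps the retry policy separate from the HTTP client, so the same wrapper could guard bug, comment, attachment, and history requests alike.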

Dataset description

The dataset generated by this tool includes:

  • Detailed information from Bugzilla repositories for Eclipse and Mozilla projects.
  • Historical data from the inception of Bugzilla usage up to November 2024.
  • Structured directories for each project, product, and component, with CSV files containing all bug details.

Specifically, the fields provided for each bug are: Issue URL, ID, Alias, Classification, Component, Product, Version, Platform, Op sys, Status, Resolution, Depends on, Dupe of, Blocks, Groups, Flags, Severity, Priority, Deadline, Target Milestone, Creator, Creator Detail, Creation time, Assigned to, Assigned to detail, CC, CC detail, Is CC accessible, Is confirmed, Is open, Is creator accessible, Summary, Description, URL, Whiteboard, Keywords, See also, Last change time, QA contact, History/Activity Log, Comments, Attachments.
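As a rough illustration of working with these CSV files, the sketch below counts bugs per value of one field. It assumes the CSV header uses the field names exactly as listed above (for example `Status`); the helper name `count_by_column` and the file path in the usage note are hypothetical.

```python
import csv
from collections import Counter


def count_by_column(csv_file, column="Status"):
    """Count rows per distinct value of one column in an open CSV file."""
    reader = csv.DictReader(csv_file)  # first row is treated as the header
    return Counter(row[column] for row in reader)
```

A call might look like `with open("path/to/component.csv", newline="", encoding="utf-8") as f: print(count_by_column(f))`. If very long Description or Comments cells trigger a field-size error, the limit can be raised with `csv.field_size_limit`.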

Our dataset contains the bugs reported for 9 popular products/components from Eclipse and for the Core component of Mozilla. The table below lists the selected products along with the number of reports obtained for each of them.

| Repository | Product / Component | Number of reports |
| --- | --- | --- |
| Eclipse | Platform | 122.497 |
| Eclipse | JDT | 63.266 |
| Eclipse | CDT | 23.371 |
| Eclipse | BIRT | 23.308 |
| Eclipse | PDE | 17.639 |
| Eclipse | Equinox | 14.559 |
| Eclipse | Mylyn | 13.906 |
| Eclipse | TPTP | 10.579 |
| Eclipse | Papyrus | 13.253 |
| Mozilla | Core | 522.355 |

Total number of issues: 823.733

The dataset can be found on Zenodo:

Installation

1. Clone this repository

    $ git clone https://github.com/lNoelia/Issuex
    $ cd Issuex

2. Create and activate a virtual environment

python -m venv venv
source venv/bin/activate

3. Create environment variables

cp example.env .env

After this step, edit the new .env file to specify the repository and project whose issues you want to extract.

4. Install dependencies

pip install -r requirements.txt

5. Install CLI

pip install -e ./

Using Issuex CLI

Run the tool

Run the tool and specify the status and resolution of the issues to be extracted.

issuex run

Optionally, pass --from-date "YYYY-MM-DD" to extract only the issues created from that date up to today.
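To make the filters concrete, here is a hedged sketch of how a status, resolution, and start date could translate into a Bugzilla REST search URL. It is not taken from the Issuex source: `build_search_url` and the example host are hypothetical, and mapping `--from-date` to the search parameter `creation_time` (which Bugzilla's `/rest/bug` search documents as matching bugs created at or after the given time) is an assumption about one plausible approach.

```python
from datetime import date
from urllib.parse import urlencode


def build_search_url(base_url, product, component=None, status=None,
                     resolution=None, from_date=None):
    """Build a Bugzilla REST bug-search URL from optional filters."""
    params = [("product", product)]
    if component:
        params.append(("component", component))
    if status:
        params.append(("status", status))
    if resolution:
        params.append(("resolution", resolution))
    if from_date:
        # Raises ValueError if the string is not a valid ISO date.
        date.fromisoformat(from_date)
        params.append(("creation_time", from_date))
    return f"{base_url.rstrip('/')}/rest/bug?{urlencode(params)}"


url = build_search_url("https://bugzilla.example.org", "Platform",
                       status="RESOLVED", from_date="2024-01-31")
print(url)
```

Validating the date before building the URL surfaces a malformed `--from-date` value locally instead of as a server-side error.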

Execute the tool with default settings

With the default settings, the tool automatically fetches all issues for the repository, classification, product, and component specified in the configuration file.

issuex run:default