Converting back to notebooks for documentation.
gvanhoy committed Jul 25, 2023
1 parent d5ff648 commit 495b949
Showing 3 changed files with 304 additions and 173 deletions.
303 changes: 303 additions & 0 deletions examples/00_example_sig53_dataset.ipynb
@@ -0,0 +1,303 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Example 00 - The Official Sig53 Dataset\n",
"This notebook walks through an example of how the official Sig53 dataset can be instantiated and analyzed.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Import Libraries\n",
"First, import all the necessary public libraries as well as a few classes from the `torchsig` toolkit."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from torchsig.utils.writer import DatasetCreator\n",
"from torchsig.utils.visualize import IQVisualizer, SpectrogramVisualizer\n",
"from torchsig.datasets.modulations import ModulationsDataset\n",
"from torchsig.datasets.sig53 import Sig53\n",
"from torchsig.utils.dataset import SignalDataset\n",
"from torchsig.datasets import conf\n",
"from torch.utils.data import DataLoader\n",
"from matplotlib import pyplot as plt\n",
"from tqdm import tqdm\n",
"import numpy as np\n",
"import os"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Instantiate Sig53 Dataset\n",
"To instantiate the Sig53 dataset, pass several parameters to the imported `Sig53` class. These parameters are:\n",
"- `root` ~ A string specifying the root directory in which to create and/or read an existing Sig53 dataset\n",
"- `train` ~ A boolean specifying whether to use the training set (True) or the validation set (False)\n",
"- `impaired` ~ A boolean specifying whether to use the impaired version (True) or the clean version (False)\n",
"- `transform` ~ Optionally, any data transforms to apply if the dataset will be used in an ML training pipeline\n",
"- `target_transform` ~ Optionally, any target transforms to apply if the dataset will be used in an ML training pipeline\n",
"\n",
"The combination of the `train` and `impaired` booleans determines which of the four distinct Sig53 datasets is instantiated:\n",
"- `train=True` & `impaired=False` = Clean training set of 1.06M examples\n",
"- `train=True` & `impaired=True` = Impaired training set of 5.3M examples\n",
"- `train=False` & `impaired=False` = Clean validation set of 106k examples\n",
"- `train=False` & `impaired=True` = Impaired validation set of 106k examples\n",
"\n",
"The last combination, the impaired validation set, is the dataset to use when reporting results on the official Sig53 dataset.\n",
"\n",
"Additional optional parameters of potential interest are:\n",
"- `regenerate` ~ A boolean specifying whether the dataset should be regenerated even if an existing dataset is detected (Default: False)\n",
"- `eb_no` ~ A boolean specifying whether the SNR is defined as Eb/No (True), which makes higher-order modulations more powerful, or as Es/No (False) (Default: False)\n",
"- `use_signal_data` ~ A boolean specifying whether the data and target information should be converted to `SignalData` objects as they are read in (Default: False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Specify script options"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"figure_dir = \"figures\"\n",
"if not os.path.isdir(figure_dir):\n",
"    os.mkdir(figure_dir)\n",
"\n",
"cfg = conf.Sig53CleanTrainQAConfig\n",
"# cfg = conf.Sig53CleanTrainConfig # uncomment to run for real\n",
"\n",
"ds = ModulationsDataset(\n",
"    level=cfg.level,\n",
"    num_samples=cfg.num_samples,\n",
"    num_iq_samples=cfg.num_iq_samples,\n",
"    use_class_idx=cfg.use_class_idx,\n",
"    include_snr=cfg.include_snr,\n",
"    eb_no=cfg.eb_no,\n",
")\n",
"\n",
"creator = DatasetCreator(ds, seed=12345678, path=\"sig53/sig53_clean_train\")\n",
"creator.create()\n",
"sig53 = Sig53(\"sig53\", train=True, impaired=False)\n",
"\n",
"# Retrieve a sample and print out information\n",
"idx = np.random.randint(len(sig53))\n",
"data, (label, snr) = sig53[idx]\n",
"print(\"Dataset length: {}\".format(len(sig53)))\n",
"print(\"Data shape: {}\".format(data.shape))\n",
"print(\"Label Index: {}\".format(label))\n",
"print(\"Label Class: {}\".format(Sig53.convert_idx_to_name(label)))\n",
"print(\"SNR: {}\".format(snr))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Plot Subset to Verify\n",
"The `IQVisualizer` and the `SpectrogramVisualizer` can each be passed a `DataLoader` and used to plot visualizations of the dataset. The `batch_size` of the `DataLoader` determines how many examples are plotted on each iteration over the visualizer. As an alternative to the `torchsig` visualizers shown below, the dataset itself can be indexed and plotted sequentially with any familiar Python plotting tools."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For plotting, omit the SNR values"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class DataWrapper(SignalDataset):\n",
"    def __init__(self, dataset):\n",
"        self.dataset = dataset\n",
"        super().__init__(dataset)\n",
"\n",
"    def __getitem__(self, idx):\n",
"        # Drop the SNR (z) so each item is just (data, class index)\n",
"        x, (y, z) = self.dataset[idx]\n",
"        return x, y\n",
"\n",
"    def __len__(self) -> int:\n",
"        return len(self.dataset)\n",
"\n",
"\n",
"plot_dataset = DataWrapper(sig53)\n",
"\n",
"data_loader = DataLoader(dataset=plot_dataset, batch_size=16, shuffle=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Transform the plotting titles from the class index to the name"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def target_idx_to_name(tensor: np.ndarray) -> list:\n",
"    # Map each class index in the batch to its modulation name\n",
"    batch_size = tensor.shape[0]\n",
"    label = []\n",
"    for idx in range(batch_size):\n",
"        label.append(Sig53.convert_idx_to_name(int(tensor[idx])))\n",
"    return label\n",
"\n",
"\n",
"visualizer = IQVisualizer(\n",
"    data_loader=data_loader,\n",
"    visualize_transform=None,\n",
"    visualize_target_transform=target_idx_to_name,\n",
")\n",
"\n",
"# Save only the first batch of plots\n",
"for figure in iter(visualizer):\n",
"    figure.set_size_inches(14, 9)\n",
"    plt.savefig(os.path.join(figure_dir, \"00_iq_data.png\"))\n",
"    break"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Repeat but plot the spectrograms for a new random sampling of the data\n",
"visualizer = SpectrogramVisualizer(\n",
"    data_loader=data_loader,\n",
"    nfft=1024,\n",
"    visualize_transform=None,\n",
"    visualize_target_transform=target_idx_to_name,\n",
")\n",
"\n",
"# Save only the first batch of plots\n",
"for figure in iter(visualizer):\n",
"    figure.set_size_inches(14, 9)\n",
"    plt.savefig(os.path.join(figure_dir, \"00_spectrogram.png\"))\n",
"    break"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Analyze Dataset\n",
"\n",
"Loop through the dataset recording classes and SNRs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class_counter_dict = {\n",
"    class_name: 0 for class_name in list(Sig53._idx_to_name_dict.values())\n",
"}\n",
"all_snrs = []\n",
"\n",
"for idx in tqdm(range(len(sig53))):\n",
"    data, (modulation, snr) = sig53[idx]\n",
"    class_counter_dict[Sig53.convert_idx_to_name(modulation)] += 1\n",
"    all_snrs.append(snr)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot the distribution of classes"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class_names = list(class_counter_dict.keys())\n",
"num_classes = list(class_counter_dict.values())\n",
"\n",
"plt.figure(figsize=(9, 9))\n",
"plt.pie(num_classes, labels=class_names)\n",
"plt.title(\"Class Distribution Pie Chart\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.figure(figsize=(11, 4))\n",
"plt.bar(class_names, num_classes)\n",
"plt.xticks(rotation=90)\n",
"plt.title(\"Class Distribution Bar Chart\")\n",
"plt.xlabel(\"Modulation Class Name\")\n",
"plt.ylabel(\"Counts\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot the distribution of SNR values"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.figure(figsize=(11, 4))\n",
"plt.hist(x=all_snrs, bins=100)\n",
"plt.title(\"SNR Distribution\")\n",
"plt.xlabel(\"SNR Bins (dB)\")\n",
"plt.ylabel(\"Counts\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
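
The analysis cells in the notebook above tally class counts and SNRs by looping over `(data, (label, snr))` items. A minimal sketch of that tally follows, assuming a plain list as a stand-in for the real `Sig53` object so that it runs without `torchsig`. The `summarize` helper, the `fake_dataset` list, and the two-entry `idx_to_name` mapping are hypothetical names made up for illustration; they are not part of the notebook or the library.

```python
from collections import Counter
from statistics import mean


def summarize(dataset, idx_to_name):
    """Tally class names and collect SNRs from (data, (label, snr)) items."""
    counts = Counter()
    snrs = []
    # Index explicitly, mirroring the notebook's range(len(...)) loop
    for idx in range(len(dataset)):
        _data, (label, snr) = dataset[idx]
        counts[idx_to_name[int(label)]] += 1
        snrs.append(float(snr))
    return counts, snrs


# Hypothetical stand-in data: tiny complex "signals" with (label, snr) targets
idx_to_name = {0: "bpsk", 1: "qpsk"}
fake_dataset = [
    ([0j] * 8, (0, 10.0)),
    ([0j] * 8, (1, 20.0)),
    ([0j] * 8, (1, 30.0)),
]

counts, snrs = summarize(fake_dataset, idx_to_name)
print(dict(counts))
print(mean(snrs))
```

Using a `Counter` avoids pre-populating a dictionary with every class name, at the cost of omitting classes that never appear in the data; the notebook's pre-populated dictionary keeps zero-count classes visible in the plots.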