Strudel: Detecting structure in verbose CSV files via classifying lines and cells.


lanchiang/strudel


Strudel

Detecting structure in verbose CSV files via classifying lines and cells.

Getting Started

Installing

  • This project is implemented in Python 3.7.7.
  • Use the following command to install all required Python libraries:
pip install -r requirements.txt
  • We recommend installing the required libraries in a separate virtual environment.
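
As a minimal sketch of the virtual-environment recommendation above, the standard-library venv module can create an isolated environment from Python itself (the usual command-line equivalent is python -m venv). The directory name .venv and the helper name create_environment are assumptions chosen for illustration, not something this project prescribes.

```python
import venv
from pathlib import Path


def create_environment(env_dir: str = ".venv", with_pip: bool = True) -> Path:
    """Create an isolated virtual environment in env_dir and return its path.

    The directory name is a hypothetical default, not mandated by Strudel.
    """
    target = Path(env_dir)
    # with_pip=True bootstraps pip so requirements.txt can be installed afterwards.
    venv.create(target, with_pip=with_pip)
    return target


if __name__ == "__main__":
    env_path = create_environment()
    print(f"Created virtual environment at {env_path.resolve()}")
```

After activating the environment (the activation command depends on your shell and operating system), run pip install -r requirements.txt inside it.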

Executing program

  • Use the following script to run the Strudel program:
python run_strudel.py

The following arguments can be used for the above script:

  • -d: training dataset

  • -t: test dataset. If omitted, the program runs cross-validation on the training dataset.

  • -f: dataset path

  • -o: output path

  • Results are stored in a CSV file.
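
The sketch below shows one way the documented flags might be combined into a full invocation. The helper build_strudel_command and all example values (train_set, test_set, ./data, ./results) are hypothetical. The README does not state whether -d and -t expect dataset names or file paths, nor whether -f and -o are directories or files, so treat the values as placeholders.

```python
import sys
from typing import List, Optional


def build_strudel_command(
    training: str,
    data_path: str,
    output_path: str,
    test: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for run_strudel.py from the documented flags."""
    cmd = [
        sys.executable, "run_strudel.py",
        "-d", training,      # training dataset
        "-f", data_path,     # dataset path
        "-o", output_path,   # output path
    ]
    if test is not None:
        # Without -t, the program cross-validates on the training dataset.
        cmd += ["-t", test]
    return cmd


if __name__ == "__main__":
    # Placeholder values only; substitute your own datasets and paths.
    print(build_strudel_command("train_set", "./data", "./results"))
    print(build_strudel_command("train_set", "./data", "./results", test="test_set"))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root to launch the program.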

Version History

  • 0.1
    • Initial Release

License

This project is licensed under the Apache License 2.0. See the LICENSE.md file for details.

Acknowledgments

Contact

Please contact Lan Jiang if you have any questions or want to report bugs.

Reference

If you find this repository useful in your work, please cite our EDBT'21 paper:

@inproceedings{jiang2021structure,
  title={Structure Detection in Verbose CSV Files.},
  author={Jiang, Lan and Vitagliano, Gerardo and Naumann, Felix},
  booktitle={EDBT},
  pages={193--204},
  year={2021}
}
