An efficient tool for converting the raw Stack Overflow data dump into .csv format. Processing speed is around 50k rows/second for the Python csv conversion and roughly an order of magnitude faster for the Scala Spark solution.
The data is available here: https://archive.org/details/stackexchange
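
To illustrate the kind of conversion described above, here is a minimal Python sketch. It is not this tool's actual code. It assumes each dump file is XML with one `<row>` element per record and the fields stored as attributes. The function name `rows_to_csv`, the column list, and the sample data are hypothetical and chosen only for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_csv(xml_source, csv_dest, columns):
    """Stream <row .../> elements from xml_source into csv_dest; return the row count."""
    # Attributes not listed in `columns` are ignored; missing ones become "".
    writer = csv.DictWriter(csv_dest, fieldnames=columns,
                            extrasaction="ignore", restval="")
    writer.writeheader()
    count = 0
    # iterparse yields each element once it is fully read, so the whole
    # file never has to be parsed into memory up front.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "row":
            writer.writerow(elem.attrib)
            count += 1
            elem.clear()  # drop the element's contents once written
    return count


# Hypothetical two-row sample standing in for a dump file (assumed layout).
sample = io.BytesIO(
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b'<posts>\n'
    b'  <row Id="1" PostTypeId="1" Score="5" Title="Hello, world" />\n'
    b'  <row Id="2" PostTypeId="2" Score="3" />\n'
    b'</posts>\n'
)
out = io.StringIO()
n = rows_to_csv(sample, out, ["Id", "PostTypeId", "Score", "Title"])
print(out.getvalue())
```

For a real file, pass an open binary file handle as `xml_source` and a text file opened with `newline=""` as `csv_dest`. Streaming with `iterparse` is used here because dump files can be far larger than available memory.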