Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: add javascript tokenizer #4

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

jakubzitny
Copy link

@jakubzitny jakubzitny commented Jun 3, 2016

Hey,

I am working (in coordination with @crista) on a javascript tokenizer for SourcererCC. This is the first working concept written in node.js, so we can test it out and communicate better.

This is a Work In Progress.
I don't think this is good enough to be merged yet, we should think about better way to manage the tokenizer "modules" in SourcererCC. It would be great to be able to switch between tokenizers in the the .properties config and to have (at least roughly) the same api. What do you think? Maybe

I will post updated version with better preprocessing of JS files and a script for generically running the tokenization.

- copied from https://git.io/voeEf
- works on both file and function level
const outputFile = function(functionTokens) {
functionTokens.forEach((f) => {
//console.log("===")
console.log(f)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's specify an output file and print our results to that file. Sometimes developers might want to debug using plain old school console.log approach and so let's use console only for the debug or auxiliary info.

@saini
Copy link
Contributor

saini commented Jun 3, 2016

Hey Jakub,
Interesting profile pic :).
Thanks for submitting a PR. I went through the code and it looks good to me to start from here. I have provided a few code review comments and once you can address them we can merge your code into the master.

we should think about better way to manage the tokenizer "modules" in SourcererCC. It would be great to be able to switch between tokenizers in the the .properties config and to have (at least roughly) the same api. What do you think? Maybe

Yes, we should do that. For now, let's note it down in a new ticket.

I will post updated version with better preprocessing of JS files and a script for generically running the tokenization.
Sure. How about first we work on merging the existing code and test it. After this we can work on the newer better versions?

Thanks again for the great work
Vaibhav

@jakubzitny
Copy link
Author

jakubzitny commented Jun 4, 2016

Hi @saini, thanks for the code-review, I agree with most of the comments. I'll polish the code a bit, update the MR and also file some ideas and issues I had with SourcererCC in general.

@saini
Copy link
Contributor

saini commented Jun 5, 2016

Sounds good 👍

@jakubzitny jakubzitny force-pushed the javascript-tokenizer branch from 962360b to 49cd0c2 Compare June 20, 2016 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants