Hi @MrAE, @jbrowne6 and @falkben, just pinging the people who seem to have touched these specific lines of code.
I know you no longer maintain this code and have moved on, but I have a quick question about what a specific line is doing. If you happened to write this part, could you confirm that I'm interpreting it correctly? FYI: I have ported the code to Cython, and once this issue is resolved, I think we can safely move on :)
Are you sampling the feature index without replacement? It looks like rndFeature = randNum->gen(fpSingleton::getSingleton().returnNumFeatures()); generates a random feature index, but can it produce a duplicate?
For example, say you have data with 4 columns, then maybe SPORF will sample a projection of:
indices = [0, 2, 0]
weights = [1, -1, 1]
Note that this is then no longer a sparse linear combination with only +/- 1 weights: feature 0 appears twice, so the projection is 2 * x[0] - x[2], i.e. an effective weight of +2 on feature 0 and -1 on feature 2. Or is this function guaranteed never to produce duplicates when sampling the projection matrix?
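To make the concern concrete, here is a minimal sketch of how duplicate indices collapse into a single effective weight. It is not taken from the SPORF code; `effectiveWeights` is a hypothetical helper introduced only for illustration, and integer weights are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper, not part of SPORF: sum the weights that land on the
// same feature index to get the effective weight per feature.
std::map<int, int> effectiveWeights(const std::vector<int>& indices,
                                    const std::vector<int>& weights) {
    assert(indices.size() == weights.size());
    std::map<int, int> combined;
    for (std::size_t i = 0; i < indices.size(); ++i) {
        combined[indices[i]] += weights[i];  // duplicate indices accumulate here
    }
    return combined;
}
```

Calling `effectiveWeights({0, 2, 0}, {1, -1, 1})` gives a weight of +2 on feature 0 and -1 on feature 2. A duplicate with opposite signs would instead cancel to 0, silently dropping the feature from the projection.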
Hey @adam2392, I worked mainly on the R side of things, although I do remember having a similar issue with this chunk (lots of whiteboarding). I never did figure out whether this block samples in accordance with the SPORF paper -- and given your example, I'd say it doesn't.
In that case the indices should be sampled without replacement -- going from memory.
I did tinker around in the C++ code, but the base functions came from James.
I know James had some code in his own repo, which may have some tests in it 🤷🏼‍♂️ -- he'd be the one with the most knowledge about how it works.
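For comparison, sampling the indices without replacement could look roughly like the sketch below. This is an assumption about one way to do it, not the SPORF implementation: `sampleDistinctFeatures` is a hypothetical name, and it uses `std::mt19937` with a partial Fisher-Yates shuffle rather than the project's own `randNum` generator.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper, not part of SPORF: draw k distinct feature indices
// from [0, numFeatures) using a partial Fisher-Yates shuffle.
std::vector<int> sampleDistinctFeatures(int numFeatures, int k, std::mt19937& rng) {
    assert(k >= 0 && k <= numFeatures);
    std::vector<int> pool(numFeatures);
    std::iota(pool.begin(), pool.end(), 0);  // 0, 1, ..., numFeatures - 1
    for (int i = 0; i < k; ++i) {
        // Pick a position in the not-yet-chosen tail [i, numFeatures - 1].
        std::uniform_int_distribution<int> pick(i, numFeatures - 1);
        std::swap(pool[i], pool[pick(rng)]);
    }
    pool.resize(k);  // the first k entries are the distinct sample
    return pool;
}
```

Because each draw swaps a chosen index out of the remaining pool, no index can appear twice, so every sampled feature keeps a single +/- 1 weight.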
In SPORF/packedForest/src/forestTypes/binnedTree/processingNodeBin.h, lines 99 to 113 at a7a3c7e:
rndFeature = randNum->gen(fpSingleton::getSingleton().returnNumFeatures());