Pumpernickelthethird
Highest Rated Comments
Pumpernickelthethird1 karma
It seems like you use the test data to evaluate the accuracy of one "generation" of trees as you call it. If I understand your approach correctly, then you produce lots of random trees, prune them and pick the ones with the highest score, similar to a random forest method, right?
But since you measure the accuracy of one tree against the test score before picking the best one you're using knowledge about the test set in your final predictive model, effectively producing a highly overfit tree that cannot gerneralize whatsoever.
Pumpernickelthethird3 karma
Alright, I'm not very proficient in Rust so I may have misinterpreted your code. Still, I see a lot of problems with this approach, the use of primitive math functions as tree nodes which seems kind of a random and computationally inefficient thing to use, the lack of detail on how data is prepared, the striking similarity to decision trees and random forests, the general simplicity of the process and your explanations, etc.
I don't intend to be a naysayer without delivering any solid proof explicitly pointing out faults in your code, but I don't have the time to review your project thoroughly enough. I'd advice you to post your project to more specialized communities like /r/machinelearning in order to get some input by people proficient in the field instead of posting to /r/IAma where you won't find much technical knowledge.
Anyway, I like your dedication and creativity and hope you'll keep at it and create more interesting and non-traditional stuff in the future.
View HistoryShare Link