[removed]

Comments: 164 • Responses: 51

lowkey-goddess 102 karma

Former ML engineer here. How does this differ from traditional ML approaches like decision trees? If it differs a lot, then what approach does it most resemble?

Second, can it handle large, multidimensional datasets with complex relationships, similar to deep learning approaches? Does it have forward/back propagation or some training mechanism?

Lastly, and more of a comment, I feel like AI is a misnomer for what's actually happening in the field. It automates a particular decision in a limited domain, and leverages statistical techniques and usually concepts in linear algebra and calculus to garner sometimes useful results, especially when the computing is distributed.

God, I must sound pretentious in my last paragraph. I'm just a tad tired of hearing the term AI thrown around and it misleading folks. I like people who build stuff, and this seems interesting.

Over_Intention3342 22 karma

I will start from the end: Yes, the term AI is slapped on everything a bit too easily. But I think about it the same way I think about electricity. Years ago it was only about lighting a bulb; now it's semiconductors. So, despite both being 'electricity', one is more advanced than the other. It's similar with AI now: we are just at the lighting-a-bulb stage. A tree in Primeclue can be thought of as a complicated equation, so it's hard for me to compare it to existing ML approaches. It can handle large datasets. It does not use gradient descent, but it "learns" from existing "trees" and uses their parts to find a better solution.
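To make the "tree as an equation" idea concrete, here is a minimal Python sketch. It is not Primeclue's code: the tuple encoding, the node names (`add`, `mul`, `sqrt`) and the `graft` helper are all assumptions made for illustration. It shows a tree being evaluated like a formula, and a new tree being built by reusing a part of another tree.

```python
import math
import random

# Hypothetical encoding: a leaf is ("x", column_index) or ("const", value);
# an inner node is ("add", left, right), ("mul", left, right) or ("sqrt", child).

def evaluate(tree, row):
    """Evaluate the tree on one data row, exactly like evaluating a formula."""
    kind = tree[0]
    if kind == "x":
        return row[tree[1]]
    if kind == "const":
        return tree[1]
    if kind == "add":
        return evaluate(tree[1], row) + evaluate(tree[2], row)
    if kind == "mul":
        return evaluate(tree[1], row) * evaluate(tree[2], row)
    if kind == "sqrt":
        # abs() keeps the sketch safe for negative inputs (an assumption).
        return math.sqrt(abs(evaluate(tree[1], row)))
    raise ValueError(f"unknown node kind: {kind}")

def subtrees(tree):
    """Yield every subtree, the root included."""
    yield tree
    if tree[0] in ("add", "mul"):
        yield from subtrees(tree[1])
        yield from subtrees(tree[2])
    elif tree[0] == "sqrt":
        yield from subtrees(tree[1])

def graft(parent, donor, rng):
    """Copy parent, replacing one of its children with a random part of donor."""
    piece = rng.choice(list(subtrees(donor)))
    if parent[0] in ("x", "const"):
        return piece
    children = list(parent[1:])
    children[rng.randrange(len(children))] = piece
    return (parent[0], *children)

# (x0 + 2) * sqrt(x1)
tree_a = ("mul", ("add", ("x", 0), ("const", 2.0)), ("sqrt", ("x", 1)))
tree_b = ("add", ("x", 1), ("const", 1.0))

print(evaluate(tree_a, [1.0, 4.0]))  # (1 + 2) * sqrt(4) = 6.0
child = graft(tree_a, tree_b, random.Random(0))
print(child)
```

A real search would score many such children on training data and keep the better ones; that selection step is left out here.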

lowkey-goddess 26 karma

I upvoted the downvote to keep this thread alive. I hate to be the grumpy skeptic, but could you point to some approach or a paper that inspired this? Anything in the world that you can concretely analogize related to computer science/machine learning/applied mathematics? I want to understand, but I'm not getting much.

Some of the equations and architectures in deep learning can get pretty complicated for some people in the field, and it sometimes takes me a few reads to fully digest a paper, no doubt. But, calling it complex for the sake of it simply being complex doesn't help us understand what this is. We are using the same mathematical principles to build these architectures. What might they be, specifically?

Over_Intention3342 0 karma

Sorry, I came up with this on my own, although I'm sure someone has tried something similar before (monkeys on typewriters). I can't really point you to any paper describing this.

lowkey-goddess 18 karma

I'm not doubting your originality, I believe you. I'm asking about the principles you used to build your algo. Would it help if I went through your source code and asked questions about a particular function/class and what it accomplishes in your program?

Over_Intention3342 9 karma

Yes, that would be much more productive.

bye-lingual 1 karma

(monkeys on typewriters)

Is this a reference to the philosophy of monkeys writing down the whole script of Shakespeare? If so I think I like you and you're incredible for starting (edit: meant stating not starting) that it's not your idea but rather anyone could come up with it, let alone with a typewriter, eventually (:

Over_Intention3342 4 karma

That's exactly what I was getting at. Checking all research to see if someone tried it before would have taken longer than actually doing this "again".

dangerous_9999 karma

Isn't a neural network just an equation too?

Over_Intention3342 2 karma

I suppose so.

quohr 45 karma

... from your GitHub:

“- Split data randomly into training and testing sets.

  • Train classifiers on training data for 1 minute.

  • Take the best classifier and note its result on test data.

  • Repeat above steps 20 times.

  • Record median result on test data.”

I certainly hope you understand that you CANNOT use the test data in the process of determining which model to use. You go even further and “repeat 20 times”, then take the MEDIAN??

  • Why choose 20 times and not 10 or 30?
  • How can you claim your method avoids overfitting?
  • Have you tried using a validation AND test set instead?

EDIT: OP’s approach is an ensemble of learners and doesn’t touch the test set during training. Thanks for clarifying OP

Over_Intention3342 5 karma

Primeclue does not look at test data during training. Test data is used for evaluating performance of what it learnt. I never claimed it avoids overfitting.

AnArtistsRendition 5 karma

You need a separate validation dataset for that. The test set should only be used for final evaluation.

Edit: Thanks op for the pointer! Seems good to me anyway

Over_Intention3342 8 karma

The test set is only used for final evaluation. To get a better grasp of what is happening, see the code at https://github.com/lukaszwojtow/primeclue/blob/dev/backend/primeclue-api/examples/test_training.rs and check where test_data is used.

Pumpernickelthethird 1 karma

It seems like you use the test data to evaluate the accuracy of one "generation" of trees as you call it. If I understand your approach correctly, then you produce lots of random trees, prune them and pick the ones with the highest score, similar to a random forest method, right?

But since you measure the accuracy of one tree against the test set before picking the best one, you're using knowledge about the test set in your final predictive model, effectively producing a highly overfit tree that cannot generalize whatsoever.

Over_Intention3342 2 karma

No, I train on training data for one minute, over many generations. At the end of training I build the final classifier (one tree for each class) and test it on test_data.
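The protocol discussed in this sub-thread (random split, train on the training part only, score once on the test part, repeat 20 times, report the median) can be sketched as follows. This is an illustration, not the code from `test_training.rs`: the one-feature threshold "trainer" is a deliberately trivial stand-in, and every function name here is hypothetical.

```python
import random
import statistics

def split(rows, test_fraction, rng):
    """Shuffle a copy of the rows and cut it into (train, test)."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def accuracy(threshold, rows):
    """Share of rows where the rule x >= threshold matches the label."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)

def train_threshold(train_rows):
    """Stand-in trainer: the threshold with the best accuracy on training rows."""
    candidates = sorted(x for x, _ in train_rows)
    return max(candidates, key=lambda t: accuracy(t, train_rows))

def median_test_accuracy(rows, runs=20, seed=0):
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        train_rows, test_rows = split(rows, 0.25, rng)
        model = train_threshold(train_rows)         # test_rows are never seen here
        scores.append(accuracy(model, test_rows))   # used once, for the final score
    return statistics.median(scores)

data_rng = random.Random(1)
rows = [(x, x >= 0.5) for x in (data_rng.random() for _ in range(200))]
print(median_test_accuracy(rows))
```

The point of contention in the thread is visible in the loop body: model selection happens inside `train_threshold`, which only receives `train_rows`.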

Pumpernickelthethird 3 karma

Alright, I'm not very proficient in Rust so I may have misinterpreted your code. Still, I see a lot of problems with this approach: the use of primitive math functions as tree nodes, which seems like a rather random and computationally inefficient choice; the lack of detail on how data is prepared; the striking similarity to decision trees and random forests; the general simplicity of the process and your explanations; etc.

I don't intend to be a naysayer without delivering any solid proof explicitly pointing out faults in your code, but I don't have the time to review your project thoroughly enough. I'd advise you to post your project to more specialized communities like /r/machinelearning in order to get some input from people proficient in the field, instead of posting to /r/IAma where you won't find much technical knowledge.

Anyway, I like your dedication and creativity and hope you'll keep at it and create more interesting and non-traditional stuff in the future.

Over_Intention3342 2 karma

Yes, r/machinelearning was my initial choice, but they said IAma is better for self promotion. If someone else reposts it there, I will gladly answer all questions.

quohr 3 karma

Okay I see, it’s an ensemble approach - my mistake.

  • Why do you choose 20 rerun cycles and not, for example, 50? Have you tested accuracy vs. total number of repetitions?

  • Why train for “one minute” each time? This would lead to different amounts of training depending on the system that the end user is on (e.g., a 2000 MacBook Pro vs. a supercomputer)

Over_Intention3342 2 karma

Re: Why 20? Median seems quite stable when I do 20 runs. Re: Why 1 minute? This is to have some reliability as to when the training ends. Usually people don't know how long 'one epoch' will take, but they know they need an answer within a certain 'human' time.
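A time-budgeted training loop of the kind described here can be sketched like this. It is an assumption-laden illustration, not Primeclue's loop: `train_for`, `propose` and `score` are hypothetical names, and plain random search stands in for the actual tree evolution.

```python
import random
import time

def train_for(budget_seconds, score, propose, rng):
    """Keep proposing candidates until the wall-clock budget runs out."""
    deadline = time.monotonic() + budget_seconds
    best, best_score, generations = None, float("-inf"), 0
    while time.monotonic() < deadline:
        candidate = propose(rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
        generations += 1
    return best, best_score, generations

best, best_score, generations = train_for(
    0.05,                                   # 50 ms here; the thread uses one minute
    score=lambda x: -(x - 3.0) ** 2,        # toy objective with its peak at x = 3
    propose=lambda rng: rng.uniform(-10, 10),
    rng=random.Random(0),
)
print(best, generations)
```

Returning `generations` alongside the result is one way to address the hardware concern raised in the replies: the time budget stays fixed, but each run also reports how much search it actually performed.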

quohr 11 karma

If you plan on publishing this, I recommend doing a formal test of repetition versus accuracy. I’d imagine it would plateau after some amount depending on whatever factors are involved (particular application, training set size, etc.)

I get what you mean, but computers don’t operate on equivalent timescales. Imagine training your method for a minute using AiMOS versus on a 1990s Macintosh haha.

Plus, the less subjective the better :)

Over_Intention3342 1 karma

Thanks

t0b4cc021 karma

Usually people don't know how long 'one epoch' will take, but they know they need an answer within a certain 'human' time.

haha love you dude. you are one of the few practical people who code. i often got annoyed that epochs is the default run and you have to go out of your way in most ml systems to make it time or something else

Over_Intention3342 1 karma

Thanks

CeladonBadger 21 karma

It doesn’t really sound that different from traditional nn. Is it capable of categorisation without basically creating a new model for each class? Does it always have to combine 2 inputs in each node? Is it capable of processing different input size data? It definitely sounds like an interesting project but also like a bit of novelty. No offence, I might be missing something crucial there and I’d love to know more.

Over_Intention3342 1 karma

Is it capable of categorisation without basically creating a new model for each class?

Well, each class has its own tree that answers either 'yes' or 'no'.

Does it always have to combine 2 inputs in each node?

Some nodes take one argument and apply a one-argument function (e.g. sqrt).

I’d love to know more

Hence the provided source code.
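The "one tree per class, each answering yes or no" arrangement can be sketched as below. The two per-class functions are invented for illustration and stand in for learned trees, and the tie-break in `predict` (pick the class with the largest raw output) is an assumption; the thread does not say how Primeclue resolves several 'yes' answers.

```python
import math

# Hypothetical per-class "trees", written as plain functions of a feature row.
class_trees = {
    "small": lambda row: 1.0 - row[0],             # two-argument node: const - x0
    "large": lambda row: math.sqrt(row[0]) - 1.0,  # one-argument sqrt feeding a subtraction
}

def answers(row):
    """Each class tree says 'yes' (True) when its output is positive."""
    return {label: tree(row) > 0 for label, tree in class_trees.items()}

def predict(row):
    """Assumed tie-break: the class whose tree produces the largest output."""
    return max(class_trees, key=lambda label: class_trees[label](row))

print(answers([0.25]))  # {'small': True, 'large': False}
print(predict([4.0]))   # large
```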

noelexecom 16 karma

How is this different enough from traditional methods to warrant an AMA? So far what you describe sounds like 50 year old stuff.

Over_Intention3342 -7 karma

Sorry to be a disappointment.

Whatever4M 12 karma

Honestly this sounds a lot like a normal neural network. To make it a question: What would you say is the fundamental difference between your work and an average neural network?

Over_Intention3342 12 karma

It uses neither activation functions nor gradient descent.

serifmasterrace 4 karma

If all the nodes are linear operations, the function that the tree is modeling can be collapsed into the form wX+b.

Then we’d just be solving least squares with extra steps, right? There’s already a fast analytical solution. Or is there something I’m missing here?

Over_Intention3342 -2 karma

I don't think it can be collapsed to wX+b.

serifmasterrace 3 karma

Any combination of linear operators can be collapsed into the form wX+b.

For example, if you have a tree representing (2X[1]+ 3X[2]) * 4 + 5, it's no different from wX+b where X = matrix([X[1], X[2]]), w = [8,12], b = 5.

max(a,b) is just a constrained linear program.

e^x and x^i are nonlinear, which are operations represented by activations in neural nets.

Your tree is creating some extra linear operations that could be simplified down to greatly improve runtime. Maybe try that, but the solution space being learned won't be different from that of a neural net
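The worked example in this comment can be checked numerically. The sketch below evaluates the tree (2X[1] + 3X[2]) * 4 + 5 and its collapsed form with w = [8, 12], b = 5 on a few points. The second check is an added illustration under an assumption about the node set: if a node multiplies two inputs together, the result no longer scales like a linear function, so such a tree would not collapse to wX+b.

```python
def tree(x1, x2):
    """The example tree, evaluated node by node."""
    return (2 * x1 + 3 * x2) * 4 + 5

def collapsed(x1, x2, w=(8, 12), b=5):
    """The same function written as wX + b."""
    return w[0] * x1 + w[1] * x2 + b

points = [(0, 0), (1, 0), (0, 1), (2.5, -1.5)]
print(all(tree(*p) == collapsed(*p) for p in points))  # True

def product(x1, x2):
    """A node that multiplies two inputs (assumed to be allowed in a tree)."""
    return x1 * x2

# A function of the form wX (with b = 0) must satisfy f(2a, 2b) == 2 * f(a, b).
print(product(2, 2) == 2 * product(1, 1))  # False: 4 != 2
```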

Over_Intention3342 1 karma

Reddit removed my post so I probably won't continue this thread here. However, I'd like to continue the conversation with you. If you feel like it, please contact me via email. Thanks

Rubscrub 11 karma

Hi, so I read your github. But isn't this just very similar to an ANN, but with random functions and trees instead of activation functions and backpropagation?

I would think that by creating random trees and hoping one performs well you're very unlikely to reach a global optimum or even a local optimum. So how does the performance and training time compare to traditional methods?

Over_Intention3342 2 karma

Perhaps it's similar to other approaches with a lot of "buts". Performance is better at some problems (stocks, sports betting) and worse at others (Fashion-MNIST).

mandown2308 10 karma

Are you doing it alone? How does your project differ from DL?

Over_Intention3342 1 karma

Yes. It's a completely different algorithm. I don't use any ML/AI libraries as shortcuts.

quohr 27 karma

Not using other libraries doesn’t have anything to do with whether what you’ve developed is or isn’t DL though.

Over_Intention3342 1 karma

Correct. What I meant is that the approach is a bit different.

wiwerse 8 karma

What led you down this path?

How did you get started?

How long do you think it is until it's launch ready?

How long have you been working on it?

Over_Intention3342 6 karma

I had had enough of Java.

Simple idea for processing data, then I looked for the right programming language

There won't be an official "launch". It works for me just fine.

Over a year, mainly evenings and some weekends.

Miseryy 8 karma

Your description of your algorithm seems to suggest it can make decisions in logarithmic time and space, for all inputs, since you describe the input as originating from leaves that merge paths. It's essentially the reverse of an exponential tree.

How would you expect your algorithm to perform on problems that cannot be compressed to a logarithmic number of conjunctive statements/functions, i.e. NP-hard problems?

Over_Intention3342 5 karma

Can you give an example of such a problem with example data? I will take a look.

GuyARoss 4 karma

Subset sum could be one: given a set {1,23,4,51,21}, find n numbers that produce a given sum, OR get as close as possible; so the algorithm needs to take a precision value into account as well.

I've tried solving this optimization with a supervised approach before with pretty poor results, so I'm also curious what your algorithm would yield.
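For reference, the problem described here can be stated as a brute-force search. The sketch below is only a problem statement in code, not something either poster wrote: `closest_subset` is a hypothetical name, the target 48 is an invented example, and the search visits every non-empty subset, so its cost grows exponentially with the size of the set.

```python
from itertools import combinations

def closest_subset(numbers, target):
    """Return the non-empty subset whose sum is closest to target (first found wins ties)."""
    best = ()
    for size in range(1, len(numbers) + 1):
        for combo in combinations(numbers, size):
            if abs(sum(combo) - target) < abs(sum(best) - target):
                best = combo
    return best

subset = closest_subset([1, 23, 4, 51, 21], 48)
print(subset, sum(subset))  # (23, 4, 21) 48
```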

Over_Intention3342 0 karma

Primeclue can do label classification. I'm not sure what label should be in your example. Can you elaborate?

Excel07 5 karma

In this field, what kind of Mathematics is a must-know?

Over_Intention3342 7 karma

It depends. For example, decision trees do not require calculus or anything like it.

Excel07 3 karma

What is the best programming language for machine learning and why?

Over_Intention3342 8 karma

If you have to ask, I would say Python. It's simple and has TensorFlow and others.

koalefant 3 karma

What does primeclue think about GameStop? Should I buy more or just hold on to the ones I have 🙌

Over_Intention3342 6 karma

I've never run GameStop values through this software so I don't know.

aetr3yu 3 karma

Would you recommend Python over everything else?

Over_Intention3342 3 karma

Depends on what you're trying to do, but at the beginning Python seems like a safe choice.

quohr 3 karma

[deleted]

Over_Intention3342 1 karma

How does primeclue determine how to approximate the solution space?

What do you mean?

How do you ensure that your approach does not overfit?

That's unsolved, isn't it? Primeclue splits training data into two parts and only one part is used for actual training (something like n-fold validation).

Under what circumstances do you believe primeclue would offer an advantage .. ?

There is an example called 'test_training'. For some reason TensorFlow fails it miserably but Primeclue gives 60+% correctness. Also, it seems like predicting stock market runs works better with Primeclue.
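The idea of splitting the training data into two parts, with only one part used for actual fitting, can be sketched as a simple hold-out selection. This is one plausible reading of the answer above, not Primeclue's implementation: the threshold candidates, the `keep` shortlist size and the function names are all assumptions.

```python
import random

def accuracy(threshold, rows):
    """Share of rows where the rule x >= threshold matches the label."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)

def holdout_select(train_rows, thresholds, fit_fraction=0.5, keep=3, seed=0):
    """Rank candidates on one part of the training data, choose on the other."""
    rng = random.Random(seed)
    shuffled = train_rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fit_fraction)
    fit_part, check_part = shuffled[:cut], shuffled[cut:]
    # Shortlist the candidates that do best on the fitting part...
    ranked = sorted(thresholds, key=lambda t: accuracy(t, fit_part), reverse=True)
    shortlist = ranked[:keep]
    # ...then pick among them using rows that played no role in the ranking.
    return max(shortlist, key=lambda t: accuracy(t, check_part))

data_rng = random.Random(1)
train_rows = [(x, x >= 0.5) for x in (data_rng.random() for _ in range(200))]
chosen = holdout_select(train_rows, [0.1, 0.3, 0.5, 0.7, 0.9])
print(chosen)
```

Note that both parts come from the training data; a separate test set would still be needed for the final score.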

diamondketo 2 karma

How are the function nodes in the trees built? Does the user specify them based on their model, or does your algorithm learn to choose the best functions?

Are the non-data leaf nodes free parameters?

  • If yes, how does your algorithm optimize and estimate the best parameter? I understand your algorithm prunes the tree; how does free parameter and pruning come together; do you optimize the free parameter first then prune?
  • If no, how does your algorithm choose the best parameter (e.g., why e, why pi?)

Over_Intention3342 1 karma

User does not need to specify anything. It all starts randomly.

bsnshdbsb 2 karma

Complete noob here. How do I even start working in this field? What should be my path or approach? Should I learn every bit of ML or master a specific area? Appreciate any feedback.

Over_Intention3342 1 karma

Learn a bit of Python and then do some courses, like TensorFlow on Coursera.

LaChicaGo 1 karma

What is your favourite programming language? How do you feel about DataRobot and other "black box" programs?

Over_Intention3342 2 karma

Definitely Rust. I've never heard of DataRobot

eyegazer444 1 karma

Have you heard of Replika? How is your algorithm better or different to that?

Over_Intention3342 1 karma

Never heard of it.

diamondketo 1 karma

Why don't you write a technical paper in a statistics journal? Get peer-reviewed by a career statistician.

Over_Intention3342 3 karma

I'm not interested in scientific career, I'm mainly a programmer.

diamondketo 2 karma

Are you not interested in validating whether your algorithm is (1) new and (2) works better than neural network and classification decision trees?

Over_Intention3342 4 karma

Neither. I'm only interested in making it work better and be useful to me and others.

mvsopen 1 karma

Where is AI and ML heading? And when will we be forced to adopt a code of ethics for future AI development?

Over_Intention3342 5 karma

A code of ethics for AI may seem like artificial brakes on what it's capable of, so I hope ethics will stay on the human side, in how AI results are applied.

Edit: What I meant is: I hope we won't have to hard code ethics into AI, we as humans must be more careful how we apply AI.

thekillerdonut 2 karma

It has already been demonstrated that people will not responsibly apply machine learning AI (some examples listed here), either intentionally or because they aren't aware that it isn't a perfectly accurate system. In light of this, as the person creating this type of technology, do you feel an ethical responsibility to apply ethics while you still have control over it?

I realize this is a fairly pointed question. I ask because I was very interested in going into AI research while I was in college, but the more I learned what people did with this type of technology, the more contributing to it deeply violated my own code of ethics.

Over_Intention3342 1 karma

Absolutely yes!

umop_apisdn 1 karma

Do you have to define the complete architecture - ie size of the binary tree and the function at each node - beforehand, or are these learnt as well?

Over_Intention3342 1 karma

User does not need to define architecture, although there is some control over things like how often a tree creates a branch, how deep initial trees are and so on.
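The two controls mentioned here (how often a tree creates a branch, how deep initial trees are) can be illustrated with a small random tree generator. The parameter names `branch_prob`, `max_depth` and `unary_share`, as well as the node sets, are hypothetical and are not taken from Primeclue's settings.

```python
import random

UNARY = ("sqrt", "neg")          # assumed one-argument functions
BINARY = ("add", "mul", "max")   # assumed two-argument functions

def random_tree(rng, n_inputs, max_depth, branch_prob, unary_share=0.3):
    """Grow a random tree; branch_prob and max_depth limit how bushy it gets."""
    if max_depth == 0 or rng.random() >= branch_prob:
        return ("x", rng.randrange(n_inputs))   # leaf: one input column
    if rng.random() < unary_share:
        return (rng.choice(UNARY),
                random_tree(rng, n_inputs, max_depth - 1, branch_prob, unary_share))
    return (rng.choice(BINARY),
            random_tree(rng, n_inputs, max_depth - 1, branch_prob, unary_share),
            random_tree(rng, n_inputs, max_depth - 1, branch_prob, unary_share))

def depth(tree):
    """Number of operator levels above the deepest leaf."""
    if tree[0] == "x":
        return 0
    return 1 + max(depth(child) for child in tree[1:])

rng = random.Random(42)
shallow = random_tree(rng, n_inputs=4, max_depth=2, branch_prob=0.9)
deep = random_tree(rng, n_inputs=4, max_depth=6, branch_prob=0.9)
print(depth(shallow), depth(deep))
```

Nothing about the architecture is fixed in advance beyond these limits, which matches the answer that the user does not define the tree itself.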

Kyloman 1 karma

What advice do you have for people wanting to learn how to create their own Artificial Intelligence projects?
I am decently adept at coding, but it's such a huge and complicated topic I have no idea where to start.

Over_Intention3342 1 karma

Read a lot to get some creative juices flowing.

jpropaganda 1 karma

Have you heard of conducto? Do you think that would increase your pipeline processing? www.conducto.com

Over_Intention3342 1 karma

I've never heard of it, thanks for the link.

ex_D0T4 1 karma

Hello, I'm not very familiar with things like this. Is there anything you can suggest to get started with coding? I see a lot of courses but I would like to hear from someone who codes. I've been interested but I can't find something to start with without feeling like I'm doing something wrong.

Over_Intention3342 2 karma

Nothing better than actually getting started. There are plenty of manuals for Python for example.

ex_D0T4 1 karma

Is there one you'd say is the best? Or are they all virtually the same?

Over_Intention3342 2 karma

Sorry, I've learnt Python as my fifth or sixth language so I didn't use a manual. But it seems like this:

https://www.learnpython.org/

is ok. Good luck!

thenielser 1 karma

Are you planning on writing an actual research paper instead of a github page?

Would be nice to see a clear and concise paper explaining the differences and theoretical background.

Over_Intention3342 1 karma

No such plans. Mainly because code changes often, any paper would be outdated before it's finished.

melancholic_inertia 1 karma

[deleted]

Over_Intention3342 1 karma

No.

retrorectum 1 karma

Thanks for doing this, it sounds interesting. I have a couple of questions: 1. What false positive rate are you aiming for before you would use it? 2. How much data will be used? 3. What are some cases where you personally worry about getting a good success rate?

Over_Intention3342 0 karma

  1. Depends on the problem.
  2. It can process tens of thousands of rows and still give reasonable results reasonably quickly.
  3. Hmmm... It's not so good on the Fashion-MNIST data set (around 85% accuracy). Probably because there are so many points to look at.

dingoateyobaby 0 karma

People misuse the words AI on things that are not truly AI. I believe if AI doesn't gather and manipulate the data by itself then it's not an AI. Is your project a true AI or simply a "program"?

Over_Intention3342 1 karma

It won't gather data for you, you need to feed it with data first. So by your definition, it's just a program.

bunch_of_particles 0 karma

What are your thoughts on the importance of ethics in AI?

Over_Intention3342 5 karma

Because computers are soulless, humans must do more to make up for it.

Smile_in_the_mirror -1 karma

What are the chances of an AI becoming self aware?

Over_Intention3342 1 karma

Quite high, but we aren't there yet.

PolarisLodestar -1 karma

In the midst of crypto/blockchain mania, what do you think of GNY? It’s the first decentralized machine learning system. They’ve been in development for over 2 years and are launching the Mainnet this week!

Over_Intention3342 2 karma

I've never heard of it, thanks for the link.

thesearcherofgold -1 karma

What kind of applications do you aim this AI at? Is a human-like virtual girlfriend within the realm of possibilities?

Over_Intention3342 3 karma

This is more about getting answers from data.

kieronhix -2 karma

Do you think we should be cautious of a potential AI uprising like Elon Musk claims he’s afraid of?

Over_Intention3342 2 karma

No.

Internal-Lifeguard51 -6 karma

Do you think your soul has entered your program? I know RNA and other genetic material transfers “memories” through generations. Do you think your AI could unknowingly be channeling your own being?

Over_Intention3342 2 karma

Never thought about it. Intriguing.