SCIgenAMA 29 karma

Jeremy: We explicitly avoided Markov chains and anything else that was technically challenging, because the goal was to make the papers as funny as possible. With Markov chains, you might get something syntactically correct, but it is likely to be boring.

With SCIgen, we literally sat around for two weeks and brainstormed buzzwords, clauses, paragraph structures and other paper elements, based purely on what we thought would be funny. That's the grammar. SCIgen itself then just goes through the grammar and makes random choices to fill things in. That's why you sometimes see things like "a testbed of Gameboys" in the evaluation sections -- we just thought it would be hilarious.
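To illustrate the idea of walking a hand-written grammar and making random choices, here is a minimal Python sketch. It is not SCIgen's actual grammar or code: the rule names, the `{NAME}` placeholder syntax, the `expand` function and every phrase except "a testbed of Gameboys" are made up for this example.

```python
import random
import re

# Hypothetical toy grammar: each rule name maps to a list of alternatives.
# "{NAME}" inside an alternative refers to another rule (assumed syntax).
GRAMMAR = {
    "SENTENCE": [
        "We ran our experiments on {TESTBED}.",
        "Our {ADJ} framework was deployed on {TESTBED}.",
    ],
    "TESTBED": ["a testbed of Gameboys", "our {ADJ} cluster"],
    "ADJ": ["scalable", "decentralized", "peer-to-peer"],
}

PLACEHOLDER = re.compile(r"\{([A-Z]+)\}")


def expand(rule, rng):
    """Pick one alternative for `rule` at random, then expand its placeholders."""
    template = rng.choice(GRAMMAR[rule])
    # Each placeholder is replaced by a recursive, independently random expansion.
    return PLACEHOLDER.sub(lambda m: expand(m.group(1), rng), template)


if __name__ == "__main__":
    rng = random.Random()  # pass a seed, e.g. random.Random(42), for repeatable output
    for _ in range(3):
        print(expand("SENTENCE", rng))
```

All of the humor lives in the hand-picked alternatives; the generator itself is only a few lines of random selection and substitution.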

SCIgenAMA 27 karma

Jeremy: Yeah, this is pretty standard arms-race stuff. I think it would be trivial to beat that detector, and they could then beat THAT generator, and so on. At some point it's easier just to do "minimally competent peer review", right?

Though as I said in another response, one reasonable use for such a detector is to find people who have already used SCIgen to pad their CVs in the past. It's hard to believe, but such people actually exist! I swear I am not one of them, though some conference rejections I've received might imply otherwise.

SCIgenAMA 11 karma

Jeremy: Putting the fake conference together was probably my favorite part. Setting up shell corporations, getting disguises, tricking the hotel into thinking we had a real purpose there -- it was like we were getting a real taste of what it was like to run WMSCI!

That, and the fame and fortune.

SCIgenAMA 10 karma

Jeremy: The highest-profile ones I know of are the Springer and IEEE journals: http://www.nature.com/news/publishers-withdraw-more-than-120-gibberish-papers-1.14763. Those are pretty interesting actually, because I don't think the submitters intended to expose the journals as fraudulent -- they were just trying to pad their own resumes!

That said, those particular journals are not considered prestigious. They were just using a well-known brand name. Any actually prestigious conference uses peer review, as it should.

SCIgenAMA 5 karma

Max here: For a while I loved the claim that ROOTER was the most widely read CS systems paper. I wonder if that is still true, or ever was?