Sakana.ai’s mascot. “Sakana” means “fish” in Japanese. [Sakana AI]
Slop. That’s a term sometimes thrown around to describe the increasing volumes of AI-generated content appearing online. AI writing doesn’t exactly have a reputation for quality—or accuracy.
Yet an AI writing system from Tokyo-based startup Sakana AI has reportedly passed peer review, the gold standard for scientific validity. Fittingly, the paper’s subject was AI itself. Sakana AI submitted its paper, titled “Compositional Regularization: Unexpected Obstacles in Enhancing Neural Network Generalization,” to a workshop at the prestigious International Conference on Learning Representations (ICLR) 2025. Reviewers knew that some submissions might be AI-generated but not which ones, a blind setup that tested whether AI-authored papers could withstand rigorous peer review. Of the workshop’s 43 submissions, three were fully AI-generated, and only this one cleared the acceptance threshold. The company has posted the paper and supporting files on GitHub.
Cracks in the AI façade
A closer examination of reviewer comments reveals some weaknesses in the AI-authored paper. Reviewers noted definitional imprecision, contradictions between claims and presented data, and a general lack of nuanced understanding. For example, one reviewer highlighted that the paper needed to “be more precise” about embedding hidden states, while another flagged contradictory claims regarding attention mechanisms.
According to the abstract: “Neural networks excel in many tasks but often struggle with compositional generalization—the ability to understand and generate novel combinations of familiar components.” It continues: “Our experiments on synthetic arithmetic expression datasets reveal that models trained with compositional regularization do not achieve significant improvements compared to baseline models. Increasing the complexity of expressions exacerbates difficulties regardless of compositional regularization, highlighting challenges in enforcing compositional structures in neural networks.”
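The article does not detail the paper’s actual method, but to make the quoted abstract concrete, here is a minimal, hypothetical sketch in Python (PyTorch) of what a “compositional regularization” term might look like: an auxiliary penalty, added to the ordinary task loss, that nudges the encoding of a whole arithmetic expression toward a simple combination of the encodings of its sub-expressions. The toy model, the penalty, and the weighting are illustrative assumptions, not the paper’s implementation.

```python
# Hypothetical illustration of a "compositional regularization" term.
# The regularizer actually used in the AI-generated paper is not described in
# this article; here we simply assume a penalty that pulls the encoding of a
# composite expression (e.g. "(a+b)") toward the average of its parts' encodings.
import torch
import torch.nn as nn

class TinyExpressionEncoder(nn.Module):
    """Toy encoder mapping token-ID sequences of arithmetic expressions to vectors."""
    def __init__(self, vocab_size: int = 16, dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        _, hidden = self.rnn(self.embed(tokens))   # hidden: (1, batch, dim)
        return hidden.squeeze(0)                   # (batch, dim)

def compositional_penalty(whole: torch.Tensor,
                          left: torch.Tensor,
                          right: torch.Tensor) -> torch.Tensor:
    """Assumed penalty: the whole expression's encoding should resemble a
    simple (averaged) combination of its sub-expressions' encodings."""
    return torch.mean((whole - 0.5 * (left + right)) ** 2)

encoder = TinyExpressionEncoder()
head = nn.Linear(32, 1)          # predicts the numeric value of the expression
task_loss_fn = nn.MSELoss()
lam = 0.1                        # regularization strength (assumed)

def training_step(whole_toks, left_toks, right_toks, targets):
    """Combine the ordinary task loss with the assumed compositional penalty."""
    h_whole = encoder(whole_toks)
    h_left = encoder(left_toks)
    h_right = encoder(right_toks)
    task_loss = task_loss_fn(head(h_whole).squeeze(-1), targets)
    reg = compositional_penalty(h_whole, h_left, h_right)
    return task_loss + lam * reg
```

Whether a penalty of this kind actually improves compositional generalization is exactly what the paper reports testing, with largely negative results according to the abstract quoted above.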
Sakana AI’s paper received an average reviewer rating of 6.33, just above the acceptance threshold of 6. Though not unequivocally praised, it may be the first fully AI-generated paper to pass peer review without human edits or intervention. According to Sakana AI, the paper was produced entirely by its upgraded “AI Scientist-v2” system, which automated everything from hypothesis generation to manuscript writing, and the company describes the result as a significant milestone in AI research automation.
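Sakana AI has not published AI Scientist-v2’s internals in this article, so the following is only a schematic, hypothetical illustration of the kind of staged pipeline the description implies (hypothesis generation, experimentation, manuscript writing). The function names and data structure are invented for illustration and are not Sakana AI’s actual API.

```python
# Purely illustrative sketch of an end-to-end automated research pipeline.
# Stage names follow the article's description (hypothesis generation through
# manuscript writing); the functions and signatures are hypothetical and do not
# correspond to Sakana AI's AI Scientist-v2 implementation.
from dataclasses import dataclass

@dataclass
class ResearchArtifact:
    hypothesis: str
    experiment_plan: str = ""
    results: str = ""
    manuscript: str = ""

def generate_hypothesis(topic: str) -> ResearchArtifact:
    # A real system would query a language model here; this is a stub.
    return ResearchArtifact(hypothesis=f"Does compositional regularization help on {topic}?")

def design_and_run_experiments(artifact: ResearchArtifact) -> ResearchArtifact:
    artifact.experiment_plan = "Train baseline vs. regularized models on synthetic data."
    artifact.results = "Placeholder result: no significant improvement over baseline."
    return artifact

def write_manuscript(artifact: ResearchArtifact) -> ResearchArtifact:
    artifact.manuscript = (
        f"Hypothesis: {artifact.hypothesis}\n"
        f"Method: {artifact.experiment_plan}\n"
        f"Findings: {artifact.results}"
    )
    return artifact

if __name__ == "__main__":
    paper = write_manuscript(design_and_run_experiments(
        generate_hypothesis("synthetic arithmetic expression datasets")))
    print(paper.manuscript)
```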
Beyond human: Can AI reshape scientific publishing?
Sakana’s achievement isn’t merely an academic curiosity; it raises fundamental questions about trust in AI-generated content. Sakana AI sees the result as a glimpse into the future, anticipating that AI’s capabilities will “continue to improve, potentially exponentially” and that advanced systems may eventually produce papers “at and beyond human levels,” even at the highest tiers of scientific publishing.
Significant concerns remain, however, and several research organizations, publishers, and universities already have guidelines in place governing the use of AI in academic publishing:
| Organization | Policy | Concerns |
|---|---|---|
| NIH | Prohibits AI in grant peer review | Integrity, confidentiality |
| NSF | Restricts AI tool use; mandates disclosure | Integrity, authenticity |
| ASM Journals | Bans AI images; requires AI disclosures | Accountability, confidentiality |
| COPE | Stresses ethical AI use, authorship clarity | Accountability, ethics |
| University of Maryland | Warns against entering sensitive data into AI tools; mandates AI disclosure | Privacy, accuracy, bias |
The economics of AI-driven publishing resemble trends in software development: high output and low cost, but often inconsistent accuracy. Sakana estimates that an AI-authored paper costs significantly less to produce than human-led research, hinting at future shifts in publishing economics. Despite that potential volume advantage, however, AI-driven papers may lack the targeted nuance of human-led research.
The Sakana AI experiment demonstrated that an AI system can pass human peer review, but the greater value may lie in AI’s potential as a tool that augments rather than replaces human researchers.