Experimenting with AI-Scientist: An AI-Powered Paper Review Tool
As a researcher, I'm always intrigued by new tools that can enhance the academic process. Recently, I came across an interesting project called AI-Scientist, developed by Sakana AI. This tool promises to review academic papers using artificial intelligence. Curious about its capabilities, I decided to put it to the test with a couple of my own published papers.
Setting Up the Environment
I followed the setup process outlined in a blog post and the official GitHub repository. Here's a quick rundown of the steps:
1. Clone the repository:
git clone https://github.com/SakanaAI/AI-Scientist.git
2. Install the required dependencies:
pip install -q anthropic aider-chat backoff openai
pip install -q pypdf pymupdf4llm
pip install -q torch numpy transformers datasets tiktoken wandb tqdm
According to the official documentation, texlive-full is required to generate papers, but it is too heavy to install in a Colab session.
Since I only wanted to run the review step this time, skipping it wasn't a problem.
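For reference, if you do want the full paper-generation step, the repository's README installs TeX Live on Debian/Ubuntu systems with the command below (expect a very large download):
sudo apt-get install texlive-full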
3. Set up the OpenAI API key (I used Google Colab's userdata for this):
import os
from google.colab import userdata
api_key = userdata.get('OPENAI_API_KEY')
os.environ['OPENAI_API_KEY'] = api_key
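One Colab-specific detail: the ai_scientist package lives inside the cloned repository, so the notebook has to run from that directory or have it on the Python path before the import in the next section works. A minimal sketch, assuming the repo was cloned into Colab's default /content directory:
import sys
# Make the cloned repository importable; adjust the path if you cloned it elsewhere
sys.path.append('/content/AI-Scientist')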
Running the AI Review
With the environment set up, I was ready to test the AI-Scientist on my papers. I used the following code to perform the review:
import openai
from ai_scientist.perform_review import load_paper, perform_review
client = openai.OpenAI()
model = "gpt-4o-mini-2024-07-18"
# Extract the raw text from the PDF
paper_txt = load_paper("my-paper.pdf")
# Run the review: 5 reflection rounds, 1 few-shot example, an ensemble of 5 reviews
review = perform_review(
    paper_txt,
    model,
    client,
    num_reflections=5,
    num_fs_examples=1,
    num_reviews_ensemble=5,
    temperature=0.1,
)
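perform_review returns a Python dict. Assuming the key names shown in the repository's README, inspecting the outcome looks roughly like this:
# Key names follow the examples in the AI-Scientist README
print(review["Overall"])     # overall score on a 1-10 scale
print(review["Decision"])    # "Accept" or "Reject"
print(review["Weaknesses"])  # list of weakness strings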
The Results
I tested the AI-Scientist on two of my published papers (a short batching sketch follows this list):
- "Mining Students' Engagement Pattern in Summer Vacation Assignment"
- "Supporting Reflective Teaching Workflow with Real-World Data and Learning Analytics"
Surprisingly, both papers received a "Reject" decision from the AI reviewer, with overall scores of 4 out of 10. Here's a summary of the feedback for the first paper:
Strengths:
- Addresses a relevant topic of Learning Analytics in K-12 education
- Identifies distinct engagement patterns
- Provides empirical data on students' engagement and performance
Weaknesses:
- Lack of methodological details
- Insufficient treatment of potential confounding factors
- Limited discussion on broader implications
- Inconsistent clarity in writing
Questions posed by the AI:
- Requests for more details on clustering methodology
- Inquiries about addressing limitations in future work
The feedback for the second paper was similar, highlighting strengths in addressing significant educational issues but pointing out weaknesses in methodology and validation.
Reflections
While it's disheartening to see my published works receive "Reject" decisions from the AI, it's important to consider a few factors:
- The AI might be calibrated to very high standards, possibly aiming for top-tier conference or journal quality.
- The tool provides valuable feedback that could be used to improve papers before submission.
- This experiment demonstrates the potential of AI in academic review processes, but also highlights the need for human judgment in interpreting results.
As we continue to integrate AI tools into academic workflows, it's crucial to view them as assistants rather than replacements for human reviewers. They can offer quick, initial feedback, but the nuanced understanding of research context and significance still requires human expertise.
Have you experimented with AI tools in your research process? I'd love to hear about your experiences in the comments!