Paper LLM Style Review
This skill turns a request to "remove the AI flavor" into an academic writing quality review: it identifies structural, rhetorical, notation, citation, and experiment-narrative problems that often appear in LLM-assisted drafts, then revises them into clearer, more defensible paper writing.
Boundary
- Do not promise to bypass AI detectors, hide AI use, or make prohibited assistance untraceable.
- Treat "AI flavor" as shorthand for concrete writing flaws: generic motivation, shallow transitions, formula dumping, over-bulleting, overclaiming, inconsistent terminology, misplaced comparisons, hallucinated citations, and weak experiment storytelling.
- Preserve the author's actual claims, evidence, method, results, and uncertainty. If a claim is unsupported, mark it instead of strengthening it.
- If venue, course, journal, or collaborator policy requires disclosure of AI assistance, remind the user to follow that policy.
Default Workflow
- Identify the paper type: method paper, system paper, benchmark paper, empirical analysis paper, survey, or position paper.
- Extract the central motivation in one sentence. If it cannot be stated, flag the draft as narrative-unstable before editing sentence style.
- Build a claim-evidence map:
  - Main motivation
  - Core claim or finding
  - Method/design choices or analysis axes
  - Experiments supporting each claim
  - Citations needed for background, comparison, or attribution
- Review section by section using the checklist below.
- Return findings before rewriting. For each major issue, give:
  - Problem
  - Why it weakens the paper
  - Concrete revision action
- When rewriting, prefer targeted edits over full paraphrase. Preserve LaTeX commands, labels, citations, macros, equations, and table/figure references.
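The claim-evidence map from the workflow above can be kept as a small data structure so that unsupported claims surface mechanically. This is a minimal sketch, not a prescribed format: the class names, field names, and the sample claims are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """One claim from the draft plus the evidence the draft offers for it."""
    text: str
    experiments: list[str] = field(default_factory=list)  # e.g. table/figure labels
    citations: list[str] = field(default_factory=list)    # e.g. BibTeX keys


@dataclass
class ClaimEvidenceMap:
    """Hypothetical container for the map; the field layout is an assumption."""
    motivation: str  # central motivation, stated in one sentence
    claims: list[Claim] = field(default_factory=list)

    def unsupported(self) -> list[str]:
        """Return claims that list neither an experiment nor a citation."""
        return [c.text for c in self.claims
                if not c.experiments and not c.citations]


# Illustrative placeholder content, not taken from any real paper.
paper_map = ClaimEvidenceMap(
    motivation="Retrieval latency limits interactive use.",
    claims=[
        Claim("Caching reduces median latency.", experiments=["tab:latency"]),
        Claim("The cache generalizes to unseen domains."),  # no evidence listed
    ],
)
print(paper_map.unsupported())
```

Claims returned by `unsupported()` are candidates to mark in the verification notes, not to strengthen.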
Output Contract
For a full-paper or section-level review, return:
- High-level diagnosis: 3-6 bullets on the largest narrative and integrity risks.
- Section findings: concise issues grouped by Introduction, Motivation, Method, Related Work, Experiments, Ablation, Analysis, Conclusion as applicable.
- Rewrite plan: what to merge, move, delete, rename, or verify.
- Edited text: revised passage or patch-ready replacement when the user asks for edits.
- Verification notes: citation mismatches, unsupported claims, unverified results, terminology inconsistencies, and places needing human confirmation.
If the user only asks for polishing a paragraph, still check whether the paragraph distorts the paper's motivation or overclaims beyond the evidence.
Diagnostic Checklist
1. Introduction and Motivation
Flag these issues:
- Introduction is only one or two short paragraphs and jumps directly to the method.
- Motivation enters abruptly, using generic openings such as "recent advances have shown", "despite remarkable progress", or "this remains challenging" without concrete tension.
- The introduction lacks a problem-gap-insight-evidence chain.
- A minor implementation requirement becomes the apparent main motivation.
- Contributions are phrased as generic actions rather than claims the paper proves.
Fix by making the logic explicit:
- Domain context: what problem matters and to whom.
- Current limitation: what existing approaches fail to explain, optimize, measure, or support.
- Key tension: why the limitation is nontrivial.
- Paper move: what this paper changes or reveals.
- Evidence preview: what experiment or analysis supports the move.
Do not inflate motivation. If the real contribution is narrow, state it narrowly and make the niche precise.
2. Method Notation and Formula Flow
Flag these issues:
- Symbols appear before being defined.
- Equations are scattered after nearly every short paragraph.
- Formulas are isolated without roles in the method narrative.
- Multiple unrelated equations are chained without a setup sentence.
- The notation shifts across sections or abbreviations are inconsistent.
Fix by grouping math around concepts:
- Define symbols before first use: data, model, objective, loss, constraints, outputs.
- Introduce each equation with its purpose, not just "we formulate".
- Place several related equations into one coherent block when they describe one operation.
- Explain what changes after the equation: what is optimized, estimated, selected, or passed to the next module.
- Use a notation table if the method has many symbols.
- Remove equations that restate obvious implementation details or do not support reproducibility.
Preferred pattern:
We first define the objects used by the module. Let ... .
The module estimates ... by optimizing:
[equation block]
This objective separates ... from ..., which is needed because ... .
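One possible instance of this pattern is sketched below. The setting (a regularized supervised objective) and every symbol in it are assumptions made for illustration, not part of any particular paper; `\lVert` assumes the `amsmath` package is loaded.

```latex
We first define the objects used by the module. Let
$\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$ be the training set, $f_\theta$ the
model with parameters $\theta$, $\ell$ a per-example loss, and
$\lambda \ge 0$ a regularization weight.
The module estimates $\theta$ by optimizing:
\begin{equation}
  \hat{\theta} = \arg\min_{\theta} \;
    \frac{1}{N} \sum_{i=1}^{N} \ell\bigl(f_\theta(x_i), y_i\bigr)
    + \lambda \lVert \theta \rVert_2^2 .
  \label{eq:objective}
\end{equation}
This objective separates the data-fit term from the penalty on parameter
magnitude, which is needed because $\lambda$ is tuned separately from the loss.
```

Every symbol is defined before the equation, and the sentence after it states what the equation contributes to the method.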
3. Algorithm Blocks
Flag unnecessary pseudocode:
- Algorithm block repeats the prose line by line.
- The block exists only to make the method look technical.
- Variables in pseudocode are not used elsewhere.
- The method has no branching, iteration, sampling, or state update that benefits from pseudocode.
Keep pseudocode only when it improves reproducibility:
- Nontrivial control flow
- Sampling or search procedure
- Training loop with unusual updates
- Multi-stage inference procedure
- System scheduling or resource allocation logic
Otherwise convert the algorithm into a compact paragraph, a flow diagram, or a step description inside the method.
4. Bullets and Paragraph Logic
Flag over-bulleting:
- Bullets are used to avoid explaining causal or sequential relationships.
- Items look parallel but are actually setup -> consequence, problem -> solution, or observation -> implication.
- Long prose passages are forced into bullets when they should be normal paragraphs.
- Experiments, ablations, or findings are listed without explaining how they advance the main argument.
Keep bullets only for genuinely parallel items: contributions, assumptions, evaluation questions, datasets, metrics, or checklist-like artifacts.
Convert bullets into paragraphs when the relationship is:
- temporal: first -> then -> finally
- causal: because -> therefore
- contrastive: although -> however
- evidential: result -> interpretation
- hierarchical: high-level claim -> supporting detail
5. Names, Abbreviations, and Emphasis
Flag these issues:
- Module names are long noun piles.
- The same component has different names or abbreviations across sections.
- Abbreviations are introduced but not used, or used before definition.
- Boldface is applied to every module, claim, or phrase.
- Styling tries to compensate for weak prose.
Fix by enforcing a naming table:
| Canonical name | Abbreviation | First definition | Allowed variants |
|---|---|---|---|
Rules:
- Prefer short, stable noun phrases.
- Use one abbreviation after first definition.
- Define a LaTeX macro for the method name if it appears often.
- Bold only contribution labels, best table results, or first mention of a named method if venue style allows it.
- Do not bold ordinary nouns or every module name.
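Part of the abbreviation check can be automated. The sketch below assumes abbreviations are all-caps tokens and that definitions follow the "Full Name (ABBR)" convention; both are simplifying assumptions, and the function name, the sample text, and the abbreviations in it are hypothetical.

```python
import re

ABBREVIATION = re.compile(r"\b[A-Z]{2,}\b")   # all-caps tokens such as "GSM"
DEFINITION = re.compile(r"\(([A-Z]{2,})\)")   # "Full Name (GSM)" style definitions


def check_abbreviations(text: str) -> dict[str, list[str]]:
    """Report abbreviations used early, never defined, or defined but unused."""
    # Position of the first parenthesized definition of each abbreviation.
    defined_at: dict[str, int] = {}
    for match in DEFINITION.finditer(text):
        defined_at.setdefault(match.group(1), match.start(1))

    use_count = {abbr: 0 for abbr in defined_at}
    used_before: set[str] = set()
    never_defined: set[str] = set()
    for match in ABBREVIATION.finditer(text):
        abbr, pos = match.group(0), match.start()
        if abbr not in defined_at:
            never_defined.add(abbr)
        elif pos != defined_at[abbr]:      # skip the definition itself
            use_count[abbr] += 1
            if pos < defined_at[abbr]:
                used_before.add(abbr)

    return {
        "used_before_definition": sorted(used_before),
        "never_defined": sorted(never_defined),
        "defined_but_unused": sorted(a for a, n in use_count.items() if n == 0),
    }


# Illustrative text with made-up module names.
sample = ("We route tokens with GSM. The Gated Selection Module (GSM) scores "
          "each token. A Token Pruning Module (TPM) is defined here.")
report = check_abbreviations(sample)
print(report)
```

The report only lists candidates. Well-known acronyms will appear under `never_defined` and need a human decision about whether the venue expects them to be spelled out.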
6. LLM-Style Sentence Habits
Flag and revise:
- Repeated semicolons or long dash-heavy sentences.
- Stock transitions: "it is worth noting that", "it should be emphasized that", "moreover", "furthermore", "therefore" inserted mid-sentence without a real logical turn.
- Hollow adverbs: "elegantly", "theoretically", "seamlessly", "effectively", "inherently", "comprehensively", "significantly" without measured evidence.
- Overclaiming modifiers: "robust", "universal", "general", "principled", "novel", "the first", "state-of-the-art" without proof.
- Repetitive self-description: "we propose an X method" repeated even when the sentence should discuss a mechanism or result.
- Generic praise of the method instead of a precise technical statement.
Replacement principles:
- Replace adverbs with the actual mechanism or evidence.
- Replace "therefore" with a concrete causal link, or delete it.
- Replace generic "this demonstrates the effectiveness" with the specific claim supported by the result.
- Shorten sentences by separating context, action, and implication.
- Use active verbs where the actor matters: "The selector removes ..." instead of "It is observed that ...".
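A first pass over these habits can be done with plain phrase matching. The sketch below reuses the phrase lists from this section; the function name and the sample sentence are hypothetical, and a hit is only a prompt for review, since some of these words are legitimate when backed by measured evidence.

```python
import re

# Phrase lists copied from the checklist above.
PATTERNS = {
    "stock transition": [
        "it is worth noting that", "it should be emphasized that",
    ],
    "hollow adverb": [
        "elegantly", "theoretically", "seamlessly", "effectively",
        "inherently", "comprehensively", "significantly",
    ],
    "overclaiming modifier": [
        "robust", "universal", "general", "principled", "novel",
        "the first", "state-of-the-art",
    ],
}


def flag_phrases(sentence: str) -> list[tuple[str, str]]:
    """Return (category, phrase) pairs for every listed phrase in the sentence."""
    hits = []
    for label, phrases in PATTERNS.items():
        for phrase in phrases:
            # Word boundaries keep "general" from matching "generalizes".
            if re.search(rf"\b{re.escape(phrase)}\b", sentence, re.IGNORECASE):
                hits.append((label, phrase))
    return hits


# Made-up sentence for illustration.
sentence = "It is worth noting that our novel framework seamlessly handles noise."
for label, phrase in flag_phrases(sentence):
    print(f"{label}: {phrase!r}")
```

Each hit should then be resolved with the replacement principles above: name the mechanism or evidence, or delete the word.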
7. Scope Control and Motivation Drift
Flag when the draft shifts from the main contribution to a small user-requested constraint:
- A small implementation requirement becomes a headline motivation.
- The method section over-explains a minor feature because it was recently edited.
- Experiments are framed around convenience rather than the scientific question.
- The conclusion summarizes a side benefit instead of the central claim.
Fix by restoring hierarchy:
- Main motivation
- Core technical or empirical claim
- Supporting design choices or analysis axes
- Minor requirements and implementation details
Side requirements should appear as constraints, not as the paper's reason to exist.
8. Baselines and Comparisons
Flag misplaced comparisons:
- The method section starts comparing against baselines before the reader understands the method.
- Baseline weaknesses are asserted without evidence.
- Related work or experiments are mixed into method exposition.
Fix placement:
- Method: explain the proposed mechanism and design choices.
- Related work: position against families of prior approaches.
- Experiments: compare against baselines with evidence.
- Ablation: isolate which component supports which design hypothesis.
The method may mention alternatives only when explaining a design decision, and should avoid result-like claims there.
9. Citation and Reference Integrity
Treat citation checking as a core part of the "AI flavor" review because LLM-assisted related work often mismatches text and citations.
Flag:
- Citation does not support the sentence's claim.
- The cited paper studies a different task, dataset, modality, or assumption.
- A citation is used for an overbroad claim such as "widely used" or "shown effective" without enough support.
- Author/year/title metadata seems wrong.
- Several citations are dumped after a generic sentence.
- Related work is organized paper by paper rather than by technical relationship.
When tools or files are available:
- Inspect .bib, .tex, and nearby cited text.
- Verify title, authors, year, venue, task, and the specific claim being cited.
- If web or bibliographic tools are unavailable, mark suspicious items as [verify citation-claim match] rather than inventing details.
- Never fabricate BibTeX or citation metadata.
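When the .tex and .bib sources are available, a key-level cross-check is a cheap first step. The sketch below uses only regular expressions and assumes common `\cite`-family commands and standard `@type{key,` entries; the function names and the citation keys in the sample are placeholders, not real references. It verifies that keys exist, not that a cited paper supports the claim, which still needs manual review.

```python
import re

# Matches \cite, \citep, \citet, ... with optional [..] arguments.
CITE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Matches the entry type and key of "@type{key," in a .bib file.
BIB_ENTRY = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
NON_ENTRIES = {"string", "comment", "preamble"}


def cited_keys(tex: str) -> set[str]:
    """Collect every key that appears inside a \\cite-style command."""
    keys = set()
    for group in CITE.findall(tex):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def bib_keys(bib: str) -> set[str]:
    """Collect entry keys defined in the BibTeX source."""
    return {key for kind, key in BIB_ENTRY.findall(bib)
            if kind.lower() not in NON_ENTRIES}


# Placeholder sources; the keys are invented for this example.
tex = r"Prior work \citep{alpha2020,beta2021} and \cite[Sec.~3]{gamma2019} study this."
bib = """@article{alpha2020,
  title = {Placeholder A}
}
@inproceedings{gamma2019,
  title = {Placeholder B}
}
@misc{delta2018,
  title = {Placeholder C}
}"""

missing = sorted(cited_keys(tex) - bib_keys(bib))   # cited but not in .bib
uncited = sorted(bib_keys(bib) - cited_keys(tex))   # in .bib but never cited
print("missing:", missing)
print("uncited:", uncited)
```

Keys reported as missing should be flagged for the author rather than filled in with guessed metadata.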
10. Experiments, Ablation, and Analysis
Experiments should not just report numbers. They should answer the paper's main question.
For each result subsection, enforce this structure:
- Question: what claim or concern this experiment tests.
- Setup: what is compared and why this setup is diagnostic.
- Observation: what the figure/table shows.
- Interpretation: why the observation supports or weakens the claim.
- Link back: how the result refines the main motivation.
Flag:
- Results are described as "our method outperforms baselines" without explaining what that means for the paper's thesis.
- Ablations list component removals but do not map components to design hypotheses.
- Analysis sections become miscellaneous extra plots.
- Bullet lists replace interpretation.
- Boldface is used to make weak improvements look decisive.
- Motivation is distorted to fit available experiments.
For ablation:
- State the design hypothesis before the ablation.
- Map each removed component to the failure mode it should address.
- Interpret negative or small effects honestly.
- Avoid calling every improvement "significant" unless significance is tested or clearly contextualized.
For analysis:
- Use analysis to deepen the story, not to add ornaments.
- Prefer "This reveals when and why the method works" over "We further analyze".
- Connect qualitative examples, error cases, sensitivity plots, and scaling behavior to the same central motivation.
11. Pure Empirical Analysis Papers
For papers that do not propose a new method, the main challenge is story architecture.
Do not structure the paper as a bag of experiments. Build a finding chain:
Question -> Measurement design -> Finding 1 -> Tension raised by Finding 1
-> Finding 2 -> Mechanism or explanation -> Implication for future methods
Each experiment should either:
- establish a phenomenon,
- rule out a simple explanation,
- identify a condition where the phenomenon changes,
- explain a mechanism,
- or derive an implication for evaluation, modeling, or deployment.
Findings should be phrased as empirical claims, not topic labels:
- Weak: "Effect of data scale"
- Strong: "Data scale improves average accuracy but amplifies failures on compositional splits"
The conclusion should synthesize what the experiments collectively teach, not repeat the experiment list.
Rewrite Strategy
When editing a section:
- Preserve technical content first.
- Fix narrative order before sentence style.
- Replace generic claims with specific claims tied to evidence.
- Merge fragmented formulas and paragraphs.
- Convert nonparallel bullets into prose.
- Remove unnecessary bolding and pseudocode.
- Mark uncertain citations and claims explicitly.
- Keep the author's voice direct, sober, and domain-specific.
Avoid rewriting into a smooth but content-free voice. A good revision may be less flashy and more constrained.
Common Edits
Use these edits when justified:
| Pattern | Better action |
|---|---|
| "This elegantly solves ..." | State the mechanism that solves the issue. |
| "Therefore, we propose ..." | Explain the gap, then introduce the method plainly. |
| "Extensive experiments demonstrate effectiveness" | Name the datasets, metric, and supported claim. |
| "Theoretically, this enables ..." | State the actual assumption or remove the claim. |
| "Our comprehensive framework ..." | Use the method name or a precise noun. |
| Bullet chain with hidden progression | Convert to a paragraph with causal transitions. |
| Formula after every sentence | Group formulas by operation and add definitions. |
| Algorithm repeats prose | Delete the block or move details into prose. |
| Baseline comparison in Method | Move to Related Work or Experiments. |
| Citation pile after a generic claim | Split the claim and verify each citation's role. |
Final Pass
Before returning a final revision, check:
- The first page says what problem matters, what gap remains, and what the paper contributes.
- Every important symbol is defined before use.
- Module names and abbreviations are consistent.
- Bullets are used only where items are parallel.
- Boldface is rare and meaningful.
- Method does not pre-argue experiment results.
- Citations are either verified, marked for verification, or left unchanged with a warning.
- Experiments, ablations, and analysis all return to the main motivation.
- The language is precise rather than performatively polished.