The main risks in AI-assisted manuscript work are fabricated citations, unsupported claims, sensitive data exposure, unclear authorship, and overclaiming what the analysis can prove. data2paper.ai is designed around reviewability rather than one-click submission.
Reference verification workflow
| Step | Purpose | Researcher check |
|---|---|---|
| Source identification | Find candidate literature through PubMed, DOI records, journals, or agreed databases. | Confirm relevance to the research question and target journal expectations. |
| Citation extraction | Record title, authors, year, journal, DOI, PubMed ID, or other source identifiers where available. | Spot-check important references manually. |
| Claim alignment | Match references to the sentence or claim they support. | Remove references that do not actually support the claim. |
| Format review | Prepare AMA, Vancouver, APA, or target journal formatting when in scope. | Confirm final style and reference manager compatibility. |
Fabricated citations are not accepted. The researcher should still check the final reference list before submission, because citation relevance and interpretation are scientific responsibilities.
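The citation extraction and claim alignment steps above can be partly automated as a pre-check before manual review. The following is a minimal sketch, not the service's actual implementation: the `Reference` fields, the `review_queue` function, the regular expressions, and the sample entries are all hypothetical. It checks identifier syntax only; it does not confirm that a DOI resolves or that a reference actually supports its claim.

```python
import re
from dataclasses import dataclass

# Assumed patterns: a common modern DOI shape and a numeric PubMed ID.
# These check syntax only, not whether the record exists.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
PMID_PATTERN = re.compile(r"^\d{1,8}$")


@dataclass
class Reference:
    key: str          # short citation key used in the manuscript draft
    title: str
    doi: str = ""
    pmid: str = ""


def has_traceable_id(ref):
    """True if the reference carries a well-formed DOI or PubMed ID."""
    return bool(DOI_PATTERN.match(ref.doi) or PMID_PATTERN.match(ref.pmid))


def review_queue(references, claims):
    """Group references that need researcher attention.

    `claims` maps each claim sentence to the citation keys given for it.
    """
    cited = {key for keys in claims.values() for key in keys}
    known = {ref.key for ref in references}
    return {
        "missing_identifier": [r.key for r in references if not has_traceable_id(r)],
        "uncited": [r.key for r in references if r.key not in cited],
        "unknown_keys": sorted(cited - known),
    }


# Placeholder entries for illustration only; these are not real publications.
example_refs = [
    Reference("a", "Placeholder title A", doi="10.1234/example.a"),
    Reference("b", "Placeholder title B"),
    Reference("c", "Placeholder title C", pmid="12345678"),
]
example_claims = {"Placeholder claim sentence.": ["a", "d"]}
example_report = review_queue(example_refs, example_claims)
print(example_report)
```

Anything this kind of check flags still needs a human decision: a well-formed identifier says nothing about relevance, which is why final approval stays with the researcher.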
Data handling principles
Data minimization
The free assessment stage should use metadata only: sample size, variables, study type, outcomes, and expected deliverables. Identifiable patient data should not be sent at this stage.
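One way to keep an assessment request to metadata only is to check it against an allow-list before sending. This is a sketch under assumptions: the field names below are hypothetical labels for the five items listed above, and `unexpected_fields` is an illustrative helper, not part of any published questionnaire format.

```python
# Hypothetical field names mirroring the metadata items listed above.
ALLOWED_FIELDS = {
    "sample_size",
    "variables",
    "study_type",
    "outcomes",
    "expected_deliverables",
}


def unexpected_fields(submission):
    """Return the sorted names of fields that are not on the allow-list."""
    return sorted(set(submission) - ALLOWED_FIELDS)


# Illustrative request: the "patient_rows" entry should be flagged and removed.
draft_request = {
    "sample_size": 120,
    "study_type": "retrospective cohort",
    "variables": ["age_group", "treatment_arm", "outcome"],
    "patient_rows": [],
}
print(unexpected_fields(draft_request))
```

An allow-list is used here rather than a block-list because it fails safe: any field nobody anticipated is flagged by default.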
Written scope
If raw data is required, the project scope should define what data is needed, who can access it, how it is transferred, what outputs are produced, and how data is handled after delivery.
De-identification
Researchers should remove direct identifiers whenever possible and confirm that data use is allowed by consent, ethics approval, institutional policy, and local law.
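Removing direct identifiers can be sketched as dropping named columns before any data leaves the institution. The column names and the `drop_direct_identifiers` helper below are assumptions for illustration. Dropping these columns is only a first step: dates, rare value combinations, and free-text fields can still identify a person, and the step does not replace consent or ethics checks.

```python
# Hypothetical list of direct-identifier column names; adapt to the real dataset.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address", "medical_record_number"}


def drop_direct_identifiers(rows, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the rows without the listed columns (case-insensitive)."""
    lowered = {column.lower() for column in identifiers}
    return [
        {key: value for key, value in row.items() if key.lower() not in lowered}
        for row in rows
    ]


# Illustrative record with made-up values.
raw_rows = [{"Name": "Example Person", "age_group": "40-49", "outcome": 1}]
print(drop_direct_identifiers(raw_rows))
```

The function returns new dictionaries and leaves the input rows unchanged, so the original file can stay inside the institution untouched.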
Least necessary access
Access should be limited to the data needed for the agreed analysis. Any exceptional handling requirement should be stated before work starts.
Do not send:
- Data you are not authorized to process or share.
- Non-de-identified sensitive personal or patient information unless a written agreement explicitly covers it.
- Data restricted by institutional policy, export controls, contract terms, or local law.
- Information that would require clinical, legal, regulatory, or ethics advice beyond the agreed research support scope.
Academic integrity boundaries
- No fabricated data, references, ethics approval, informed consent, or author contributions.
- No ghost authorship, authorship circumvention, plagiarism, or duplicate-submission support.
- No guarantee of acceptance, impact factor, indexing, or publication timeline.
- No clinical diagnosis, treatment advice, patient-care decisions, or regulatory approval advice.
- Final manuscript claims must match the study design, data quality, and statistical evidence.
FAQ
How do you prevent hallucinated references?
References should include a DOI, PubMed ID, journal record, or other traceable identifier where available. The reference list should be checked against the claims it supports. Final citation approval remains with the researcher.
Will raw data be sent to public AI tools?
Raw data handling should follow the written project scope. During assessment, raw data is not required. For paid work, data handling, de-identification, access, and post-delivery treatment should be agreed before work starts.
Who is responsible for the final manuscript?
The researcher is responsible for final scientific interpretation, authorship, disclosures, compliance, and submission decisions.
Start with metadata, not raw sensitive data
For the first assessment, send only project metadata and expected deliverables. We can then decide whether raw data is needed and what written handling terms are required.