How does data2paper.ai prevent fabricated references?

References should trace back to DOI, PubMed, journal, or other verifiable records where available. Fabricated citations are not accepted, and researchers should verify final citations before submission.

Should identifiable patient data be sent during assessment?

No. The free assessment stage should use metadata such as sample size, variable names, study design, and target deliverables, not identifiable patient data.

Reference Verification and Data Security

The main risks in AI-assisted manuscript work are fabricated citations, unsupported claims, sensitive data exposure, unclear authorship, and overclaiming what the analysis can prove. data2paper.ai is designed around reviewability rather than one-click submission.

Reference verification workflow

Step	Purpose	Researcher check
Source identification	Find candidate literature through PubMed, DOI records, journals, or agreed databases.	Confirm relevance to the research question and target journal expectations.
Citation extraction	Record title, authors, year, journal, DOI, PubMed ID, or other source identifiers where available.	Spot-check important references manually.
Claim alignment	Match references to the sentence or claim they support.	Remove references that do not actually support the claim.
Format review	Prepare AMA, Vancouver, APA, or target journal formatting when in scope.	Confirm final style and reference manager compatibility.

Fabricated citations are not accepted. The researcher should still check the final manuscript references before submission, because judging citation relevance and interpretation is a scientific responsibility.
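As an illustration of the citation extraction step in the table above, the following minimal Python sketch flags reference records that carry no well-formed DOI or PubMed ID. It is not data2paper.ai's implementation. The record fields (`title`, `doi`, `pmid`), the function names, and the identifier patterns are assumptions made for this example, and the sample DOI and PMID are placeholders rather than real references.

```python
import re

# Assumed patterns: a DOI of the common "10.<registrant>/<suffix>" shape,
# and a PubMed ID written as a short run of digits.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
PMID_PATTERN = re.compile(r"^\d{1,8}$")


def traceable_identifiers(record: dict) -> list[str]:
    """Return the names of identifier fields that look well-formed."""
    found = []
    doi = str(record.get("doi") or "").strip()
    if DOI_PATTERN.match(doi):
        found.append("doi")
    pmid = str(record.get("pmid") or "").strip()
    if PMID_PATTERN.match(pmid):
        found.append("pmid")
    return found


def flag_untraceable(records: list[dict]) -> list[str]:
    """Return titles of records with no well-formed DOI or PMID."""
    return [
        r.get("title", "<untitled>")
        for r in records
        if not traceable_identifiers(r)
    ]


# Placeholder records for illustration only; these are not real citations.
sample_records = [
    {"title": "Placeholder study A", "doi": "10.1234/placeholder.001"},
    {"title": "Placeholder study B", "pmid": "12345678"},
    {"title": "Placeholder study C", "doi": "not-a-doi"},
]

for title in flag_untraceable(sample_records):
    print(f"Needs manual source check: {title}")
```

A well-formed identifier only shows that a record has the right shape. It does not show that the reference exists or that it supports the claim it is attached to, so resolving the identifier against the source record and the manual spot-check remain necessary.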

Data handling principles

Data minimization

The free assessment stage should use metadata only: sample size, variables, study type, outcomes, and expected deliverables. Identifiable patient data should not be sent at this stage.
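One way a researcher might check an assessment submission before sending it is to compare its fields against an allow-list of metadata items. The sketch below assumes the submission is a simple dictionary. The field names mirror the list above but are hypothetical, as is the function name.

```python
# Hypothetical field names that mirror the metadata items listed above.
ALLOWED_METADATA_FIELDS = {
    "sample_size",
    "variables",
    "study_type",
    "outcomes",
    "expected_deliverables",
}


def unexpected_fields(submission: dict) -> list[str]:
    """Return, in sorted order, any fields that are not on the allow-list."""
    return sorted(set(submission) - ALLOWED_METADATA_FIELDS)


# Illustrative submission; "patient_names" should not be sent at this stage.
draft = {
    "sample_size": 240,
    "variables": ["age_group", "treatment_arm", "outcome_score"],
    "study_type": "retrospective cohort",
    "patient_names": ["..."],
}

extra = unexpected_fields(draft)
if extra:
    print("Remove before sending:", ", ".join(extra))
```

An allow-list is used here rather than a block-list because it fails safe: any field not explicitly expected is flagged, including identifiers nobody anticipated.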

Written scope

If raw data is required, the project scope should define what data is needed, who can access it, how it is transferred, what outputs are produced, and how data is handled after delivery.

De-identification

Researchers should remove direct identifiers whenever possible and confirm that data use is allowed by consent, ethics approval, institutional policy, and local law.
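Removing direct identifiers from tabular data can be sketched as dropping known identifier columns before any file leaves the researcher's environment. The column names below are hypothetical examples, not an exhaustive or authoritative list, and the function name is an assumption for illustration.

```python
# Hypothetical direct-identifier column names; adjust to the actual dataset.
DIRECT_IDENTIFIERS = {
    "name",
    "medical_record_number",
    "national_id",
    "phone",
    "email",
    "address",
    "date_of_birth",
}


def strip_direct_identifiers(rows, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the rows without the listed identifier columns.

    Column names are compared case-insensitively; the input is not modified.
    """
    blocked = {column.lower() for column in identifiers}
    return [
        {key: value for key, value in row.items() if key.lower() not in blocked}
        for row in rows
    ]


# Illustrative rows with made-up values.
raw_rows = [
    {"Name": "Example Person", "email": "person@example.org",
     "age_group": "40-49", "outcome_score": 7},
]

print(strip_direct_identifiers(raw_rows))
```

Dropping these columns is only a first step. Combinations of remaining fields, such as rare diagnoses with exact dates or small geographic areas, can still identify individuals, so the consent, ethics, and policy checks described above still apply.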

Least necessary access

Access should be limited to the data needed for the agreed analysis. Any exceptional handling requirement should be stated before work starts.

Do not upload or send:

Data you are not authorized to process or share.
Non-de-identified sensitive personal or patient information unless a written agreement explicitly covers it.
Data restricted by institutional policy, export controls, contract terms, or local law.
Information that would require clinical, legal, regulatory, or ethics advice beyond the agreed research support scope.

Academic integrity boundaries

No fabricated data, references, ethics approval, informed consent, or author contributions.
No ghost authorship, authorship circumvention, plagiarism, or duplicate-submission support.
No guarantee of acceptance, impact factor, indexing, or publication timeline.
No clinical diagnosis, treatment advice, patient-care decisions, or regulatory approval advice.
Final manuscript claims must match the study design, data quality, and statistical evidence.

FAQ

How do you prevent hallucinated references?

References should include DOI, PubMed, journal, or other traceable identifiers where available. The reference list should be checked against the claims it supports. Final citation approval remains with the researcher.

Will raw data be sent to public AI tools?

Raw data is not required during assessment. For paid work, raw data handling follows the written project scope: data handling, de-identification, access, and post-delivery treatment should be agreed before work starts.

Who is responsible for the final manuscript?

The researcher is responsible for final scientific interpretation, authorship, disclosures, compliance, and submission decisions.

Start with metadata, not raw sensitive data

For the first assessment, send only project metadata and expected deliverables. We can then decide whether raw data is needed and what written handling terms are required.
