Understanding Artificial Intelligence Through the IRB Lens
Artificial intelligence is accelerating faster than many institutions can regulate it. While AI tools offer significant advantages for researchers, they also introduce new ethical, privacy, and data protection challenges that require careful oversight. This is why the Institutional Review Board (IRB) continues to play a critical role in protecting the rights, safety, and welfare of human participants, especially as digital research methods become more common.
This article provides an overview for researchers who are using or planning to use AI tools. It explains how the IRB evaluates artificial intelligence, how internet-mediated research is evolving, and what investigators must consider to keep their research responsible, ethical, and compliant.
The Increasing Use of AI in Human Subjects Research
As internet-mediated research expands, many investigators have turned to AI tools to support their work. Artificial intelligence refers to computer systems designed to perform tasks that normally require human intelligence, such as decision making, pattern recognition, and language understanding, often using techniques like machine learning.
AI as a digital resource is not new, but public use increased significantly when generative AI tools like ChatGPT and Midjourney became widely accessible. These tools allow users to receive fast, refined responses that normally require extensive time, effort, or specialized knowledge to produce.
Researchers may use AI tools to help with generating ideas, analyzing text, organizing information, summarizing large volumes of data, or supporting certain administrative tasks. While these benefits are significant, they also create new ethical questions that must be reviewed by the IRB.
Internet-Mediated Research and Why Oversight Matters
Internet-mediated research has become an important method for collecting data from large and diverse online populations. It offers fast recruitment, broad reach, and flexible data collection methods. However, online environments carry risks that differ from traditional, in-person research.
In-person research involves physical interaction at some stage of the study. Internet-mediated research allows investigators to recruit, communicate, collect data, and analyze results entirely through digital platforms. Although this approach broadens access to participants and data, it also creates challenges: hacking attempts, inaccurate or falsified responses, automated bot submissions, and questions of overall data reliability.
Because online data is more vulnerable to breaches and manipulation, investigators must remain cautious about where data is obtained and how it is stored, transferred, and protected. Research data must be kept on approved, secure platforms, and any new digital tool, whether software or hardware, must undergo review and approval to ensure that privacy, confidentiality, and security standards are fully met. This preparation helps researchers anticipate risks in online environments that are often difficult to predict or regulate.
Bots and Automated Interference in Online Research
One of the most common challenges in internet-mediated research is the presence of bots: automated, often AI-driven programs built to complete tasks quickly, frequently to collect the financial compensation offered by online surveys. Bots can distort research data, compromise study quality, and interfere with participant selection.
Although response patterns can help distinguish human from automated submissions, AI is evolving in ways that make detection more challenging. Basic safeguards, such as attention checks, may reduce bot interference but cannot fully eliminate it. As artificial intelligence continues to advance, researchers must adopt stronger methods of verification and remain aware that automated responses may still bypass screening.
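As a concrete illustration, here is a minimal sketch of the kind of post-hoc screening many survey teams layer on top of attention checks. The field names, thresholds, and the idea of combining several weak signals are assumptions for illustration, not a standard detection method:

```python
# Hypothetical post-hoc bot screening for survey responses.
# Field names and thresholds are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class Response:
    respondent_id: str
    seconds_taken: float         # total completion time
    attention_check_passed: bool
    free_text: str               # an open-ended answer

def bot_risk_score(r: Response, median_time: float) -> int:
    """Count weak signals of automation; higher = more suspicious."""
    score = 0
    if r.seconds_taken < 0.25 * median_time:  # implausibly fast completion
        score += 1
    if not r.attention_check_passed:
        score += 1
    if len(r.free_text.split()) < 3:          # empty or boilerplate text
        score += 1
    return score

def flag_for_review(responses: list[Response]) -> list[str]:
    times = sorted(r.seconds_taken for r in responses)
    median_time = times[len(times) // 2]
    # Flag, never auto-delete: a human reviews every flagged case.
    return [r.respondent_id for r in responses
            if bot_risk_score(r, median_time) >= 2]
```

Note that flagged responses are reviewed by a person rather than dropped automatically, which mirrors the human-oversight expectation discussed later in this article.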
When Does “I’m just using ChatGPT/Whisper/Claude/etc.” Trigger IRB Scrutiny?
Almost always. Here’s why the IRB cares:
| How You’re Using AI | Why the IRB Classifies It as More Than “Just a Tool” |
| --- | --- |
| Transcribing interviews with Whisper | Creates a verbatim record from private, identifiable human data; accuracy affects data integrity and participant privacy |
| Summarizing or coding open-ended responses with LLMs | The model is performing analysis on identifiable or potentially re-identifiable human-subjects data |
| Generating synthetic respondents or mock data | Risk of the model hallucinating realistic but false information that could be treated as real confidential data |
| Using AI to screen or recruit participants | The AI is interacting with, or making decisions about, living individuals |
| Feeding identifiable data (even temporarily) into any third-party API | Data leaves your institution’s control; most cloud AI services are not BAA-signed or FedRAMP-authorized by default |
| Using AI image or audio tools on participant photos or voice recordings | Creates risks of deepfakes, misuse, and non-consensual synthetic media |
| Asking an LLM to “clean” or “anonymize” verbatim quotes | Proper redaction is a research procedure, and LLMs frequently fail at true de-identification (see the sketch after this table) |
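To make the last row concrete, here is a minimal sketch of a rule-based redaction pass, the kind of deterministic step a protocol can describe and an IRB can audit. The patterns cover only a few obvious identifiers and are illustrative assumptions; real de-identification requires a documented procedure and human review of every quote:

```python
import re

# Illustrative patterns only; a real protocol needs a much broader,
# documented set plus human review of every redacted quote.
PATTERNS = {
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "[PHONE]": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "[SSN]":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched identifiers with labeled placeholders."""
    for placeholder, pattern in PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text

print(redact("Call me at 555-867-5309 or jane.doe@example.edu"))
# -> "Call me at [PHONE] or [EMAIL]"
```

Unlike an LLM prompt, this pass is deterministic: the same input always produces the same redaction, which makes it possible to validate, document, and describe precisely in a protocol.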
What the IRB Actually Cares About When You Use AI
Whether an investigator uses large language models to analyze transcripts, uploads data to machine learning systems, or uses AI for automated data extraction, the IRB generally examines four core areas:
1. Human Subjects Protections
The IRB must determine whether AI tools will directly interact with participants or indirectly analyze personal data. Even AI that operates only on recordings or transcripts may still count as human subjects research if the data contain identifiable information.
2. Data Privacy and Security
AI tools often require large datasets, cloud processing, or third-party systems. IRBs will ask:
- Where is the data stored?
- Who has access?
- Does the AI vendor store or reuse the data?
- Is encryption used in transit and at rest? (A sketch of what this can look like follows this list.)
- Does the system comply with HIPAA, FERPA, GDPR, or CCPA where applicable?
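As one illustration of the encryption question above, here is a minimal sketch of encrypting a transcript at rest with the widely used Python `cryptography` package. The file names are hypothetical, and a real deployment would keep the key in an institution-approved secrets manager, which is outside the scope of this sketch:

```python
from cryptography.fernet import Fernet  # pip install cryptography

# In practice the key lives in an approved secrets manager, never
# alongside the data; generating it inline is for illustration only.
key = Fernet.generate_key()
fernet = Fernet(key)

# Encrypt the transcript before it is written to shared storage...
plaintext = open("interview_01.txt", "rb").read()
open("interview_01.txt.enc", "wb").write(fernet.encrypt(plaintext))

# ...and decrypt only on an approved machine at analysis time.
ciphertext = open("interview_01.txt.enc", "rb").read()
assert fernet.decrypt(ciphertext) == plaintext
```

Encryption in transit is usually handled separately, by requiring TLS for any connection that moves the data off the approved environment.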
3. Algorithmic Transparency
IRBs increasingly evaluate how the AI works, especially when used to classify, infer, or generate content about participants. Researchers do not need to reveal proprietary code, but they must explain:
- What the AI does
- What its limitations are
- How errors are handled
- Whether decisions are human-reviewed
This is particularly important in qualitative research where nuance, emotion, and context matter.
4. Risks of Bias
Because AI models can inherit or amplify biases, the IRB may ask:
- How was the model trained?
- Could the AI produce biased results about certain groups?
- Will humans verify the output before conclusions are drawn? (One way to document this is sketched below.)
Bias is a major IRB concern and often leads to requests for clearer mitigation strategies.
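As one concrete way to document human verification, a protocol can report agreement between AI-generated codes and a human coder. Below is a minimal sketch using scikit-learn’s Cohen’s kappa; the codes and labels are invented for illustration:

```python
# Minimal sketch: quantify agreement between AI-generated qualitative
# codes and a human coder. Labels are invented for illustration.
from sklearn.metrics import cohen_kappa_score  # pip install scikit-learn

human_codes = ["barrier", "barrier", "support", "support", "barrier", "other"]
ai_codes    = ["barrier", "support", "support", "support", "barrier", "other"]

kappa = cohen_kappa_score(human_codes, ai_codes)
print(f"Cohen's kappa: {kappa:.2f}")  # ~0.74 for this toy data
```

One common arrangement is to pre-register an agreement threshold; if kappa falls below it, the AI output is treated as a draft and the disputed segments are re-coded by humans.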
Examples from Recent Protocols
- A nursing study using GPT-4 to code open-ended patient feedback → returned because there was no BAA with OpenAI and the consent form did not mention AI processing; fixed in one revision cycle after switching to an on-premises model.
- An education dissertation using Claude to summarize student focus groups → approved after adding consent language and keeping all data in Anthropic’s enterprise VPC instance.
- A qualitative sociology study using Whisper for transcription → initially rejected because the free tier sends data to OpenAI; solved by running the self-hosted Whisper large-v3 model on the university’s HPC cluster (see the sketch below).
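For readers weighing the same trade-off, here is a minimal sketch of what self-hosted transcription can look like with the open-source `whisper` package, where audio never leaves the machine it runs on. The file name is hypothetical, and large-v3 needs a capable GPU to run at reasonable speed:

```python
import whisper  # pip install openai-whisper; runs fully locally

# Load the large-v3 weights once; they download to a local cache,
# and no audio is ever sent to an external API.
model = whisper.load_model("large-v3")

# "interview_01.wav" is a hypothetical file name for illustration.
result = model.transcribe("interview_01.wav")
print(result["text"])
```

Because everything runs on institutional hardware, the data-flow description in the protocol stays simple: audio in, transcript out, nothing leaves the approved environment.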
Common IRB Questions About AI
Researchers often encounter questions such as:
- Does the AI provider use uploaded data to train its algorithms?
- Can the data be permanently deleted?
- Are transcripts or recordings anonymized before AI processing?
- Will human researchers verify AI-generated outputs?
- Can participants opt out of AI analysis?
- How will incidental findings be handled (for example, AI detecting mental health indicators)?
Preparing answers to these questions in advance can streamline the approval process.
IRBs Are Not Anti-AI
Some researchers assume IRBs will block AI tools. That is not the case. IRBs support technical innovation, but they must ensure that:
- Human subjects are protected
- Data is handled responsibly
- Ethical risks are minimized
- Consent forms are transparent
- Research integrity is preserved
When presented clearly and responsibly, many AI-enabled studies receive quick approval.
The Role of Human Oversight in AI-Powered Research
Even with advanced AI, human oversight is essential for:
- Monitoring accuracy
- Handling sensitive interviews
- Evaluating nuance or emotion
- Interpreting cultural context
- Ensuring ethical compliance
For example, in transcription, AI can generate quick drafts, but human transcription services like Qualtranscribe provide the accuracy, cultural context, and confidentiality that IRBs prefer for sensitive or qualitative projects.
Conclusion
Understanding AI through the IRB lens is now essential for any researcher working with human data. AI tools can accelerate research, but they also introduce privacy, bias, and transparency risks that IRBs must evaluate carefully. When researchers proactively explain their AI methods, secure data properly, and maintain human oversight, AI can be used ethically and effectively without delaying IRB approval.