AI Jailbreak & Prompt-Injection Security Expert

$99,999 yearly

Job Description

Role Type: Contractor

Location: Remote

Scope of Work

Design and implement advanced methodologies for evaluating AI system safety, focusing on ethical jailbreaks, LLM red teaming, prompt injection, and tool-use abuse scenarios.
Create comprehensive cross-domain elicitation strategies to uncover multi-turn and complex adversarial bypass patterns in AI models.
Develop and maintain regression test suites that systematically check for jailbreak susceptibility and prompt-injection vulnerabilities.
Construct evaluation frameworks that stress-test AI models against real-world adversarial threats, with the goal of improving overall system robustness.
Collaborate with technical stakeholders to translate security findings into actionable improvements for model safety and risk mitigation.
Document methodologies, findings, and best practices in clear, well-structured written reports and presentations for both technical and non-technical audiences.


Preferred Qualifications

5+ years of experience in adversarial machine learning, LLM red teaming, AI safety evaluation, or a closely related security domain; 8–20 years preferred for senior contributors.
Proven experience researching, testing, or uncovering vulnerabilities related to ethical jailbreaks, prompt injection, tool-use abuse, or adversarial AI attacks.
Advanced degree (PhD or MS) in computer science, cybersecurity, machine learning, or a relevant discipline, or equivalent operational or professional experience.
High credibility and recognition within the AI security or adversarial ML community, demonstrated through published research, open-source tools, or conference presentations.
Exceptional written and verbal communication skills, with a strong focus on clear documentation and collaborative problem-solving.
Prior participation in multi-disciplinary projects or cross-functional AI safety initiatives is a plus.
Familiarity with current LLM architectures, prompt engineering techniques, and security assessment tools is highly desirable.