Practical AI Assurance Workshop

Build organisational capability for testing and evaluating generative AI.

A hands-on workshop designed for organisations seeking to deploy Large Language Models (LLMs) and generative AI applications with confidence, quality, and trust.

This practical half-day workshop equips technical teams, AI practitioners, and technology leaders with proven techniques for testing and evaluating LLMs and related generative AI applications before and after deployment.

Combining expert guidance with practical exercises, participants will learn how to assess AI quality, identify risks, implement effective testing strategies, and establish evidence-based assurance processes that support responsible AI adoption and business outcomes.

What You'll Learn

Key risks and quality characteristics of LLMs
Evaluation metrics and practical approaches for assessing LLM performance
How to create automated tests that detect:
- Hallucinations
- Bias and fairness issues
- Safety risks
- Privacy concerns
Monitoring and evaluating LLMs in production environments
Collecting evidence and reporting outcomes as part of an AI governance and assurance process
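
As a flavour of what such automated tests can look like, here is a minimal Python sketch, not the workshop's own material. It assumes two deliberately crude checks: a regular-expression scan for email addresses and phone-like strings (a privacy signal) and a comparison of numbers in an answer against a source text (a hallucination signal). The names `find_pii` and `ungrounded_numbers` and both patterns are hypothetical and chosen for illustration only.

```python
import re

# Hypothetical patterns for illustration; real PII detection needs far broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def find_pii(text: str) -> list[str]:
    """Return email addresses and phone-like strings found in the text."""
    return EMAIL_RE.findall(text) + PHONE_RE.findall(text)


def ungrounded_numbers(answer: str, source: str) -> list[str]:
    """Return numbers stated in the answer that never appear in the source.

    A crude hallucination signal: figures the model asserts without support.
    """
    source_numbers = set(NUMBER_RE.findall(source))
    return [n for n in NUMBER_RE.findall(answer) if n not in source_numbers]


if __name__ == "__main__":
    output = "Contact jane@example.com or call +61 400 123 456"
    print(find_pii(output))

    source = "Revenue rose to 340 million in 2023."
    answer = "Revenue grew 12% to 340 million in 2023."
    print(ungrounded_numbers(answer, source))
```

Checks like these are cheap enough to run on every model response in a test suite; anything they flag would normally go to a stronger detector or a human reviewer rather than being treated as a verdict.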

Key Information

Who should attend:

AI Engineers and Developers
Software Testers and Quality Engineers
Technical Leads and Architects
Data Scientists
Product Owners and Delivery Managers
Risk and Governance Professionals
Technology Leaders responsible for AI adoption

Workshop Topics

Session 1: Evaluating Large Language Models

Introduction to Large Language Models

Overview of LLMs: Definition, evolution, and significance
Patterns for generative AI applications
Key characteristics of LLMs: Scale, capabilities, and common applications

Testing Large Language Models

The importance of testing in LLM development
Types of tests for LLMs: Functional testing, performance testing, safety testing
Test case design: Considerations for comprehensive and effective test cases

Evaluation Metrics and Techniques

Common metrics for evaluating LLMs
Understanding qualitative evaluations: Human judgment, interpretability, and explainability
Challenges in evaluating LLMs: Subjectivity, bias, and ethical considerations
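
To make "common metrics" concrete, the sketch below shows two reference-based scores often used for question-answering outputs: exact match and token-level F1. It is an illustrative example rather than the workshop's prescribed metric set, and the normalisation (lower-casing and whitespace splitting) is a simplifying assumption; the function names are hypothetical.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lower-case and split on whitespace (a deliberately simple normalisation)."""
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> bool:
    """True when prediction and reference are identical after normalisation."""
    return normalize(prediction) == normalize(reference)


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall against the reference."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a perfect match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("Paris", "paris"))
    print(token_f1("the capital is Paris", "Paris"))
```

Scores like these reward surface overlap only, which is one reason the qualitative evaluations and human judgment listed above remain part of the picture.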

Practical Exercise

Hands-on activity: Designing test cases for a given LLM scenario
Group discussion: Sharing insights and strategies for effective LLM testing

Session 2: Practical Techniques & Best Practices in LLM Testing

Advanced Testing Techniques

Techniques for uncovering biases and ensuring fairness in LLMs
Testing for robustness: Adversarial testing, stress testing
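
One simple form of robustness testing is a perturbation check: rephrase a prompt in ways that should not change the answer and measure how often the model's output stays the same. The Python sketch below illustrates the idea under stated assumptions. The model is any callable from prompt to text, `toy_model` is a hypothetical stand-in that is brittle on purpose, and the four perturbations are arbitrary examples rather than a recommended set.

```python
from typing import Callable


def perturbations(prompt: str) -> list[str]:
    """Return surface-level variants that should not change the answer."""
    return [
        prompt.upper(),                       # change of case
        "  " + prompt + "  ",                 # leading and trailing whitespace
        prompt.replace(" ", "  "),            # doubled spaces
        prompt + " Please answer briefly.",   # harmless extra instruction
    ]


def consistency_rate(model: Callable[[str], str], prompt: str) -> float:
    """Fraction of perturbed prompts whose answer matches the baseline answer."""
    baseline = model(prompt).strip().lower()
    variants = perturbations(prompt)
    matches = sum(model(v).strip().lower() == baseline for v in variants)
    return matches / len(variants)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; fails on upper-case or re-spaced input."""
    if "2 + 2" in prompt and not prompt.isupper():
        return "4"
    return "unsure"


if __name__ == "__main__":
    print(consistency_rate(toy_model, "What is 2 + 2?"))
```

The same harness can support fairness checks by swapping the perturbations for counterfactual edits, such as changing only a name or demographic term, and comparing the outputs.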

Optimising LLMs Through Iterative Testing

The role of iterative development in refining LLMs
Incorporating feedback loops: Continuous testing and refinement
Case studies: Examples of iterative testing and optimisation in LLMs

Best Practices and Future Directions

Best practices in testing and evaluating LLMs
Emerging trends and future challenges in LLM testing
Ethical considerations and responsible AI

Interactive Session

Applying Advanced Testing Techniques

Wrap-Up

Key Takeaways
Resources and Next Steps

What's Included?

Expert-Led Workshops

Learn directly from recognised leaders in AI assurance, software quality, and trusted AI adoption.

Interactive Learning

Participate in practical exercises designed to help you apply testing and evaluation techniques in real-world scenarios.

Customised Learning Pathways

Build practical AI capability through training shaped to your team’s roles, responsibilities, and current challenges. We tailor the content to support confident, responsible AI adoption.

This program is delivered by KJR

KJR is a founding member of the Queensland AI Hub

KJR is an Australian software quality engineering consultancy and a leading practitioner in Trusted AI Adoption. Drawing on nearly 30 years of experience in quality assurance and verification, we help organisations unlock real business value from AI. We ensure deployments are compliant, ethical, and strategically aligned to drive innovation, efficiency, and growth.

Your Trainers

Dr. Mark Pedersen

KJR CTO
ACS AI in Society Committee Member

Mark is an IT professional with a passion for digital culture. As CTO, Mark’s technical knowledge and lateral thinking abilities are counted on to lead critical software projects safely out of the red and into the light. And, as a world-class software risk analyst and advisor, he thrives on the satisfaction of bettering lives with technology that actually works.