Why AI Governance Is Now a Testing Problem
This shift is explored in Episode 24 of KJR’s Trusted AI podcast, where KJR ACT General Manager Andrew Hammond sits down with Tony Allen, Executive Director of the Age Check Certification Scheme.
This isn’t just a theoretical conversation. KJR has worked directly with ACCS as part of the Australian Government’s Age Assurance Technology Trial, evaluating real-world technologies, including AI-based systems designed to estimate a user’s age. That experience brings a practical lens to the discussion, grounded in testing how these systems perform under real conditions, not just how they’re designed to behave.
The episode takes stock of the current realities of trusted AI adoption, cutting through the hype to examine what has actually worked, what surprised practitioners on the ground, and where risks began to surface. It also looks ahead, highlighting how AI is evolving across regulation, implementation, and everyday use.
Through this conversation, a clear theme emerges: the evolution of AI isn’t just a technical story; it’s operational, regulatory, and deeply human. And for those leading quality engineering and testing across Australia, one message stands out: AI governance is no longer a compliance exercise; it’s a core testing responsibility.
“Just because something is flagged as AI-enabled… doesn’t necessarily mean that we’re ready for it, or it’s ready for us.” – Andrew Hammond, General Manager, KJR ACT
The Illusion of “AI-Enabled”
One of the more telling observations from Tony Allen cuts straight through the noise:
“We are seeing a lot of things described as AI-enabled… when in reality, they’re not.”
This isn’t just a marketing problem; it’s a testing problem.
When products are labelled as AI-driven without actually incorporating adaptive or learning components, it creates confusion in validation strategies. Test teams are left asking:
- Are we testing a model or a rule-based system?
- Should we expect variability in outputs?
- Where does accountability sit when something goes wrong?
For QA leaders, this reinforces the need for clear AI governance definitions within delivery environments. Without that clarity, teams risk applying the wrong testing approaches to the wrong systems.
AI Fails Differently, So Testing Must Evolve
Traditional systems fail predictably. AI systems don’t.
This is illustrated with a simple but powerful example: an AI trained to validate passports may accept an image with a dog’s face, because it was never trained to recognise that as invalid.
This is not a defect in the conventional sense. It’s a limitation of training.
For testing professionals, this shifts the focus away from expected outcomes and toward unexpected behaviour.
What This Means in Practice:
- You’re not just validating correctness; you’re probing boundaries
- Edge cases are no longer optional; they are essential
- Test scenarios must include what the system hasn’t seen before
This is where testing AI systems becomes fundamentally different from traditional software testing.
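To make this concrete, here is a minimal sketch of an out-of-distribution test in the spirit of the passport example. The validate_passport_photo() wrapper, module name, and fixture files are hypothetical stand-ins for whatever interface the system under test actually exposes:

```python
import pytest

# Hypothetical wrapper around the system under test; returns True if the
# image is accepted as a valid passport photo.
from passport_checker import validate_passport_photo

# Inputs the model plausibly never saw in training: these probe the boundary
# between "invalid photo" and "never seen before".
OUT_OF_DISTRIBUTION_CASES = [
    "fixtures/dog_face.jpg",      # the episode's example: a non-human face
    "fixtures/cartoon_face.png",  # synthetic rather than photographic
    "fixtures/blank_page.jpg",    # no face at all
    "fixtures/two_faces.jpg",     # ambiguous subject
]

@pytest.mark.parametrize("image_path", OUT_OF_DISTRIBUTION_CASES)
def test_rejects_out_of_distribution_input(image_path):
    # The expected behaviour is rejection: accepting any of these is a
    # training-coverage gap, not a conventional functional defect.
    assert validate_passport_photo(image_path) is False
```

The point is not these four images; it’s that the suite deliberately includes inputs from outside the system’s training distribution, rather than only the cases it was designed to handle.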
Data Quality Is the Foundation of Trust
In AI systems, data is not just an input; it is the system. Poor data quality doesn’t result in isolated defects. It creates systemic issues:
- Bias in decision-making
- Blind spots in edge cases
- Inconsistent or misleading outputs
The passport example above highlights a deeper truth: if the model isn’t trained to detect something, it won’t, even if it seems obvious to a human.
For Australian organisations, this raises governance questions that testing teams can’t ignore:
- Who is accountable for training data quality?
- How is that data validated and refreshed?
- Are test datasets representative of real-world complexity?
Software quality assurance must now extend upstream into data assurance.
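What can that look like in practice? Below is a minimal sketch of an upstream data-assurance check that flags under-represented groups in a test dataset. The file path, column names, and 5% threshold are illustrative assumptions, not a standard:

```python
import pandas as pd

MIN_SHARE = 0.05  # illustrative floor: every group needs at least 5% coverage

def underrepresented_groups(df: pd.DataFrame, column: str) -> list[str]:
    """Return the groups in `column` whose share of the dataset falls below MIN_SHARE."""
    shares = df[column].value_counts(normalize=True)
    return [str(group) for group, share in shares.items() if share < MIN_SHARE]

# Hypothetical labelled test set for an age-estimation system.
df = pd.read_csv("age_test_set.csv")

for col in ("age_band", "skin_tone", "lighting_condition"):
    gaps = underrepresented_groups(df, col)
    if gaps:
        # A blind spot in the test data becomes a blind spot in the verdict.
        print(f"WARNING: {col} groups under {MIN_SHARE:.0%} coverage: {gaps}")
```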
Automation Bias: The Risk No One Is Testing For
Beyond technical limitations, there’s a human risk that’s harder to detect: automation bias.
AI systems are inherently persuasive. They present outputs with confidence, often reinforcing user assumptions.
“It’s constantly reassuring the user that they’re on the right track.” – Tony Allen, Executive Director of the Age Check Certification Scheme
This creates a feedback loop where users:
- Trust outputs without sufficient scrutiny
- Overestimate the system’s capability
- Fail to challenge incorrect results
In high-stakes environments such as legal, healthcare, or compliance settings, this can have serious consequences.
This introduces a new dimension:
- How do you test not just the system, but how people interact with it?
- How do you design validation strategies that account for over-trust?
This is where AI governance intersects with human behaviour, and where testing must adapt accordingly.
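One way to make over-trust visible is to instrument the human side of the loop. Here is a minimal sketch, assuming a hypothetical review log that captures the model’s decision and confidence alongside the reviewer’s final call:

```python
import pandas as pd

# Hypothetical log with columns: model_decision, confidence, human_decision.
log = pd.read_csv("review_log.csv")

# An override is any case where the reviewer disagreed with the model.
log["overridden"] = log["model_decision"] != log["human_decision"]

# Override rate per model-confidence band.
bands = pd.cut(log["confidence"], bins=[0.0, 0.5, 0.8, 1.0])
override_rate = log.groupby(bands, observed=True)["overridden"].mean()
print(override_rate)

# An override rate near zero in EVERY band is a warning sign: either the
# model is always right, or reviewers have stopped scrutinising it.
```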
Standards Are Coming, and They Will Change Delivery
The emergence of ISO 42001 (AI Management Systems) signals a shift toward structured governance.
Standards typically evolve along a familiar path:
- Early adopters implement them
- Competitors follow
- Procurement begins requiring them
- They become industry baseline
“The thing that really kicks it off is when it starts to be specified in procurement.” - Tony Allen, Executive Director of the Age Check Certification Scheme
For those in DevOps, test automation, and delivery leadership, this has real implications:
- Governance requirements will become part of CI/CD pipelines
- Evidence of compliance will need to be testable and auditable
- Quality gates will extend beyond functionality into accountability
This is not a future concern; it’s already starting to appear in procurement conversations across Australia.
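What might a testable, auditable governance control look like inside a pipeline? Below is a minimal sketch of a release gate, assuming the evaluation stage writes a metrics.json artefact. The metric names and thresholds are illustrative; a real gate would draw them from the organisation’s governance policy:

```python
import json
import sys

# Illustrative thresholds; in practice these come from the governance policy.
GATES = {
    "accuracy": 0.95,              # overall correctness floor
    "worst_group_accuracy": 0.90,  # fairness floor across demographic groups
}

with open("metrics.json") as f:  # artefact written by the evaluation stage
    metrics = json.load(f)

failures = [name for name, floor in GATES.items() if metrics.get(name, 0.0) < floor]

if failures:
    print(f"Governance gate FAILED: {failures} below threshold")
    sys.exit(1)  # a non-zero exit fails the pipeline stage and blocks release

print("Governance gate passed; metrics retained as audit evidence")
```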
DevOps and Test Automation Must Expand
As AI becomes embedded in delivery pipelines, DevOps practices must evolve to support the full AI lifecycle.
This includes:
- Continuous validation of model outputs
- Monitoring for model drift over time (see the sketch after this list)
- Embedding governance checks into automated pipelines
- Testing under real-world variability, not just controlled conditions
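Drift monitoring, for example, can start with something as simple as comparing the score distribution the model produced at validation time against its recent production scores. A minimal sketch using a two-sample Kolmogorov-Smirnov test follows; the artefact paths and alert threshold are illustrative assumptions:

```python
import numpy as np
from scipy.stats import ks_2samp

# Hypothetical artefacts: scores saved during release testing vs. scores
# exported from production monitoring.
baseline_scores = np.load("validation_scores.npy")
production_scores = np.load("last_7_days_scores.npy")

stat, p_value = ks_2samp(baseline_scores, production_scores)

# A small p-value means the two distributions differ more than chance allows:
# the inputs, or the world, have shifted since the model was validated.
if p_value < 0.01:
    print(f"Possible drift (KS statistic={stat:.3f}, p={p_value:.2e}); trigger revalidation")
```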
Traditional performance testing also needs to adapt.
It’s no longer just about response times; it’s about:
- Consistency of outputs under load
- Latency in AI inference (especially for edge and Vision AI systems)
- Behaviour when inputs fall outside expected patterns
Without these capabilities, organisations risk deploying systems they don’t fully understand.
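As one illustration of what “consistency of outputs under load” can mean for an AI service, the sketch below submits the same input concurrently and checks both latency and output stability. The endpoint, payload format, and response shape are hypothetical:

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

ENDPOINT = "https://inference.example.internal/classify"  # hypothetical

def call_once(_: int) -> tuple[str, float]:
    start = time.perf_counter()
    with open("fixtures/sample.jpg", "rb") as f:
        resp = requests.post(ENDPOINT, files={"image": f}, timeout=30)
    return resp.json()["label"], time.perf_counter() - start

# 500 identical requests, 50 at a time.
with ThreadPoolExecutor(max_workers=50) as pool:
    results = list(pool.map(call_once, range(500)))

labels = {label for label, _ in results}
latencies = sorted(latency for _, latency in results)

# Same input under the same load window: more than one distinct label means
# outputs are not stable under concurrency.
print(f"Distinct labels: {labels}")
print(f"p95 latency: {latencies[int(len(latencies) * 0.95)]:.3f}s")
```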
Vision AI and the Real World
The rise of Vision AI and assistive technologies is one of the most promising, and most risky, developments.
From smart glasses that describe environments to AI-enabled accessibility tools, these systems interact directly with the physical world. The episode highlights the potential, particularly for people who are visually impaired, for whom these technologies can be transformative in enabling greater independence and real-world exploration.
But with that potential comes responsibility. Testing must now account for:
- Environmental variability (lighting, movement, obstructions)
- Contextual accuracy (is the system interpreting correctly?)
- Safety implications (what happens if it gets it wrong?)
This is where AI governance becomes critical, because the cost of failure is no longer just technical; it’s human.
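Testing for environmental variability can start by systematically perturbing known-good inputs. A minimal sketch follows, where describe_scene() stands in for whatever interface the Vision AI system exposes; the function, fixture, and perturbation values are all hypothetical:

```python
from PIL import Image, ImageEnhance, ImageFilter

from vision_assist import describe_scene  # hypothetical system under test

base = Image.open("fixtures/street_crossing.jpg")

# Approximate conditions the camera will meet in the real world.
variants = {
    "baseline": base,
    "low_light": ImageEnhance.Brightness(base).enhance(0.3),
    "glare": ImageEnhance.Brightness(base).enhance(1.8),
    "blur": base.filter(ImageFilter.GaussianBlur(radius=4)),
}

for condition, image in variants.items():
    # For safety-relevant output, a degraded condition should produce a
    # cautious answer, never a confidently wrong one.
    print(f"{condition}: {describe_scene(image)}")
```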
AI Strategy Without Governance Is a Risk
Many organisations are moving quickly to define their AI strategy, but governance often lags behind.
This gap creates exposure:
- Systems deployed without clear accountability
- Inconsistent testing approaches
- Limited understanding of risk boundaries
AI strategy must be built on a foundation of governance, and governance must be validated through testing.
What This Means for QA and Testing Leaders
For test managers, QA leads, and senior practitioners across Australia, the role is evolving.
1. Redefine Quality
Quality now includes:
- Trustworthiness
- Transparency
- Ethical behaviour
- Data integrity
2. Treat AI Governance as Testable
Governance is not documentation; it’s something that must be:
- Verified
- Measured
- Continuously monitored
3. Expand Testing Practices
- Introduce adversarial testing for AI systems
- Validate training and test datasets
- Test for edge cases and unknown scenarios
4. Align with Emerging Standards
Standards like ISO 42001 will shape:
- Procurement requirements
- Delivery expectations
- Regulatory compliance
The Shift Ahead
AI is no longer a feature sitting on top of systems; it is embedded within them. It influences decisions, behaviours, and outcomes in ways that are not always visible or predictable. And that changes the role of testing. The challenge is no longer just building AI; it’s building AI that can be trusted.
For KJR and the broader quality engineering community, this presents a defining opportunity: to lead the transition from hype to accountability, and to ensure that as AI scales, quality scales with it.
The implications of AI governance are not confined to a single use case; they’re reshaping how quality, risk, and accountability are managed across industries. Explore how these challenges are playing out in different sectors and what it means for organisations navigating AI adoption at scale.
Partner with KJR to ensure your AI is tested, trusted, and ready for the real world.
Frequently Asked Questions (FAQs)
Why is AI governance now a testing problem?
Because governance principles, like fairness, transparency, and accountability, can’t remain theoretical. They must be validated in practice through testing, monitoring, and measurable controls across the AI lifecycle.

How does testing AI systems differ from testing traditional software?
Traditional systems are deterministic and predictable. AI systems are probabilistic and adaptive. This means testing must focus less on fixed expected outputs and more on edge cases, unexpected behaviours, and how systems perform under uncertainty.

Why does it matter if a product is mislabelled as AI-enabled?
Mislabelling creates confusion in testing strategies. Teams may apply AI-specific validation methods to rule-based systems, or miss critical risks in genuine AI systems, leading to gaps in quality, accountability, and governance.

Why is data quality so critical in AI systems?
In AI systems, data effectively is the system. Poor-quality or biased data leads to systemic issues, not isolated defects, impacting decision-making, fairness, and reliability at scale.

What is automation bias, and why does it matter?
Automation bias is the tendency for users to over-trust AI outputs. It matters because even technically “correct” systems can create risk if users don’t question results, making human interaction a key part of what needs to be tested.

How can testing teams uncover an AI system’s blind spots?
By introducing adversarial and exploratory testing approaches, deliberately feeding systems inputs they weren’t trained on to uncover blind spots, limitations, and failure modes.

What role will standards like ISO 42001 play?
They provide structured frameworks for managing AI responsibly. Over time, these standards are likely to become procurement requirements, meaning organisations will need to demonstrate compliance through auditable testing processes.

How do DevOps and CI/CD pipelines need to change for AI?
Pipelines must expand to include model validation, data quality checks, drift monitoring, and governance controls, ensuring AI systems remain reliable and compliant over time, not just at release.

Why are Vision AI systems especially challenging to test?
They operate in unpredictable environments. Testing must account for real-world variability, lighting, movement, context, and safety implications, where failures can directly impact people, not just systems.

What risks do organisations face if AI strategy outpaces governance?
They risk deploying systems with unclear accountability, inconsistent testing, and unmanaged risks. Governance provides the structure needed to ensure AI is safe, trustworthy, and aligned with regulatory expectations.

What did the Age Assurance Technology Trial reveal about AI testing?
It highlights that AI systems often behave differently outside controlled environments. Testing must reflect real-world complexity, diverse users, edge cases, and unpredictable inputs, to ensure systems are truly fit for purpose.
Case Studies

Local Government Authority ArcGIS Platform
Independent testing assured ArcGIS scalability during peak flood events, giving Council confidence when performance mattered most.

Chatbot Widget Testing for Water Authority
KJR assured a secure, reliable omni-channel chatbot experience, validating MFA identity checks, seamless handoffs, and data integrity to provide smarter, connected, self-service digital interactions.

Test Automation Framework for Water Corporation
KJR implemented a test automation framework for a government-owned retail water corporation, delivering faster, more reliable software releases, reduced manual testing, and improved accuracy across complex integrated systems.

State Retail Water Corporation
KJR was engaged by a major retail water corporation to test and validate an AI-enabled IVR. The end result: a flawless deployment with zero defects, proven resilience, and verified transcription accuracy.

Local Government
A large Australian local government organisation engaged KJR to plan, execute, and report on performance testing for its new corporate website, ensuring it could reliably handle peak traffic, particularly during emergencies.

Government owned retail water corporation
A government-owned retail water corporation providing essential services engaged KJR to apply its enterprise software quality engineering expertise to the corporation’s Maximo regression testing challenges.

Large Government Health Department
KJR upgraded a government health department’s LoadRunner testing platform, resolving performance issues and improving system stability. KJR’s efforts ensured the department could deliver resilient, high-performance healthcare services without disruption.

Government body responsible for running state elections
This large state government statutory authority, which runs state, local council, and statutory elections, faced slow performance when processing returned envelope barcodes, resulting in delays to determining election outcomes. KJR was engaged to deliver performance testing on its Election Management System.
