Making Artificial Intelligence Real: Lessons Learned from Georgia

In our previous blog^[1], published on September 3, 2024, we discussed Georgia’s efforts to develop Artificial Intelligence (AI) prototypes for managing the risk of erroneous treasury payments. Since then, Georgia has made notable progress in implementing the solution. On March 15, 2025, the Treasury went live with its AI model^[2] designed to flag high-risk payment transactions (the “red channel”).^[3] Over the first three months of deployment (March–May), the model reviewed over 800,000 payment transactions. Benchmarked against the decisions of Treasury staff on whether an invoice was risky, its performance showed promising results:

True Positives (95.2%) – The model cleared the payment, consistent with expert review.
False Negatives (3.9%) – Reflecting conservative behavior, the model flagged some transactions that Treasury staff ultimately approved.
False Positives (0.8%) – The model cleared some transactions that Treasury staff found problematic, indicating areas for further refinement of the model.
True Negatives (0.1%) – The model flagged payments that Treasury staff also found problematic, confirming its ability to detect them.

With over 95 percent of its decisions on erroneous payment orders aligning with those of experienced Treasury officers, the model shows its potential to meaningfully support expert review. Continued training of the model should help close the remaining gaps. Today, the model operates in real time under continuous monitoring, and it is periodically retrained and updated with new data based on its performance.
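The "over 95 percent" figure can be reproduced from the four shares listed above. The short sketch below is only an illustration of that arithmetic. It assumes, as the False Negatives description suggests, that a flagged payment is the "negative" class; the function names are ours, not the Treasury's.

```python
# Shares reported for March-May 2025, in percent of reviewed transactions.
shares = {"TP": 95.2, "FN": 3.9, "FP": 0.8, "TN": 0.1}


def agreement_rate(s):
    """Percent of transactions where the model and Treasury staff agreed."""
    return s["TP"] + s["TN"]


def disagreement_rate(s):
    """Percent of transactions where the model and Treasury staff differed."""
    return s["FN"] + s["FP"]


def model_flagged_rate(s):
    """Percent flagged by the model.

    Assumption: 'negative' means 'flagged', inferred from the description
    of False Negatives as flagged transactions that were later approved.
    """
    return s["FN"] + s["TN"]


print("total       :", round(sum(shares.values()), 1))
print("agreement   :", round(agreement_rate(shares), 1))
print("disagreement:", round(disagreement_rate(shares), 1))
print("flagged     :", round(model_flagged_rate(shares), 1))
```

Under that assumption, most disagreements come from the model flagging payments that staff later approved, which matches the "conservative behavior" noted above.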

Challenges Encountered

The journey from prototype to production was, however, not without obstacles. Several practical and institutional challenges had to be addressed along the way:

System performance was an early hurdle, especially as the volume of payment transactions increased. Running and testing the model under real-world conditions sometimes led to slower transaction processing or temporary disruptions that needed resolution.
Refining the AI logic was essential to improve its credibility. In some cases, the model’s responses were unclear or inconsistent, particularly due to gaps in the training data.
Maintenance of the model was a key consideration. As the payment environment evolves, the model's accuracy can decline over time. This means regular retraining is necessary to keep it aligned with new data and emerging patterns.
Making AI decisions explainable remains a complex challenge. While the system can reliably flag suspicious transactions, it does not always clearly explain why a payment was deemed risky. The Treasury is currently exploring solutions that would help identify which elements of a transaction contribute most to its risk score, making oversight easier for staff.
Team capacity and coordination were critical. Implementing AI required a multidisciplinary team – including data experts, programmers, system architects and Treasury professionals – to ensure the model aligned with operational needs and institutional safeguards.

These challenges were addressed through close collaboration, careful testing, and commitment to iterative improvement, and have provided invaluable lessons for building a resilient and adaptive AI-supported Treasury system.
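To make the explainability challenge more concrete, here is a minimal sketch of one common way to show which elements of a transaction contribute most to a risk score. It assumes a simple logistic model with made-up feature names, weights and values. The Treasury's actual model, features and chosen explanation method are not described in this blog, so everything named below is hypothetical.

```python
import math

# Hypothetical weights of a logistic risk model (illustration only).
WEIGHTS = {
    "amount_zscore": 1.2,       # how unusual the amount is for this payer
    "new_supplier": 0.8,        # 1.0 if the supplier has no payment history
    "weekend_submission": 0.3,  # 1.0 if submitted outside working days
}
BIAS = -3.0


def risk_score(features):
    """Return a risk score between 0 and 1 for one transaction."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def contributions(features):
    """Rank features by the size of their contribution to the score."""
    parts = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    return sorted(parts.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical payment order.
payment = {"amount_zscore": 2.5, "new_supplier": 1.0, "weekend_submission": 0.0}

print("risk score:", round(risk_score(payment), 2))
for name, value in contributions(payment):
    print(f"{name}: {value:+.2f}")
```

For a linear model like this one, each contribution is simply weight times value. More complex models would need a dedicated attribution technique, which is beyond this sketch.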

Lessons Learned

The following main lessons can be distilled from Georgia’s experience:

Deploying AI in public financial management (PFM) processes is not a plug-and-play solution. AI model development is an iterative process that involves testing multiple algorithms, followed by repeated tuning and evaluation. Only after this rigorous process can AI models be deployed in a live system.
The model must be tested with real data for a sufficient length of time to analyze its performance, its behavior and possible implementation challenges.
Because they do not need high computing power, AI models can be implemented in parallel without many changes to the existing Treasury IT system. The Georgian Treasury deployed its solution on the existing IT infrastructure. However, careful consideration of the deployment architecture helps ensure that the performance of the existing treasury system is not adversely impacted.
Successful deployment requires a multidisciplinary team. Having an in-house IT team support the AI implementation helped manage the risks of handling confidential and private data in the Treasury system.
Model training and deployment are not a one-time process. Though the retraining process is currently manual, this has been manageable given the infrequent need for updates.
Human intelligence brings ethical judgment, contextual understanding and strategic foresight. Treasury professionals can interpret nuances, consider policy implications and evaluate complex, ambiguous situations that an AI algorithm cannot fully capture. In the Georgian Treasury, AI is designed to augment, not replace, human oversight. It filters high-risk transactions and reduces workload, while final decisions, escalations and accountability remain with Treasury officials.
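The lesson that training is not a one-time process can be illustrated with a small monitoring sketch. It tracks how often the model and Treasury staff agree over a recent window of transactions and signals when a manual retraining might be worth considering. The class name, window size and threshold are assumptions chosen for illustration, not the Treasury's actual monitoring setup.

```python
from collections import deque


class AgreementMonitor:
    """Track recent agreement between model and staff decisions."""

    def __init__(self, window=1000, threshold=0.95):
        # Hypothetical defaults: last 1,000 decisions, 95% agreement floor.
        self.decisions = deque(maxlen=window)
        self.threshold = threshold

    def record(self, model_flagged, staff_flagged):
        """Store whether the model and staff reached the same decision."""
        self.decisions.append(model_flagged == staff_flagged)

    def agreement(self):
        """Return the agreement share in the window, or None if empty."""
        if not self.decisions:
            return None
        return sum(self.decisions) / len(self.decisions)

    def needs_retraining(self):
        """Signal only once the window is full, to avoid noisy early alerts."""
        window_full = len(self.decisions) == self.decisions.maxlen
        return window_full and self.agreement() < self.threshold


# Usage with a tiny window so the behavior is easy to follow.
monitor = AgreementMonitor(window=10, threshold=0.9)
for _ in range(8):
    monitor.record(model_flagged=False, staff_flagged=False)  # agreement
for _ in range(2):
    monitor.record(model_flagged=True, staff_flagged=False)   # disagreement

print(monitor.agreement(), monitor.needs_retraining())
```

The signal only prompts a review. Consistent with the human oversight described above, the decision to retrain would remain with the Treasury team.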

Next Steps

Encouraged by the results of AI and its impact on the efficiency of PFM processes, the Treasury is planning to expand the use of AI to the processing of “Green Channel” payments and to use the latest AI technologies, such as Generative AI^[4], to further improve the accuracy of the existing models by analyzing electronic documents. Beyond transaction processing, the Treasury aims to deploy AI solutions in other core activities, including cash management, accounting and financial reporting, and enhanced consultancy services for the public sector entities it serves. These efforts will further integrate AI into the broader Treasury ecosystem, improving service delivery and operational efficiency across multiple functions.

^[1] AI-Powered Treasury: Georgia’s Innovative Approach to Managing Payment Risks (by Davit Gamkrelidze, Giorgi Mchedlishvili, Graham Prentice, Alok Verma, September 3, 2024). Available at: https://blog-pfm.imf.org/en/pfmblog/2024/09/ai-powered-treasury

^[2] An AI model is a computer program trained to learn from data and make decisions or predictions based on patterns it finds.

^[3] Red channel transactions are reviewed centrally by Treasury staff before payment instructions are sent to banks, while green channel transactions are sent directly to banks without Treasury review.

^[4] Generative AI is a type of artificial intelligence that can create new content (such as text, images, music, or code) by learning patterns from existing data and generating original outputs that resemble human-made work.
