In his lecture titled “From Chaos to System: How We Build Transparent Governance Through Data and Action,” he shared insights about the Center’s projects and how they are connected to artificial intelligence.
Below is the full text of his speech.
First of all, it’s always a pleasure to be here. I have a special emotional connection to this place — unforgettable memories.
Today, I’ll be covering a rather specific topic, mostly about money (everyone’s interested in money), and how artificial intelligence can be used in this space. Because when we talk about investments and so on — that’s already a well-developed and classical domain. But I’ll focus more on public money — state finances.
Today, I want to highlight three key aspects: people, data, and vision.
I don’t believe in artificial intelligence. I believe in the people who use artificial intelligence. I believe in people who know what they’re doing and understand what they want to achieve. That’s the foundation. Let me try to explain why I think this way — through a few examples. We have phenomenal individuals throughout our history — those who shaped our past and those who are now shaping our present. Theorists, practitioners — they are the ones writing our history.
Who knows what this slide represents? It’s about our philosophy. It combines several layers. First and foremost — it’s the professional depth each of us brings. Each person is a specialist in their field. Then — it’s the environment of UCU that brings us together. It’s about collaboration. And I would also add one more layer — values. The core, most important value for us is cross-sectoral cooperation.
Just three years ago, I had nothing to do with data science. My field is public finance. But at some point, I hit a ceiling: I realized I had reached the limit of how far I could go. I was working with large volumes of data, but couldn’t analyze them at a deeper level. So I began to combine two sectors: my 13 years of professional experience — and something completely new for me — data science.
Let me tell you a bit about us. About our team. We are a team at an analytical center — an NGO. But not your typical NGO — we’re highly specialized. You might say, we’re boutique. Our single area of focus is public finance. We don’t branch out into anything else. And we have one more professional focus, which is directly connected to UCU — data science. I’m very glad that my team includes people who are directly or indirectly affiliated with this faculty, from both short- and long-term programs. It’s a real synergy.
I know that graduates from this faculty are in extremely high demand. Yesterday on my way here, a colleague told me, “I’m ready to hire even students who haven’t finished their degrees yet — for the Accounting Chamber — because we urgently need people who can work with data.” I’m currently helping them out a bit myself.
They’re currently facing a major challenge. Why? Because there are massive datasets that no one is working with — or if they are, they’re doing it ineffectively, or without the necessary skills. And that’s a huge opportunity for others — to jump in and start building powerful solutions. That’s why we’re laser-focused: public finance and data science — and nothing else. If you’re interested, you can follow this link and learn more about who we are and what we do.
In short — we work with data and budgets. I may be one of the few people who can see practically everything in the budget — except for classified information and critical infrastructure spending. Why? Because I know exactly where to look — and what to look for. This is my professional domain. And I use data science to make sense of it. So what do we do? We work across four key areas:
We collect and verify data.
We build BI tools that help both the public and government officials understand what’s going on.
We write analytical reports based on our data. We understand the data so well that we often surface insights even government agencies don’t see — because they don’t dive deep enough.
And finally — we share our verified data with the public. We release clean datasets — open and ready to use. These are datasets related to recovery funding. Students from UCU’s certification programs have been using our data for the past 3 or 4 years to build both group and individual projects.
You might be wondering: where’s the artificial intelligence in all this? Well — up to now, there hasn’t been any. But here it comes.
Let’s move to this slide. Here’s where my story about artificial intelligence really begins. We produce a standard Ukrainian-language podcast, and we’ve recently started experimenting with several AI elements in its production. Here’s how it works:
We record a video.
From that video, we generate a transcript.
Our communications team then edits that transcript — cleaning it up, smoothing out awkward phrasing, making it more pleasant to read. So we publish audio, video, and text.
But now, we’ve gone a step further — we’ve started creating a full English-language clone. We translate the text into English using a premium GPT model for quality translation. Then we review and refine that translation manually. Next, we use voice cloning — to generate an English voice version of the podcast — and share it with an English-speaking audience.
This is a small but powerful use case — one that’s relatively easy to do, even without advanced programming skills. All it takes is smart use of the tools that are already available. And that’s exactly what we’re doing.
The second major block is data. It’s something that causes pain — to me and to my entire team. Why pain? Because you dive into the data — and it’s depressing. Let me share a few cases to illustrate what I mean. For me, the core issue we constantly face is this: we always have to verify data from multiple sources, using available benchmarks.
Ukraine is a global leader in open data. And that’s absolutely fantastic. From a communications standpoint, it’s a major achievement. And it’s absolutely right that Ukraine is opening up as much data as possible. If you look at global trends — the world is moving in the same direction. There’s more and more open data available: use it, analyze it, create with it — as long as you understand why.
The Ministry of Digital Transformation runs the Open Data Portal (www.data.gov.ua). Who here knows that resource? There’s a lot of great information on it. Huge credit to the team — they keep opening up more datasets, even during wartime. And yes — you can absolutely use them.
Based on our own experience, let me share a few insights — from our team’s work. The volume of data keeps growing. Some datasets are updated on different schedules. And it’s amazing that a large share of data is refreshed daily — that’s something worth celebrating. We shouldn’t just say “everything’s bad.”
When we meet with international colleagues and say: “Look, we’re doing this, this, and this…” We show them what we’re working on. And they ask: “Where did you get that data?” We say: “It’s open. We pulled it through the API overnight.” And they go: “Wait… is that even allowed? We don’t have anything like that.”
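For context on what “pulled it through the API overnight” can look like in practice: data.gov.ua is, to my knowledge, built on CKAN, whose standard read-only API exposes dataset metadata through endpoints like `package_search`. The sketch below only builds the request URL and parses an invented sample response — it makes no network calls, and the sample payload is illustrative, not real portal data.

```python
from urllib.parse import urlencode

# Assumed CKAN-style endpoint; verify against the portal's own API docs.
BASE = "https://data.gov.ua/api/3/action/package_search"

def search_url(query: str, rows: int = 10) -> str:
    """Build a CKAN package_search request URL for a keyword query."""
    return f"{BASE}?{urlencode({'q': query, 'rows': rows})}"

def dataset_titles(response: dict) -> list[str]:
    """Extract dataset titles from a CKAN-shaped package_search JSON response."""
    if not response.get("success"):
        raise ValueError("API reported failure")
    return [pkg["title"] for pkg in response["result"]["results"]]

# Invented sample shaped like CKAN output, standing in for a live response.
sample = {
    "success": True,
    "result": {"count": 2, "results": [
        {"title": "Treasury payments"},
        {"title": "Budget execution reports"},
    ]},
}
print(search_url("treasury"))
print(dataset_titles(sample))  # ['Treasury payments', 'Budget execution reports']
```

A nightly job would simply fetch `search_url(...)` with any HTTP client, page through the results, and store them; the parsing step stays the same.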
Another key resource for us — and perhaps the most important one — is a platform that’s already 10 years old. It was implemented by the Ministry of Finance in 2015, but originally driven by civil society: the Unified Web Portal for Public Spending (www.spending.gov.ua). This site publishes information about daily government payments — and it’s critical for our work. Because those payment records show exactly what the government or local authorities actually paid for. They are hard facts. Over the past decade, this system has collected between 255 and 260 million payment entries. This year alone, there have already been payments worth 30 billion UAH. You can track daily transactions, get updates, monitor anything you want. You just have to know how to use it.
But when we talk about data, we also have to talk about quality — and that’s where problems arise.
Let me share a real example. The British launched a fantastic initiative — “Pickups for Peace” — helping our military. Of course, we didn’t want to just let them come and go unnoticed. We wanted to welcome and thank them properly. So the Department of International Cooperation at the Lviv Regional Administration organized a small reception — 20,000 UAH for catering, setup, and service. And here’s what’s amazing — you can actually see that transaction. I love showing this for the first time to students who take my public finance course: They go: “Whoa! I can see what my local hospital bought? Or what the government paid for?” And yes — almost everything is there.
Another important case is about the efficiency of public spending. For example, you can easily check how much funding local education departments have not used, even though that money was allocated by the state.
Let’s take the New Ukrainian School (NUS) program: funds are provided for equipment purchases, teacher training, and so on. But in Lviv Oblast, 2.7 million UAH wasn’t used. The regional Department of Education ended up returning that money to the state budget last year.
Did you know that? It’s all out in the open — right on the surface. Just grab it and use it.
And the third resource we work with regularly in our projects is Prozorro (www.prozorro.gov.ua). That’s the platform that contains all public procurement data — essentially, all the planned purchases. And it’s a real goldmine. Why? Because while Treasury payment records have relatively few fields, Prozorro contains over 25 million procurement records — and a much broader range of data. It’s truly an incredible dataset. Take it, analyze it, use it. These are the core datasets we work with. And now I’ll show you what we actually do with them. But before that, let me share one of the biggest challenges we face: data quality. The idea here is simple: in the data generated by the government, there are many gaps and blank spaces — and someone needs to fill them in.
By scanning the QR code, you can access our 2023 research — an analysis of the availability and completeness of payment information published on the Spending portal. There are several ministries and institutions that do not publish data about their payments. And I completely understand that data related to the security and defense sector, or critical infrastructure, should not be disclosed. But — forgive me — salaries at the Ministry of Economy and the Ministry of Justice were also classified under that category. Why? Because these ministries each have a single program related to critical infrastructure — and so they closed off everything. Do people know this is a problem? Yes, they do. I personally spoke with four deputy ministers about it. And honestly? No one cared. But suddenly, they start to care when you write about it publicly. That’s when they show up and say: “Oh, come on, why are you making noise?” And I reply: “Then fix the problem! I’m not asking you to publish sensitive information — just the general data.” The answer? “It’s complicated.” And I say: “It’s not complicated. You just don’t want to.”
Here’s another interesting case. This is what a payment order published by the Treasury looks like. You can see who paid, to whom, how much, and for what. And right here, there’s supposed to be a Prozorro identifier — a procurement code — so you can understand what exactly was paid for. So I took a portion of the dataset related to construction. And in construction, there must be associated procurements. I used 2024 data: 220 payment records related to capital construction. Out of those, only a portion had the Prozorro identifier. It’s simple: We have a government, and it owns two platforms: Prozorro — for procurements, Spending — for payments. And I want to understand how they’re connected. Turns out, they’re linked in only about one-third of cases — at least when it comes to construction. That means almost 150 billion UAH worth of payments cannot be directly matched to their procurement records. And if you’re building a tool or an interactive dashboard,
this kind of linkage is absolutely essential. Because without key identifiers, it’s incredibly difficult to work with the data.
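A minimal sketch of that linkage check: every payment record either carries a Prozorro identifier that resolves to a real tender, carries one that resolves to nothing, or carries no identifier at all. The field names and identifiers below are invented for illustration — they are not the portals’ actual schemas.

```python
# Payments as exported from the spending portal (illustrative fields only).
payments = [
    {"payer": "Dept A", "amount": 1_200_000, "prozorro_id": "UA-2024-01-15-000001"},
    {"payer": "Dept B", "amount": 800_000,  "prozorro_id": None},
    {"payer": "Dept C", "amount": 450_000,  "prozorro_id": "UA-2024-02-03-000917"},
]
# Tender identifiers actually present in the procurement dataset.
tenders = {"UA-2024-01-15-000001"}

def linkage_report(payments, tenders):
    """Classify payments: linked, dangling (ID not found in Prozorro), or missing ID."""
    linked = [p for p in payments if p["prozorro_id"] in tenders]
    dangling = [p for p in payments
                if p["prozorro_id"] and p["prozorro_id"] not in tenders]
    missing = [p for p in payments if p["prozorro_id"] is None]
    return len(linked), len(dangling), len(missing)

print(linkage_report(payments, tenders))  # (1, 1, 1)
```

The “dangling” bucket is exactly the wrong-identifier case described later in the talk; the “missing” bucket is the two-thirds of construction payments with no key at all.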
Another story — one I stumbled upon accidentally. It’s not large in scale, but it’s systemic. From time to time, published payment records disappear. They simply vanish. Here’s how we work: We have our own infrastructure and data storage systems. Every night, we download about five different datasets into our repository. One day, we were doing some routine reconciliation and noticed that a particular payment wasn’t in our stored data — but it had appeared on the Spending website. That made us pause. So we re-downloaded everything and started checking by identifiers. What did we find? Some payment records are added, others are removed. What might get removed? It could be payments related to critical infrastructure — maybe they were accidentally published, someone noticed, and they were taken down. That’s understandable — and totally fine. But in this case, we compared two datasets from 2024, and I found an example where a payment record that was once public is no longer visible online. However, since we download the data daily, we still have it in our archive. This means we know for sure — some information can disappear.
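The nightly reconciliation described above reduces to a set difference over payment identifiers between two stored snapshots. The IDs below are made up; the point is only the comparison logic.

```python
def snapshot_diff(yesterday: set[str], today: set[str]) -> tuple[set[str], set[str]]:
    """Return (added, removed) payment identifiers between two daily snapshots."""
    return today - yesterday, yesterday - today

# Two hypothetical daily downloads of payment record IDs.
day1 = {"PAY-001", "PAY-002", "PAY-003"}
day2 = {"PAY-002", "PAY-003", "PAY-004"}

added, removed = snapshot_diff(day1, day2)
print(sorted(added))    # ['PAY-004']
print(sorted(removed))  # ['PAY-001'] — a record that vanished from the portal
```

Because the archive keeps every daily snapshot, a non-empty `removed` set is hard evidence that a once-public record is no longer visible online.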
The third issue: incorrect identifiers. Here’s one example — Kharkiv National University paid for a non-existent procurement worth a total of 1,100,000 UAH. The payment referred to a tender that doesn’t exist. Most likely, it was a technical typo — they meant a 2023 tender, but the year was entered incorrectly. But when you’re using automation based on keys — that kind of error breaks everything. This isn’t a systemic problem, but when you’re monitoring specific transactions — it can have an impact.
Another case, also involving data accuracy: The same university paid for a procurement made by a completely different university. Here you can see that Kharkiv National University processed a payment for a tender that was announced by the National Academy of the State Border Guard Service of Ukraine named after Bohdan Khmelnytskyi — a completely different organization and identifier. Was it done on purpose? It’s hard to say. In the previous example, most likely it was a mistake — human error. There’s always human error. Here’s a classic case: The English letter “I” and the Ukrainian letter “І” — they look the same, but they’re not equal in datasets. When you’re working with large volumes of data, especially using keyword-based searches, that kind of detail matters. That’s why, before you start working, you need to understand these nuances. If you don’t, parts of your dataset will simply disappear, your input will be incomplete or distorted, and so will your output. That’s the reality.
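The Latin/Cyrillic “I” problem is easy to demonstrate: the Latin capital I (U+0049) and the Ukrainian capital І (U+0406) render identically but are different characters, so key- or keyword-based joins silently miss records. Below is a partial, illustrative normalization map — a real cleanup pass would need a much fuller table and per-field rules.

```python
import unicodedata

print("I" == "І")             # False: U+0049 (Latin) vs U+0406 (Cyrillic)
print(unicodedata.name("І"))  # CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I

# Map a few Latin lookalikes to their Cyrillic counterparts (illustrative subset).
LATIN_TO_CYRILLIC = str.maketrans({
    "I": "\u0406", "i": "\u0456",
    "A": "\u0410", "O": "\u041E", "E": "\u0415", "C": "\u0421",
})

def normalize_uk(text: str) -> str:
    """Replace Latin homoglyphs with Cyrillic so string matching works."""
    return text.translate(LATIN_TO_CYRILLIC)

# "К" + Latin "i" + "но" becomes all-Cyrillic after normalization.
print(normalize_uk("К\u0069но") == "К\u0456но")  # True
```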
Do you recognize this guy? Unfortunately — most people do. On his first day, he established something called DOGE — the Department of Government Efficiency. This is a literal transcript of what he said on camera while signing the executive order: “Today, we’re signing a very important order. It’s called DOGE. I’ll ask Elon to tell you a bit about it, and about some shocking findings. Billions and billions of dollars wasted — fraud, abuse, corruption. I believe that’s why we won the election.” Elon Musk, as a top specialist — including in IT — was brought in to “fix the system.” He had a successful background in business, so expectations were high. What did Musk promise? “I’ll save you 2 trillion dollars real quick.” Some time passed. He said: “Okay, maybe not two — but I’ll get you a trillion.” In the end? 150 million dollars — less than one-tenth of the promise. Why am I showing this slide? Because the philosophy is simple — and it’s something we’ve seen in Ukraine too. When politicians apply pressure and bring in “big-name experts,” especially from IT, who don’t understand the local context, the results can be very questionable.
I’m not sure if you know this young guy — maybe you’ve heard of him. There was quite a scandal, including in Ukrainian media. He’s a young, brilliant IT specialist who was one of the main drivers behind the Department of Government Efficiency (DOGE). But the problem was — he was completely out of context. The promises made to Trump were not fulfilled. Musk left the project. And now, this case is being studied by public finance professionals — as an example of what happens when there’s no systemic approach.
So, what were the issues? In several cases, previously canceled contracts were still included in reports. In one case, a contract worth 8 million was mistakenly shown as 8 billion. That’s why I started this entire talk with people — with professionals who combine multiple disciplines and environments. I didn’t use that slide from UCU’s DataX presentation on T-shaped people by accident. Because here — at UCU — a values-based, professionally strong environment is being formed. I know that whenever I have a challenge in our projects, I can always turn to someone here — and they’ll help, advise, and guide me. And the best part? You can hire people from here. For example, right now I have an opportunity to offer a paid internship for 4 months — for a student to work with data. If anyone’s interested in the future — feel free to reach out.
Here’s another phenomenal case — from the United Kingdom. Possibly one of the biggest IT failures in public sector history. The UK government wanted to digitize the entire healthcare system. Over the years, the total cost of the project surpassed 6 billion pounds. And in the end — the program was shut down. Poor data quality, system rigidity, bureaucracy, and more — they all collided. We need to understand who we’re working with.
If you want to disrupt a system, you have to understand it first.
And one more example — recent and very telling. This one is about a different aspect of working with data: when government bodies intentionally manipulate information to look good. This happened in the context of the Municipal Transparency Ranking, led by Transparency International. The data was rigged. Experts noticed. And the reaction came swiftly — they were called out. “Hands off!” — that was the message.
The final block — vision. Vision is what truly drives us forward. Everyone has their own vision — professional, personal, or organizational. Let me share my work-related vision. You know the saying — if you want to eat an elephant, you have to do it one bite at a time. That’s exactly what we’re doing — tackling a huge challenge in small, manageable pieces. Let me tell you about three of our key projects:
So, what do we actually do? When we established our NGO, we were faced with one central question: What will we focus on? What is our area of expertise? We understood clearly: We are professionals in public finance. And we saw a major gap — there was no single place where you could see: how much the war is costing us, who is receiving the money, and what it is being spent on. So we took the initiative to gather and verify war-related expenditure data in one place. The data spans several categories: For the security and defense sector, we only use general-level data. But when it comes to physical recovery — the information is highly detailed: right down to the specific companies and sites that received payments.
We began by helping the Kyiv Regional Military Administration. Why them? To be completely honest — it’s because some of our colleagues work there. And they were truly interested in what we were doing. They told us: “This is a top priority for us — especially after the scandals reported by Nashi Groshi, Yuriy Nikolov, and others. We’re ready to provide any data you need. Just tell us what you need — we’ll help you collect it all.”
We created this slide after nine months of collaboration. They were preparing to attend URC, the Ukraine Recovery Conference, an annual international event on reconstruction. They showed this slide to international partners with the following message: “Everything being rebuilt in Kyiv Oblast is available online. You can see who received the funds, and which sites and facilities are being restored. Transparency is everything for us.”
It was a small-scale project. Why? Because we showcased only one region and only one source of expenditures. At the same time, it was a complex project — because we thought we knew a lot. But, for example, we realized we didn’t yet have clear algorithms: what exactly we were doing, how we were doing it, what problems we might encounter, etc. When we completed the Kyiv Oblast project, we faced several challenges. The biggest one?
The data held by the Regional Military Administration was scattered across different departments and institutions. It was fragmented — and often just missing.
Our goal was to verify everything. So, what did we do? We created a single unified approach for all of it. And that, today, is a major advantage we have over many other players — including government platforms. We also started using various AI elements to make our work easier. To be honest — the Kyiv Oblast case was done manually. We needed to fully understand the data first — so we could optimize the process later. We do not include data from the security and defense sectors, or from critical infrastructure.
We also haven’t yet worked on the consequences of the war. But we have verified recovery-related expenditures from the state budget, and we’ve begun analyzing spending by international partners. We’re currently collecting the data. As of 2025, there are already 6.5 million payment records. We download new data daily. For example, on May 6, there were nearly 93,000 payments. And sometimes — we need just one. Using our algorithms, we extract only what we need from massive datasets — and build our own custom databases. In 2025, there were already 1.3 million procurements. On May 2 alone — almost 20,000 procurements. Out of those, 77 matched our monitoring criteria — meaning we knew they were tied to institutions we monitor. But none of them fell into the procurement categories we actually needed.
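A rough sketch of that daily extraction step, assuming payments carry the payer’s EDRPOU registry code (Ukraine’s state identifier for legal entities). The codes, field names, and watch list below are invented — this is not the team’s actual pipeline, just the shape of the filter.

```python
# Hypothetical watch list of monitored institutions (made-up EDRPOU codes).
MONITORED_EDRPOU = {"00000001", "00000002"}

# A hypothetical daily batch of payment records.
daily_payments = [
    {"payer_edrpou": "00000001", "amount": 50_000, "purpose": "school kitchen equipment"},
    {"payer_edrpou": "99999999", "amount": 10_000, "purpose": "office supplies"},
    {"payer_edrpou": "00000002", "amount": 75_000, "purpose": "canteen renovation"},
]

def select_monitored(payments, watch_list):
    """Keep only payments whose payer is on the watch list."""
    return [p for p in payments if p["payer_edrpou"] in watch_list]

matched = select_monitored(daily_payments, MONITORED_EDRPOU)
print(len(matched))  # 2 of 3 records survive the filter
```

Run nightly over tens of thousands of records, the same filter is what turns 93,000 daily payments into the handful that matter for a given project.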
We often reach out to the government and say: “Please provide us with the data.” Or we contact local authorities and say: “We’re seeing discrepancies — can you clarify?” We also reference OpenBudget, a government platform that publishes public reports on budget execution. It’s one of our key benchmarks — because it contains Treasury data. We compare one Treasury dataset to another. And if something’s missing, we go back and say: “Friends, something’s not right with this project — either provide the payment records or explain the discrepancies.” When we present data, we’re ready to account for every single cent.
We even have something of a “black box” approach for institutions that refuse to cooperate. When that happens, we launch a wave of data requests and formal appeals. And here’s a tip: Electronic requests through the Government Portal often work best — because once it’s formalized, agencies usually respond. Once we collect the data, we process it, verify it, structure it, publish it, and write analytical reports. Then we make everything available — for anyone to download and use.
I’ve mentioned this before — the Kyiv Oblast website now features our analytical dashboard. Let me show you what it looks like. What can you see on it? Pretty much everything related to public spending, organized into several blocks: expenditures from the Recovery Fund (state budget), other state budget expenditures, local budget expenditures, and even spending by international partners. It’s a very convenient system: you can filter the data either by map or by category. You can even search by company — and see where that company is building. You can drill all the way down to a specific reconstructed object and see how much funding it received, along with descriptive statistics for each project. Everything in Kyiv Oblast has been gathered together in one place. If you’re interested — you’re welcome to use it.
And one more thing — something fundamental for us: We share our data. We’re probably the only NGO in Ukraine that does this in such a systematic way. And I want others to follow suit. Why? Because it forces you to be structured. And even more importantly — it makes you treat data with the highest level of responsibility. Can we make mistakes? Absolutely. We’re human. But if we make a mistake and someone points it out — we say thank you, we acknowledge it, apologize, understand why it happened, and make sure not to repeat it. That’s our responsibility — to others. We’ve done a huge amount of work. Please — use it. And I especially invite academic institutions — because this is a great data source for meaningful research and analysis.
Our third project is about school nutrition. This is a project championed by Ukraine’s First Lady, focused on modernizing school meals in educational institutions. Here’s how it worked: three separate teams, not initially connected to each other, were brought together by the European Union Anti-Corruption Initiative to implement the project. Each team had its own specialization:
A professional team from Lviv that conducts detailed procurement analysis. They reviewed nearly 200 procurements and assessed them using 34 criteria — determining whether each one was high-risk or not, and whether the procurement had happened or failed.
A team focused on technical oversight, reviewing cost estimates and checking for inflated prices or unnecessary equipment.
As a result of this early-stage intervention, nearly 140 million UAH (from a total of 1.5 billion UAH) was either saved or redirected to other schools — about 10% of the budget. Why? Because they found inflated prices, equipment that wasn’t needed, and items that weren’t eligible for state funding.
Why did I want to showcase this project? Because here, we implemented daily monitoring — tracking 180 projects every day. We update the data by 9:00 AM each morning. Here’s how it works: At night, the system downloads fresh data. Our algorithms run and evaluate the information. In the morning, a human reviews it. By 9:00 AM, the updated results go live. People who are monitoring the system — including the Ministry of Health — can see what’s happening and make real-time decisions. We also send out a weekly update to the full team, summarizing key events and changes.
And now, a few insights about data quality — which I really want to emphasize. This chart shows the number of tenders related to our project. We began monitoring in August, but changed our technology slightly in mid-September — and started full monitoring then. We kept monitoring. And eventually, we realized that a third of the projects weren’t showing up in our system. That’s a serious problem — and we couldn’t figure out why — even though we’d been monitoring for three months. So we started investigating. It turned out that the company data — the information on who was conducting the procurements — came from state-run registries. And the problem was — those registries had incorrect data on the companies handling these procurements. Once we figured that out, we manually reviewed and corrected everything — and finally arrived at a more or less accurate monitoring dataset.
Why are government projects not always successful? Because the data entry process in state systems involves a wide range of people — people with very different levels of computer literacy. And that directly affects data quality. If you simply take government data at face value, without understanding these nuances, you’ll end up with an incomplete or flawed dataset.
The second issue: The government can change the rules of the game. What do I mean? It can say: “We’re taking funding away from this school — and giving it to that one.” And you, as a data analyst, are left trying to rebuild your dataset. By the end of the year, we had finally reached a point where we could run full and continuous monitoring of everything accessible to us. But when too many people are involved in government data entry,
the result is often an epic fail.
I don’t believe in government projects that don’t have teams dedicated to data verification. Right now, we’re working on several major projects. I didn’t want to get too technical — but let me briefly mention two slides’ worth. We ran an evaluation of different models. We tried and tested both complex models and simpler ones. And guess what? The simpler model turned out to be the most accurate in forecasting and evaluation.
Another aspect — we have verified data. We took our verified dataset and compared it to the state registry data for each project. Here’s what the results looked like: Everything was red. And the more red you see, the greater the discrepancy. So how can anyone make solid management decisions based on this kind of data? You can’t. What’s the root of the problem? Someone forgot to upload the data. Someone else updated it — or didn’t. One agency made a change — another didn’t follow up. This is a major structural issue in the public sector — and almost no one talks about it.
Now, let’s talk about data completeness. Since we track procurements ourselves, our internal database for the school nutrition reform contains almost 1,100 procurements — while the official government platform lists only 600.
Here’s our operational algorithm:
We have our own model that identifies whether a procurement is related to a project we are tracking or not. We also use a small classification model that categorizes the type of procurement. Nothing overly complex. We’ve developed a series of robust algorithms for tracking payments — especially in cases where data is missing or incomplete.
Verification: We track data completeness at both the overall project level and the level of each individual object. If needed, we collect additional data to fill the gaps. And then — we visualize it all. We also produce a complete English-language version, as I mentioned earlier. There are two identical projects — one for domestic use, and one to show to international partners. We actively check for duplicates, labeling errors, and data accuracy, and we publish weekly summaries. It’s not hard — it’s just consistent work.
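As an illustration only, here is a deliberately simplified stand-in for the small classification step mentioned above: a rule-based keyword matcher over procurement titles. The categories and keywords are invented, and the team’s real model is presumably trained rather than hand-written — this sketch just shows where such a step sits in the pipeline.

```python
# Invented categories and keywords; order matters, first match wins.
CATEGORIES = {
    "kitchen_equipment": ("обладнання", "плита", "equipment", "oven"),
    "construction": ("ремонт", "будівництво", "renovation", "construction"),
}

def classify(title: str) -> str:
    """Assign a procurement title to the first category whose keyword it contains."""
    t = title.lower()
    for category, keywords in CATEGORIES.items():
        if any(k in t for k in keywords):
            return category
    return "other"

print(classify("Капітальний ремонт харчоблоку"))  # construction
print(classify("Kitchen oven procurement"))       # kitchen_equipment
print(classify("Consulting services"))            # other
```

Even a crude classifier like this makes the verification step above testable: every tracked procurement gets a label, and anything falling into "other" is flagged for human review.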
My vision: I want to see all capital construction expenditures from the state budget on my dashboard. I want to understand them. I want to monitor them daily. This is part of what’s known as the PIM (public investment management) reform —
a reform strongly supported by our international partners. In fact, a large share of that funding comes from international donors. So even though I’m looking at Ukrainian budget spending, I’m also tracking money provided by our international allies. And I want to see all of that — in real time. If we’ve already built one such project — do you think it would be that much harder to scale it to five, ten, or fifty? Sure, we’ll need to adjust for different nuances — but overall, it’s very achievable.
Final slide: This is a link to our podcast. Feel free to subscribe, listen, and send us your questions.
Thank you!