
Episode 29

Machine Learning Assurance with Andrew Clark - CTO of Monitaur

Andrew talks to us about machine learning (ML) and what auditors should consider regarding ML assurance. 

You can reach out to Andrew via the Monitaur website.



Transcript

 

Narrator: 

 

Welcome to The Assurance Show. This podcast is for internal auditors and performance auditors. We discuss risk- and data-focused ideas that are relevant to assurance professionals. Your hosts are Conor McGarrity and Yusuf Moolla.

Yusuf: 

Today, we've got Andrew Clark on the show. Andrew is the CTO of Monitaur. I'm not going to explain exactly what they do - I'll leave that to Andrew. But why this is important to auditors is the work that Andrew does at Monitaur - and his background is largely in audit. The work he does is focused on making sure that machine learning algorithms work appropriately and that there is a reasonable level of integrity in those algorithms. Making sense of the black box, so to speak. Andrew, do you want to give an introduction - tell us a bit about your background and what you now do with Monitaur?

Andrew: 

Sure, thank you. My background is pretty varied. Undergrad in accounting, and I worked as a financial auditor, interning with EY and a publicly traded manufacturing company - normal substantive-type testing. After I graduated, I transitioned to IT audit and went full-time with that company doing your regular SOX audits, compliance work, security audits - normal IT auditing-type things. This was back when continuous assurance and data analytics were starting to become something in audit; everybody was talking about them. The company I was working for had a CaseWare IDEA license and no one had ever really used it before, so they turned me loose on some of the analytics. I quickly found that CaseWare didn't work too well in our implementation. Through my undergrad I had taken some additional stats and econ classes and taught myself how to program in Python. So I took what we had there, moved it over to Python, then started expanding our analytics tests and integrating them across a bunch of different ERP systems to really start providing that continuous assurance. And I loved it. Auditing was fine, but this is where I really found a passion: the data analytics and the programming aspect. So I went back and got a master's degree in data science, focused during that program on using unsupervised machine learning for detecting fraudulent journal entry transactions, and built those types of things into the analytics program at the company I was working for.

When I looked at current regulations, and even standard practices, for auditing models: we have very established practices for financial auditing and IT auditing - we have standards, we have a process for how we do these things. Model auditing at the time was primarily based on summary statistics and summary plots, without the same type of detail, re-performance, or end-to-end understandability that we have with IT audits and financial transactions. Nobody would sign off on a balance sheet without doing substantive testing, adding it back together, and checking why there are these different transactions inside your checking account, for instance. That sort of verification is taken for granted in a financial or IT audit. In a model audit, for some reason, that really wasn't the case. It's very much looking at aggregate values and saying, okay, that looks good - because a lot of the model builders are very technical people who are very much on the stats side of things, and for some reason they sometimes like to smokescreen the regulators and the auditors looking at their models. So there's a different process that's not end-to-end. I wrote some articles on how we should approach auditing machine learning, getting rid of the black box and looking at the process. Because machine learning is not really a black box - it's plain multiplication in a lot of cases. And it's an understandable thing, especially from a transparency and systems perspective: what goes in, what goes out, even if you don't know exactly what's happening inside the model. When you look at it as an end-to-end process, you can actually understand what's happening and, in some cases, re-perform it.
So I started doing a lot of research in that area. I wrote some papers, including a paper for ISACA on how to audit machine learning based on the CRISP-DM framework, and I spoke at a couple of their conferences about how to audit machine learning and how to set up a machine learning assurance program - these ideas of re-performance, verification, and understanding the whole system life cycle of the model. Capital One found me at one of those conferences and said, that's exactly the problem we're having - come help us set that up. So I went to Capital One and started the process for them of building a machine learning assurance program and auditing models. They were really seeing this same gap: you can have a model audit team that looks at the mathy parts - why you chose linear regression, why your coefficients are these values. But for the end user, or the regulator, or the person who got denied for their insurance policy, knowing that the coefficients of your linear model are within the standard distribution doesn't really do anything for you. That's not the same as "I verified that your balance sheet balances" - there's a different level of assurance there. So we set that up for them, and also built a bunch of machine learning models and different analytics programs to help speed up their internal audit program. Then I wanted to get even deeper into building models, so I went and worked at a bespoke AI firm building machine learning models for decentralized systems. And then my current co-founder reached out to me, basically in a cold email; he had read a lot of my work on how to audit machine learning. He's an executive out of Boston who has had a long career in different things and took a sabbatical from his previous job. He said, hey, I want to find some way to be inside machine learning and help this become a more accepted thing. We had both been separately working on this idea of machine learning assurance - providing verification, transparency, and assurance around machine learning models. That was the genesis of Monitaur: providing machine learning assurance in regulated industries, so machine learning can be used in areas where it would be high impact but has to meet the qualifications that would pass regular audits - allowing people to feel as comfortable with machine learning models as they do with balance sheets.

Yusuf: 

What is it that drew you into evaluating machine learning models as a full-time career? What's the driving force?

Andrew: 

I love being interdisciplinary. This is a space where I can still be in the audit world and also really interface with technology. But there's a really big gap here. A lot of machine learning models in healthcare and insurance can really do things to improve people's lives - better care, better insurance, lower-priced insurance. With accelerated underwriting products these days, you can get life insurance within five minutes online. These types of things are really impactful and help individuals. And a lot of the time you're not allowed to use these tools, because regulators don't understand the models, and companies may not always be using best practices. There's really a gap in providing standards and best practices for how to address these models, so you can unleash all this innovation and help people. That's really what it comes down to: finding ways to drive innovation and help people by unleashing machine learning and AI. You know, Google can do a ton of stuff, but it's never going to happen in the big healthcare companies and the insurance companies unless there's some sort of assurance around it. So the angle of helping people is the driving force. And then, of course, there are so many fun tech problems and business problems around that, which to me is really exciting and intersects with a lot of my background and interests. It's like this perfect triangle.

Yusuf: 

Assuring machine learning models, right? So you're not actively involved in building them; it's more around assuring. Is that right?

Andrew: 

Correct. As a company, we focus on assuring them. Personally, I build lots of models and I talk to lots of people who build models. First off, it's fun and I like doing it in my spare time. But secondly, we need to know what's going on in models: what are companies using, what are the latest innovations? We want Monitaur to be platform agnostic and model agnostic - with some types of providers you can only work with, say, a model from IBM - but for Monitaur to be a software solution that provides assurance around machine learning models, we have to be very in tune with what's going on. What's the latest thing Google did? What's the latest thing Amazon did in their open source libraries? We need to know how to connect to these things. But as a company, you're right, we don't build models. It's all about knowing what the models are doing so we can assure them.

Yusuf: 

As you said, you need to understand what's going on, but there's a lot to be said for being very specific in focusing on assurance, because that means you become very deep in that area. As opposed to building a whole bunch of models, then having other clients where you're going to do assurance and having to work out: am I independent enough to assure a model I was involved in building at some point? So it's quite important to have that separation and focus as well.

Andrew: 

Definitely. That was a huge point for us. Unlike, you know, an EY or a KPMG or anybody, there's never a conflict of interest. We are trying to become the gold standard: you have Monitaur, that's a good housekeeping seal of approval for your model. So even though we stay up to date on what modeling is doing, we are completely separated. We never help any company we're talking to with modeling, in any way - we've never done any modeling work for any company. We definitely want to keep that separation of church and state complete. There's not going to be any consulting and assuring in the same company with Monitaur.

Conor: 

You mentioned previously the huge possibilities with machine learning in two sectors: insurance and healthcare. Are they sectors of focus for Monitaur?

Andrew: 

Yes, those are two of our main focuses. Insurance is our primary focus and the medical industry is our second, specifically medical devices. We haven't spent any time working in something like ad tech, because there they can really do whatever they want - unless you're violating equal opportunity laws or doing some type of outright discrimination, there aren't really regulations around it. Insurance and healthcare are two of the large industries that are trying to do things that impact people's lives, and they have an active regulatory presence. In insurance, for example, in the United States and most countries, you cannot just say, here's a new life insurance or health insurance product that's based on a model, without having that model or that process approved. Same if you wanted to start applying some sort of algorithm to, say, a pacemaker - that has to be approved by regulators. So we also interface with regulators to help them develop their thinking and help with some of their standards, and then we know how to help the companies understand what regulators are looking for - because regulator speak and tech company speak are two separate languages. We're the go-between that tries to speak both languages and interpret for each side. In the U.S., those are the two main markets we've seen running up against regulatory burdens. There's some interest right now in the energy sector, but so far they're not advanced enough to be running into that many model issues.

Conor: 

And do you see a growing appetite in the U.S. to get on top of the models being used by these regulated companies?

Andrew: 

Yes, there is definitely a lot of discussion amongst regulators, in both medical devices and the insurance industry. They are both very much trying to get their hands around it. They see the potential of machine learning and they want it, but consumers and end users need to feel comfortable that controls are in place and there's no bias or discrimination. So they are definitely pro innovation; they just need to find a way to get their hands around it and feel comfortable. That's why we focus on those two spaces, as the emerging tip of the spear, if you will. The companies we've been actively talking to are primarily in the U.S., but we've also talked to a lot of individuals in Australia and Great Britain - those are the three main countries we've been talking to.

Yusuf: 

In terms of who you would work with: you're obviously doing a lot of audit-related work. Would you also work with the auditors within the entities that you're helping?

Andrew: 

Yes. There are two main parts of Monitaur. There's a kind of consulting service, where we have our own risk and control matrix: here's the whole life cycle of how you would deploy a machine learning model and system responsibly, and the steps and controls you need to have in place. And then we have our software component, which primarily provides the controls around change management, monitoring for bias, recording your transactions for transparency, and things like that. Eventually the software will be able to provide certifications. We normally talk to the risk individuals - second line, for instance - about getting this deployed, and then we talk to developers to actually deploy the software. So second line and third line are usually where we interface. Internal audit may identify the problem, but they can't implement anything; they can just say, hey, there's a problem here. First or second line goes out and says, hey, we need to find a way to fix this - and that's when Monitaur usually gets brought in, by first or second line.

Yusuf: 

Most of the audience we're talking to here will be internal auditors, looking to understand how to determine whether a machine learning model is appropriate and what they need to do in terms of auditing it. In your view, where are the problems? Are they primarily in the models, or in the data being fed into those models?

Andrew: 

The short answer is: it depends. We have a risk and control matrix that addresses the whole life cycle, and we've talked to many auditors about what they need to be looking for. In my experience, in most companies the problem is not going to be the model per se. There are, of course, issues that can happen on the model side. But if you make me choose one, I would say a lot of it has to do with the data and the process of how the whole system works. To the point I was making earlier: traditionally, model auditing treats models as separate from the system. There's the IT stuff - they deploy something - and then there's the model part, and people look at them completely separately. In most cases you need to do an integrated audit to understand the full system, because where the data came from and how it's processed before it touches the model are upstream factors that affect the model directly. Your model can be perfect, and then the data changes, or something like that happens. So you need to look at it as a full system. There isn't one part or the other to be concerned about; you need to look at the whole life cycle and the whole ML system - I call it the ML system versus the ML model. But if there's one part where I've seen the most issues, it would probably be around the data. Bad data in, bad data out, basically. Across the companies we've talked to and the ones I've worked with, the most concerning spot has been data problems: weak data governance, maybe bias in your data, or just poor data governance and lineage practices overall.

Narrator: 

The assurance show is produced by Risk Insights. We provide data focused advice, training and coaching to internal audit teams and performance audit teams. You can find out more about our work at datainaudit.com. Now back to the conversation.

Yusuf: 

If you had to go back to being an auditor - and we deal with a range of performance audits as well; performance audit is work in the public sector, similar to internal audit, except that they do large audits that usually focus on a particular subject matter and go across several entities within the public sector. In the U.S., a number of states have large performance audit offices - Washington definitely does, California does, etc. - and at a federal level you've got the U.S. GAO; they do what they call performance audits. What's the main thing you would want to look at if you were involved in a large internal audit or performance audit, as it relates to machine learning assurance?

Andrew: 

At a high level, you want to see that there's a full life cycle process: when you make a model, everybody's on the same page, there's good data understanding, there's good data governance, there are checks for bias, and there's good change management in place - all the regular IT audit things. Make sure all of those are applied and that someone is looking at the system level rather than just an individual model. And you want to make sure you have the ability to do re-performance - like balance sheet checking, right? Being able to see: how did you get to this answer? A lot of times with machine learning models you're just looking at aggregates - here's the summary: 55% of transactions said you have diabetes, 45% said no. No, we need to actually see what the individual transactions are, to be able to understand how the system works. You want to be able to try to re-perform any of it, and you also want to make sure there's some sort of monitoring in place for drift happening in your model, and change management, those sorts of things. So definitely focus on the life cycle, and see if you can do a couple of key things: understand exactly the inputs and the outputs and, ideally, re-perform it.
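
To make the re-performance idea concrete, here is a minimal Python sketch of the kind of check Andrew describes: take the individual transactions a production model logged, run them back through an independently obtained copy of the model, and confirm the recorded answers match. The file names, column name, and tolerance are hypothetical placeholders, not Monitaur's actual method.

```python
import joblib
import numpy as np
import pandas as pd

# Independently obtained copy of the production model artifact (hypothetical file).
model = joblib.load("model.joblib")

# Per-transaction log: input features plus the prediction recorded in production.
log = pd.read_csv("logged_transactions.csv")

features = log.drop(columns=["recorded_prediction"])
reperformed = model.predict(features)

# Flag any transaction whose logged answer cannot be reproduced.
mismatches = log[~np.isclose(reperformed, log["recorded_prediction"], atol=1e-6)]
print(f"{len(mismatches)} of {len(log)} logged predictions could not be re-performed")
```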

Conor: 

When you say drift there, can you just explain what you mean by that?

Andrew: 

Yes. In machine learning models there are two types of drift that can occur. The first is model drift, meaning the model's distribution of outcomes changes. Take the diabetes example I just gave: say that, based on the training data, we expect between 40 and 60% of transactions to come through saying "you have diabetes", and the inverse saying they don't. What happens if, over the last week, 90% of every transaction that's come in has been "you have diabetes"? Something happened somewhere upstream of your model, because the expected distribution of outcomes is different. So what happened? Did we get a whole influx of different people coming in? Or think of when COVID happened: the whole game shifted, and for a lot of models the data coming in was completely different because of a macro event. That's a distributional change.

Feature drift is the other type. There, we might see that one particular input has changed - in this case, let's say age. Traditionally you don't want to include age in a model, but for this example say it's a medical device, and age is a very key element of risk: for diabetes, age is going to be a major contributing factor, with no discriminatory impact here. Say this model was traditionally used on 20 to 40 year olds to determine whether they had diabetes. Then, for some reason, something upstream happens in your regular system code, and now we're getting a bunch of 65 year olds coming into our model. The distribution in that one feature, age, has shifted significantly, and that's what shifts the model's outcomes. You want to know when these changes are occurring, because your model was built under a certain set of parameters for a certain use case. You can say it's unbiased for that use case of 20 to 40 year old individuals, but if something changed and you're only getting 65 plus, your model may no longer be accurate or relevant. Those are the types of drift you want to be aware of, and you want a control in place so you can see when they're occurring. Because most of the time, when people have bias, or something bad happens in their model, or performance degrades, it's because of either a feature drift or a model drift.
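
A minimal Python sketch of how both checks might look in practice, using Andrew's diabetes and age example. The thresholds, distributions, and significance level are illustrative assumptions, not prescriptions.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_age = rng.normal(30, 5, 10_000)   # training population: roughly 20-40 year olds
recent_age = rng.normal(65, 5, 1_000)   # what production has received this week

# Feature drift: compare one input's training vs. recent distribution
# with a two-sample Kolmogorov-Smirnov test.
stat, p_value = ks_2samp(train_age, recent_age)
if p_value < 0.01:
    print(f"feature drift in 'age' (KS statistic {stat:.2f})")

# Model drift: compare the expected vs. observed rate of positive outcomes.
expected_rate = 0.50    # training expectation: 40-60% positive
observed_rate = 0.90    # observed over the last week
if abs(observed_rate - expected_rate) > 0.10:
    print(f"model drift: positive rate {observed_rate:.0%} vs expected {expected_rate:.0%}")
```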

Conor: 

Okay. So you mentioned what is probably the first focus area for internal auditors and performance auditors: looking at the full life cycle and the coverage in place, from a system perspective. What's the next thing they should perhaps focus on?

Andrew: 

If you've already looked at the life cycle and seen that something's in place there, then, as you've previously mentioned, you definitely want to focus on the data. Specifically, what is your data governance? Is the data documented? Do you have good controls around data acquisition? Are people looking at the data for bias? There are ways to check. A great recent example is image classification models, like some Amazon has used in the past. A lot of the primary data sets people experiment on have 80 or 90% white faces. There's nothing wrong with that in itself; there is something wrong with it if you then apply it to a demographic that doesn't line up. A lot of these face identifiers will be accurate if you're a white individual, but if you're an African-American or Asian individual, for instance, they're going to be extremely inaccurate, because they were trained on the wrong data. So if your model is meant to identify differences between white faces: no problem. But if your model is supposed to identify differences between faces generally, you have a major problem. Your data needs to represent whatever demographic cross-section the model is built for - the same point I made about model drift: you need to know what your model is built for. If we're trying to take a cross-section of the American population, your data had better not be 80% white, is what I'm saying. You need to have it balanced so it's representative. So if you're an auditor, you need to understand the context of what the model is trying to do, and then focus on whether the data matches that demographic and use case. Same with the diabetes model that was initially for 20 to 40 year olds: data that looks like 20 to 40 year olds is a different data set than one for 65 plus. Understand what the model and the system are doing, make sure the data lines up with what they should be doing, and make sure the appropriate checks were in place. It depends on the context. So number one is understanding the life cycle - what is this model doing, all those sorts of things - and then understanding whether the data lines up with that.
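
One way an auditor might test that representativeness claim is to compare the demographic make-up of the training data against the population the model is meant to serve. A minimal sketch; the file, column name, reference shares, and tolerance are all hypothetical.

```python
import pandas as pd

# Training data with a demographic column (hypothetical file and column).
train = pd.read_csv("training_data.csv")
observed = train["ethnicity"].value_counts(normalize=True)

# Illustrative target-population shares the model is supposed to serve.
reference = pd.Series({"white": 0.60, "black": 0.13, "asian": 0.06, "other": 0.21})

for group, expected_share in reference.items():
    gap = observed.get(group, 0.0) - expected_share
    if abs(gap) > 0.05:    # flag gaps above an agreed tolerance
        print(f"{group}: training share off by {gap:+.0%} vs target population")
```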

Yusuf: 

Yeah, and you can even go a bit beyond that - and this is probably something you get involved in often - sometimes when you don't have a particular field or variable or indicator in your data, there might be another one that is indicative. You may have no ethnicity listed in your data, for example, but you may have a particular postal code or geographic area where there's a known concentration of a particular ethnicity. So if you're looking to determine whether there is bias, you may need to look at indicators other than the ones that are directly related.

Andrew: 

Completely, completely. You have to be aware of those proxies and have some sort of checks around them - that would be part of digging in further on the data. As an auditor, once you have the business understanding and you know whether the data is representative of the population, like we talked about, the next step is understanding what the proxies are and whether you're aware of them lining up - like zip code. It's a common thing people will find: oh, we're not being discriminatory on race or gender - but, by the way, we're using zip code. Well, guess what, that's probably going to be a little biased against certain lower-income groups, for instance. So you need to make sure you have checks. And if you have to use zip code - something like a routing or package system may have to - great, but you need explicit monitoring around zip code to make sure bias does not come into play.
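
To illustrate what "explicit monitoring around zip code" might look like, here is a minimal sketch that groups model decisions by the median income of each zip code and compares outcome rates across the groups. The file, columns, banding, and threshold are assumptions for illustration.

```python
import pandas as pd

# Hypothetical decision log: zip_code, median_income of that zip, approved (0/1).
decisions = pd.read_csv("model_decisions.csv")

# Band zip codes into income quartiles to surface zip code acting as an income proxy.
decisions["income_band"] = pd.qcut(
    decisions["median_income"], 4, labels=["low", "mid-low", "mid-high", "high"]
)
approval_by_band = decisions.groupby("income_band", observed=True)["approved"].mean()
print(approval_by_band)

# A large spread between bands is a signal to investigate, not proof of bias.
spread = approval_by_band.max() - approval_by_band.min()
if spread > 0.10:
    print(f"approval rates differ by {spread:.0%} across income bands - investigate")
```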

Conor: 

What are some of the common challenges or questions that your clients put to you when you're working with them? Are there any common challenges that they face?

Andrew: 

Yes. Every client is different, but if I had to choose one overarching thing I've been asked a lot over the last couple of months, it's: how do I know my model's not biased? That's the number one thing on most people's minds in the U.S. at the moment, especially in the insurance sector. We've talked about a lot of approaches in the last few minutes, and it's one of those questions where the answer is "it depends" - there's no single bullet answer. Using the full Monitaur platform - all the risk controls plus a monitoring solution - you can get pretty close, within a degree of assurance. But bias involves a lot of moving pieces you have to make sure are working right. Most of the time, the steps we've already talked about on this call will mitigate your bias. But there's no one magic bullet. I've heard people ask: can I just have one metric that I keep an eye on, and I'll know everything's good to go? Sadly, no.
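
Part of why no single metric settles the question is that standard fairness measures can disagree with each other. A minimal sketch computing two common measures side by side; the arrays are toy placeholders and the metric choice is illustrative.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])     # actual outcomes
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])     # model decisions
group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

def selection_rate(mask):
    # Share of the group that received a positive decision.
    return y_pred[mask].mean()

def true_positive_rate(mask):
    # Share of the group's actual positives that the model approved.
    positives = mask & (y_true == 1)
    return y_pred[positives].mean()

a, b = group == "a", group == "b"
print("demographic parity gap:", selection_rate(a) - selection_rate(b))
print("equal opportunity gap: ", true_positive_rate(a) - true_positive_rate(b))
```

With these toy numbers the selection rates match but the true positive rates do not - one reason a single KPI can look fine while another measure flags a problem.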

Conor: 

Do you think that question arises because there's been an issue within the organization, or are they generally just being proactive by asking it, or a mixture of that and other things?

Andrew: 

It's a mixture. Especially in insurance, and in the U.S. in general - and it's probably the same in Australia - there's a lot of emphasis lately on bias, because a lot of people are still uncomfortable with algorithms. Which is why, at Monitaur, we're trying to make people feel more comfortable and provide controls so you can be confident. A lot of people are scared; they don't trust machines, and they don't trust what the people behind the machines are doing. They think that because it's a model making the decision, it's going to be biased, it's going to be against them. And most of the time the news media will find those couple of examples where a model went wrong and blare them everywhere - why it's wrong. So I think a major factor is that confirmation bias: your algorithm might be working great in a lot of places, but one company messes up and it gets fanned everywhere. People have been hearing that about a lot of companies, including Apple and Amazon, so they're like, I don't want to be that next company. That's a big contributor, as well as regulatory pressure around making sure your models are not biased and don't have any sort of disparate impact.

Yusuf: 

Speaking of Australia being similar: in the insurance industry here, the general insurance code of conduct that was refreshed recently actually makes specific mention of making sure that your models are subject to regular independent review, to ensure there is no discrimination inherent in those models.

Andrew: 

That's good, that's good. But that's what's causing a lot of companies to go, okay, drat, what do I do? And executives want something they can rely on - a KPI they can keep an eye on. That's hard. It's a problem we're actively working on, and as an industry everyone's working on it. But it really comes down to this: there are no quick fixes. You've got to do the regular blocking and tackling - the good controls that finance and even IT auditors have had in place for a while. As we're discussing here, this is a pretty new concept: taking back-office model review techniques that work on aggregates and moving them to these front-line, mission-critical decision applications. That's where a big shift is occurring.

Yusuf: 

Most of what we've been talking about assumes a machine learning program or model is already in place. Should auditors, or even risk professionals, be looking to help their organizations identify opportunities to use models, or to better use machine learning, to improve efficiency and effectiveness? A lot of what audit focuses on is: there's a machine learning model, or some algorithm running, that I need to get across and make sure is working properly, isn't biased, can be trusted, is transparent, etc. What about the flip side, where, as auditors, we need to be providing positive outcomes by helping improve efficiency and effectiveness - where opportunities are identified and there's potential to use machine learning or a model, we should be recommending that? Have you seen any of that? Is that something that's been on your radar at all?

Andrew: 

Yes, definitely. A thinking audit department should take on that trusted advisor role we talk about a lot in the audit profession. If you see that something would be a good opportunity - it's not an audit report, but you can make recommendations on those things - I definitely think auditors should lean into that. Just make sure, before doing so, that you're very much abreast of best practices and what a good process for starting with machine learning looks like. Advice is always better received if there's a plan in place and good ways to do it. If an auditor says, hey, this would be a lot better if we applied machine learning in this area and made it more efficient - and they understand the relevant regulations and how to help guide the business to do it in a responsible way - their stock is going to go way up, because the last thing you want to do is recommend something and then not be able to assure it correctly. Auditors should be on top of what machine learning is, how it works, and what good controls look like, and definitely start identifying where in the organization processes can be improved, because that keeps internal audit's standing as a trusted advisor and a valuable business partner going up.

Conor: 

You've done a lot of study and professional development over the years. Just interested to understand what's next for you in terms of that ongoing professional development?

Andrew: 

I'm currently wrapping up a part-time PhD in economics that I've been working on for several years, with a computational focus: where can we apply machine learning to economics? That's been a continuing education aspect for me and really good for enhancing my research chops. After I finish that, I'll do a bit more applied machine learning research around bias. I'm a certified AWS associate for architecting - how to build, you know, a SaaS-type solution on AWS - and I'll probably pursue a few more AWS certifications, maybe the professional grade, and just keep reading anything I can get my hands on about machine learning, modeling, and assurance.

Yusuf: 

There are excellent resources on your website, including a newsletter that we subscribe to and try to consume as often as we get time to - lots of interesting information in there. What's the best way for people to get hold of you?

Andrew: 

Go to our website. We have a lot of information on there. We have a contact form where you can also download our white paper that talks about machine learning assurance, provides some background. We have some helpful resources there. I monitor the contact form daily. Just drop us a line there. More than happy to have a one-on-one with you if you want. We can talk about different ways to learn about machine learning. I'd love to hear from anybody that wants to learn more about machine learning assurance or Monitaur. But monitaur.ai is our website and it's a great jumping off spot.

Yusuf: 

We'll put a link in the show notes to make it easy for people to just click and find you. Thank you very much for joining us today. Good insights in that conversation for internal auditors and performance auditors. So thank you for that.

Andrew: 

Thank you so much for the opportunity to talk to you guys and be on the podcast. I really enjoyed it and look forward to being in touch soon.

Conor: 

Excellent. Thank you, Andrew.

Narrator: 

If you enjoyed this podcast, please share with a friend and rate us in your podcast app. For immediate notification of new episodes, you can subscribe at assuranceshow.com. The link is in the show notes.
