With the Lead and Copper Rule Improvements (LCRI) baseline inventory due November 1, 2027, utilities are working to reconcile thousands of unknown service lines. Joanna Cummings, Kristin Epstein, and Sandy Kutzing joined a panel discussion to share their experiences helping utilities nationwide solve this challenge with machine learning.
Rachel Herbst
How does machine learning help utilities reduce unknowns, and why is this so critical for 2027 compliance?
Sandy Kutzing
Machine learning is a tool for turning unknowns into predicted materials, either lead or non-lead. The key insight is that unknowns are included in the replacement calculation under LCRI, so reducing unknowns directly impacts a utility's required replacement numbers.
Joanna Cummings
Let me give you a concrete example. Say a utility has 5,000 known lead service lines and 20,000 unknowns. They might reasonably estimate that 20% of those unknowns are actually lead, giving them a realistic total of 9,000 lead lines to replace. But you can't use that assumption in 2027 without machine learning backing it up.
Sandy Kutzing
Exactly. Without machine learning, the calculation treats all 25,000 lines (the 5,000 known plus all 20,000 unknowns) as requiring replacement. That means replacing 25,000 service lines, or 2,500 per year, instead of the 9,000, or 900 per year, you actually expect to find. That's the nightmare scenario. It makes compliance virtually impossible because you would be committed to replacing thousands of lead service lines that don't actually exist.
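The arithmetic in this example can be checked with a short Python sketch. It is only an illustration of the panel's numbers, not the regulatory formula: the function name and parameters are hypothetical, and the 10-year replacement period is an assumption inferred from the 25,000 to 2,500 per year figures quoted above.

```python
def annual_replacements(known_lead, unknowns, lead_share_of_unknowns=None, years=10):
    """Return (lines counted toward replacement, lines per year).

    years=10 is an assumption inferred from the panel's figures.
    """
    if lead_share_of_unknowns is None:
        # No accepted predictions: every unknown is counted as if it were lead.
        counted = known_lead + unknowns
    else:
        # Predictions accepted: only the expected lead share of unknowns counts.
        counted = known_lead + round(unknowns * lead_share_of_unknowns)
    return counted, counted / years


without_model = annual_replacements(5_000, 20_000)
with_model = annual_replacements(5_000, 20_000, lead_share_of_unknowns=0.20)

print(without_model)  # (25000, 2500.0)
print(with_model)     # (9000, 900.0)
```

The gap between the two results, 1,600 lines per year, is the extra replacement burden the panelists describe.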
Rachel Herbst
So this is especially critical for utilities with more unknowns than confirmed lead service lines?
Sandy Kutzing
Absolutely. If you have extensive lead service lines as a percentage of your system, it's not as big an issue because you can theoretically hit the annual rate anyway. But if you suspect your unknowns are mostly non-lead and you just haven't gone out to prove it yet, it’s going to be very challenging to stay in compliance since verifications of non-lead do not count towards meeting the annual replacement rate.
Rachel Herbst
Are there limitations? Is machine learning as effective for smaller systems?
Kristin Epstein
Machine learning is an iterative process that runs alongside continued field investigations. The model is periodically re-run with the latest field verification data to generate up-to-date predictions. It can be very helpful for smaller systems with unknowns, especially where there are more unknowns than staff can physically investigate by November 2027. Before hiring a potholer to investigate every unknown service line, consider what other inspection data can be leveraged and combined into your investigation data set.
For example, staff observe service lines during repair work, code officials observe private-side lines during house calls, customers submit identifications through your online survey, and contractors observe pipe material during the main repairs or meter replacements they are already doing for you. By harnessing these extra inspections, a smaller system can build its data set for modeling. Potholing can then fill the gaps where more data is needed to better inform the model. Overall, machine learning reduces the number of costly, disruptive dedicated inventory inspections, making it cost-effective even for smaller systems.
Rachel Herbst
How does machine learning help beyond just the replacement rate numbers?
Joanna Cummings
I think of it as formalizing utility staff's institutional knowledge. When I talk to utilities, operators usually say things like "there's lead in this area, not much lead in that area." You can't use that knowledge in your inventory in most states, but if you run a machine learning model with enough data, it should pick up on similar features that correlate with lead presence. It makes that operator knowledge visible and defensible to regulators.
Kristin Epstein
It's also useful for construction efficiency. If you are looking to start replacements, machine learning increases your hit rate for both potholing and replacement work, even if some predictions are still unconfirmed.
Sandy Kutzing
Some utilities worry about using a model to determine whether someone has lead or not. But you don't have to rely on it as your only source of information. You can run a model, use it in your inventory for 2027 to get the unknown numbers down, and still investigate those locations physically on your own schedule. It gives you more time to do those inspections without being penalized with the replacement rate calculation.
Joanna Cummings
It buys you time at the beginning and can also help you finish at the end. When you think you've gotten most of the lead out and you're 80% confirmed in your system, machine learning can help you tackle that last 20%.
Rachel Herbst
What should utilities know about implementation timing?
Sandy Kutzing
With only two years remaining before the LCRI's baseline inventory is due, utilities will likely need to do several hundred to several thousand physical inspections to achieve initial confidence in their models. But that's still much more manageable than trying to inspect everything.
Joanna Cummings
The earlier you start, the better. You can begin building the model and using it to target field investigations before you fully commit to using it for inventory material identifications. And as you gather more data, you get more confident because the model keeps improving.
Sandy Kutzing
Model costs vary greatly depending on system size and complexity. But assume an average investment of $200,000 for a model covering 50,000 service lines, with test pits at $500 per service line. As long as the model saves you from performing 400 test pits at some point, you're at break-even, and you can typically avoid a lot more test pits than that!
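The break-even figure can be reproduced with a small Python sketch. The function names are hypothetical, and the 2,000 avoided test pits in the second call is an illustrative assumption, not a number from the panel. A utility would substitute its own model quote and unit costs.

```python
import math


def break_even_test_pits(model_cost, cost_per_test_pit):
    """Number of avoided test pits needed to recover the model cost."""
    return math.ceil(model_cost / cost_per_test_pit)


def net_savings(model_cost, cost_per_test_pit, test_pits_avoided):
    """Savings from avoided test pits minus the model cost (negative below break-even)."""
    return test_pits_avoided * cost_per_test_pit - model_cost


# Figures quoted by the panel: $200,000 model, $500 per test pit.
print(break_even_test_pits(200_000, 500))  # 400

# Hypothetical scenario: the model lets the utility skip 2,000 test pits.
print(net_savings(200_000, 500, 2_000))    # 800000
```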
Rachel Herbst
Any final thoughts on why machine learning is critical for 2027 compliance?
Joanna Cummings
At the end of the day, if you have many unknowns (typically more than 5,000, though it can be worth it for some smaller systems as well), machine learning is still your best available option. You'll need additional verifications, but it's the most cost-effective approach when you're facing that kind of scale.
Kristin Epstein
You will need to keep running and improving the model until it achieves the target metrics and shows stable performance over time, but it helps you avoid artificially large replacement rates early on, saving you time and money. It also focuses your efforts to get started on a replacement program, bolsters requests for funding, and helps prioritize project areas.
Sandy Kutzing
The key is understanding that you're still going to do physical inspections, at minimum for the initial model development and for the validation study. But you won't need to physically inspect 100% of the unknown service lines in your system, and that's what makes this approach so valuable for meeting the November 2027 requirements.

