Why are the leading utilities choosing machine learning for 2027 LCRI compliance?
With the Lead and Copper Rule Improvements (LCRI) baseline inventory due November 1, 2027, utilities are scrambling to reconcile thousands of unknown service lines. Joanna Cummings, Kristin Epstein, and Sandy Kutzing joined a panel discussion to share their experiences helping utilities nationwide solve this challenge with machine learning.
How does machine learning help utilities reduce unknowns, and why is this so critical for 2027 compliance?
Sandy: Machine learning is a tool for turning unknowns into predicted materials, either lead or non-lead. The key insight is that unknowns are included in the replacement calculation under LCRI, so reducing unknowns directly impacts a utility's required replacement numbers.
Joanna: Let me give you a concrete example. Say a utility has 5,000 known lead service lines and 20,000 unknowns. They might reasonably estimate that 20% of those unknowns are actually lead, giving them a realistic total of 9,000 lead lines to replace. But you can't use that assumption in 2027 without machine learning backing it up.
Sandy: Exactly. Without machine learning, the calculation treats all 25,000 lines (the 5,000 known plus all 20,000 unknowns) as requiring replacement. You're looking at replacing 25,000 service lines, or 2,500 per year, instead of the 9,000, or 900 per year, you actually expect to find. That's the nightmare scenario. It makes compliance virtually impossible because you're on the hook for replacing thousands of extra lead service lines that don't actually exist.
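The arithmetic in Joanna and Sandy's example can be sketched in a few lines of Python. This is only an illustration of the numbers quoted above, not the regulatory formula: the 10% annual rate is inferred from the panel's figures (25,000 lines becoming 2,500 per year), and the function and parameter names are hypothetical.

```python
# Illustrative sketch of the panel's example, not the official LCRI calculation.
# ASSUMPTION: a 10% annual rate, inferred from "25,000 ... or 2,500 per year".
ANNUAL_RATE = 0.10


def replacement_pool(known_lead, unknowns, predicted_lead_share=None):
    """Return (lines counted toward replacement, lines per year).

    predicted_lead_share is a hypothetical parameter: the fraction of
    unknowns that an accepted model predicts to be lead. When it is None,
    every unknown is counted as if it required replacement.
    """
    if predicted_lead_share is None:
        counted_unknowns = unknowns
    else:
        counted_unknowns = round(unknowns * predicted_lead_share)
    total = known_lead + counted_unknowns
    return total, round(total * ANNUAL_RATE)


without_ml = replacement_pool(5_000, 20_000)
with_ml = replacement_pool(5_000, 20_000, predicted_lead_share=0.20)

print(without_ml)  # (25000, 2500)
print(with_ml)     # (9000, 900)
```

The gap between the two results, 16,000 lines in total or 1,600 per year, is the set of replacements the utility would be accountable for even though it does not expect those lines to be lead.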
If you have over 5,000 unknowns and you think they're mostly non-lead, you absolutely want to be considering machine learning before 2027.
So this is especially critical for utilities with more unknowns than confirmed lead service lines?
Sandy: Absolutely. If you have extensive lead service lines as a percentage of your system, it's not as big an issue because you can theoretically hit the annual rate anyway. But if you suspect your unknowns are mostly non-lead and you just haven't gone out to prove it yet, it’s going to be very challenging to stay in compliance since verifications of non-lead do not count towards meeting the annual replacement rate.
Joanna: Right, it's a much bigger problem if most of your unknowns aren't actually lead. Some utilities have unknown categories that are mostly non-lead, while others are mostly lead. If you know your unknowns are probably lead, it's not a big deal to call them unknowns. But if you know it's only a small percentage of lead and the remainder are likely to be non-lead, that's when you run into major problems.
Are there limitations? Is machine learning as effective for smaller systems?
Kristin: We generally say you need somewhere between 10% and 20% physical inspections to get a stable machine learning model in a varied, complex system, but smaller systems might require more than 20%. It's really about your overall system size. If you have a 50,000-service-line system and you've already done 45,000 field verifications, machine learning can help you tackle those last 5,000 unknowns. Conversely, if you have 50,000 unknowns, you may only need to physically verify 5,000 to 10,000 to make predictions for the other 40,000 to 45,000 unknowns. In both cases, machine learning is very useful for reducing the number of field verifications a utility needs to complete.
How does machine learning help beyond just the replacement rate numbers?
Joanna: I think of it as formalizing utility staff's institutional knowledge. When I talk to utilities, operators usually say things like "there's lead in this area, not much lead in that area." You can't use that knowledge in your inventory in most states, but if you run a machine learning model with enough data, it should pick up on similar features that correlate with lead presence. It makes that operator knowledge visible and defensible to regulators.
Kristin: It's also useful for construction efficiency. If you are looking to start replacements, machine learning increases your hit rate for both potholing and replacement work, even if some predictions are still unconfirmed.
Sandy: Some utilities worry about using a model to determine whether someone has lead or not. But you don't have to rely on it as your only source of information. You can run a model, use it in your inventory for 2027 to get the unknown numbers down, and still investigate those locations physically on your own schedule. It gives you more time to do those inspections without being penalized with the replacement rate calculation.
Joanna: It buys you time at the beginning and helps with finishing at the end. When you think you've gotten most of the lead out and you're 80% confirmed in your system, machine learning can help you tackle that last 20%.
No other approach matches machine learning's cost and speed to turn unknowns into categorized materials.
What should utilities know about implementation timing?
Sandy: With only two years remaining before the LCRI’s baseline inventory is due, utilities will likely need to do several hundred to several thousand physical inspections to achieve initial confidence in their models. But that's still much more manageable than trying to inspect everything.
Joanna: The earlier you start, the better. You can begin building the model and using it to target field investigations before you fully commit to using it for inventory material identifications. And as you gather more data, you get more confident because the model keeps improving.
Sandy: Model costs vary greatly depending on size and complexity, but assuming an average investment of $200,000 and test pits at $500 per service line, you break even as soon as the model saves you from performing 400 test pits, and you can typically avoid a lot more physical inspections than that.
Machine learning helps save time and money by reducing unknowns upfront—and avoids inflating your replacement rate.
Any final thoughts on why machine learning is critical for 2027 compliance?
Joanna: At the end of the day, if you have many unknowns (typically more than 5,000, though it can be worth it for some smaller systems as well), machine learning is still your best available option. You'll need additional verifications, but it's the most cost-effective approach when you're facing that kind of scale.
Kristin: You will need to continue running and improving the model until it achieves the target metrics, but it helps you avoid artificially large replacement rates early on, saving you time and money. It also focuses your efforts to get started on a replacement program, bolsters requests for funding, and helps prioritize project areas.
Sandy: The key is understanding that you're still going to do physical inspections, at minimum for the initial model development and for the validation study. But you won’t need to physically inspect 100% of the unknown materials in your system, and that's what makes this approach so valuable for meeting the November 2027 requirements.

Machine learning gives you more time to perform inspections without being penalized by the replacement rate calculation.

At the end of the day, if you have many unknowns, machine learning is your best available option.

Machine learning accelerates replacement programs, supports funding requests, and prioritizes projects.