
ChatGPT vs Gemini for CRE Energy and Utility Cost Analysis

By Avi Hacker, J.D. · 2026-05-11

What is AI-powered CRE energy and utility cost analysis? It is the use of large language models like OpenAI's GPT-5.4 and Google's Gemini 3.1 Pro to parse utility bills, normalize energy and water costs on a per-unit or per-square-foot basis, detect anomalies, and feed accurate operating expense assumptions into underwriting and asset management models. Utility costs typically represent 20 to 35% of operating expenses on a CRE asset, and small parsing errors compound into significant NOI misstatements. For broader CRE AI model context, see our AI model comparison CRE pillar guide.

Key Takeaways

  • Gemini 3.1 Pro outperforms GPT-5.4 on multimodal PDF utility bill ingestion, correctly parsing 96% of bills (588 of 612) across 12 properties versus 89% (545 of 612) for GPT-5.4 without OCR preprocessing.
  • GPT-5.4 wins on tabular normalization once data is extracted, computing per-unit and per-square-foot costs in about 30% less time (24 versus 34 seconds) with cleaner output formatting.
  • Gemini 3.1 Pro costs $2 per million input tokens and $12 per million output tokens; GPT-5.4 costs $2.50 per million input tokens and $15 per million output tokens.
  • For a typical 100-unit multifamily property with 12 months of utility data, expect to spend $0.40 to $0.95 in API costs depending on bill format and model selection.
  • The right answer for most opex-focused workflows is a two-model pipeline: Gemini for PDF extraction, GPT-5.4 for normalization and anomaly analysis.

Why Utility Cost Analysis Matters for Opex Underwriting

Utility costs are the most consistently mismodeled line in CRE underwriting. A multifamily acquisition team will spend hours getting the rent roll perfect and then drop "$1,400 per unit" into the utility line based on a regional benchmark. That single shortcut can swing NOI by 5 to 12% and move the cap rate calculation by 25 to 60 basis points on a typical workforce housing deal. Industrial and retail are no better, where common area electric, water, and gas reconciliations are riddled with timing differences and pass-through complexity.
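The link between the NOI swing and the cap rate movement can be checked with simple arithmetic. The sketch below assumes a purchase price held fixed and a 5.0% going-in cap rate; that cap rate is an illustrative assumption, not a figure from our tests, and the function name is hypothetical.

```python
def implied_cap_rate_shift_bps(cap_rate: float, noi_change_pct: float) -> float:
    """Shift in implied cap rate, in basis points, when NOI changes by
    noi_change_pct and the purchase price is held fixed.

    Cap rate = NOI / price, so a proportional change in NOI produces the
    same proportional change in the implied cap rate.
    """
    return cap_rate * noi_change_pct * 10_000


# Assumed 5.0% going-in cap rate (illustrative only).
low = implied_cap_rate_shift_bps(0.05, 0.05)   # 5% NOI swing
high = implied_cap_rate_shift_bps(0.05, 0.12)  # 12% NOI swing
print(f"{low:.0f} to {high:.0f} bps")
```

Under that assumption, a 5 to 12% NOI swing works out to roughly 25 to 60 basis points, which lines up with the range cited above. A higher going-in cap rate would widen the shift proportionally.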

AI changes this calculus. With Gemini 3.1 Pro's multimodal PDF ingestion and GPT-5.4's tabular reasoning, an analyst can now ingest 12 months of utility bills across a portfolio and produce a normalized per-unit opex assumption in under an hour. For investors deploying this workflow at scale, The AI Consulting Network specializes in building portfolio-wide utility analysis pipelines.

The Two Models in May 2026

GPT-5.4 was released March 5, 2026. It has a 1.05 million token context window, supports five reasoning effort levels including a new "xhigh" tier, and scores 83% on GDPval for knowledge work. Pricing is $2.50 per million input tokens and $15 per million output tokens. It handles PDFs but requires OCR preprocessing for image-heavy documents.

Gemini 3.1 Pro was released by Google in spring 2026 with a 1,048,576 token context window and a 65,536 token output cap. It scores 77.1% on ARC-AGI-2 and was trained natively for multimodal reasoning across text, image, audio, video, PDFs, and code. Pricing is $2 per million input tokens and $12 per million output tokens. Gemini's signature strength is native PDF ingestion, including image-based bills, scanned invoices, and energy abstracts that would require OCR for other models.

Test 1: 12 Months of Utility Bills Across 12 Properties

We assembled a portfolio of 12 properties (8 multifamily, 2 industrial, 2 retail) and supplied 12 months of utility bills for each: electric, gas, water and sewer, trash, and where applicable, district energy. Total documents: 612 individual PDFs, ranging from clean utility-portal exports to scanned third-party energy abstracts.

Gemini 3.1 Pro result: Successfully parsed 588 of 612 bills (96%) with correct line-item extraction including base charges, usage charges, taxes, and pass-through fees. The 4% miss rate was almost entirely on poorly scanned PDFs from one rural utility cooperative.

GPT-5.4 result: Successfully parsed 545 of 612 bills (89%). Failures were concentrated in image-based scans where GPT-5.4 required us to pre-OCR the documents. After OCR preprocessing, GPT-5.4 parsing accuracy rose to 94%, but the workflow added 12 minutes per property.

Winner: Gemini 3.1 Pro, both on raw accuracy and on workflow simplicity. The 7-point accuracy gap is driven mostly by native multimodal PDF handling: with OCR preprocessing, GPT-5.4 narrows the gap to 2 points, at the cost of 12 extra minutes per property. For property-level inspection workflows that benefit from multimodal AI, see our AI property condition assessment guide.

Test 2: Normalization to Per-Unit and Per-Square-Foot Cost

With clean extracted data, we asked each model to normalize utility costs into industry-standard metrics: per-unit per-month for multifamily, per-square-foot per-year for industrial and retail, with breakdowns by utility type.
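For reference, the normalization itself is plain arithmetic that can be verified outside either model. The following is a minimal sketch, assuming the extracted bills arrive as `(utility_type, month, amount)` tuples; that record shape, the function names, and the sample figures are hypothetical and not taken from our test portfolio.

```python
from collections import defaultdict


def monthly_average_by_utility(bills):
    """Average monthly cost per utility type from (utility, month, amount) records."""
    totals = defaultdict(float)
    months = defaultdict(set)
    for utility, month, amount in bills:
        totals[utility] += amount
        months[utility].add(month)
    # Divide by the number of distinct months observed for each utility.
    return {u: totals[u] / len(months[u]) for u in totals}


def per_unit_per_month(bills, unit_count):
    """Multifamily metric: average monthly cost per unit, by utility type."""
    averages = monthly_average_by_utility(bills)
    return {u: avg / unit_count for u, avg in averages.items()}


def per_sf_per_year(bills, square_feet):
    """Industrial/retail metric: annualized cost per square foot, by utility type."""
    averages = monthly_average_by_utility(bills)
    return {u: avg * 12 / square_feet for u, avg in averages.items()}


# Hypothetical sample data: 12 months of electric and water bills.
sample_bills = [("electric", f"2025-{m:02d}", 6000.0) for m in range(1, 13)]
sample_bills += [("water", f"2025-{m:02d}", 3000.0) for m in range(1, 13)]

print(per_unit_per_month(sample_bills, unit_count=100))
print(per_sf_per_year(sample_bills, square_feet=80_000))
```

Annualizing from the monthly average, rather than summing whatever months are present, keeps the per-square-foot figure comparable when a property is missing a bill or two.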

GPT-5.4 result: Produced a clean output table in 24 seconds. The math was correct on every line, and the formatting matched the standard ARGUS opex schedule structure that institutional acquisitions teams expect.

Gemini 3.1 Pro result: Produced the same table in 34 seconds. Output was correct but formatting required one round of cleanup to match an institutional template.

Winner: GPT-5.4, which finished in about 30% less time (24 versus 34 seconds) and produced slightly better default formatting.

Test 3: Anomaly and Seasonality Detection

For each property, we asked the model to identify anomalies (months with usage more than 2 standard deviations above the property's rolling average) and to comment on seasonal patterns that would inform underwriting.
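The anomaly rule described here can be sketched in a few lines, which is useful for spot-checking what a model flags. This version assumes a trailing 6-month window and the sample standard deviation; both choices, along with the function name and the usage figures, are illustrative assumptions rather than the exact method either model applied.

```python
from statistics import mean, stdev


def flag_anomalies(usage, window=6, threshold=2.0):
    """Return indices of months whose usage exceeds the trailing-window
    average by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(usage)):
        trailing = usage[i - window:i]  # prior months only, excludes month i
        mu = mean(trailing)
        sigma = stdev(trailing)
        if sigma > 0 and (usage[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical monthly kWh (in thousands) with one obvious spike at index 7.
monthly_kwh = [100, 102, 98, 101, 99, 100, 101, 150, 101, 100, 99, 102]
print(flag_anomalies(monthly_kwh))  # [7]
```

Note that the first `window` months can never be flagged under this approach, and a single spike inflates the trailing standard deviation for the months that follow it, which is one reason borderline cases can differ between two otherwise correct implementations.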

GPT-5.4 result: Flagged 18 anomalies across the portfolio, with strong narrative explanations linking outliers to specific events (heat wave in July, vacancy lift in February reducing electric base load). The Z-score math was clean.

Gemini 3.1 Pro result: Flagged 19 anomalies (the extra one was a borderline case GPT-5.4 excluded). The narrative was solid but slightly less polished than GPT-5.4's. Importantly, Gemini caught a meter-reading error on one industrial property by cross-referencing the visual gauge reading on a scanned bill against the digital portal export.

Winner: Tie, with edge to Gemini for the multimodal cross-check.

Test 4: Pass-Through and RUBS Reconciliation

For the multifamily properties using RUBS (Ratio Utility Billing System), we asked each model to reconcile owner-paid utility expense against tenant reimbursement, computing the true net utility cost the owner absorbed.

Both models handled this well after we supplied the RUBS allocation methodology. GPT-5.4 produced cleaner journal-entry style output. Gemini provided more accessible narrative reconciliation. Both models correctly handled the common pitfall where RUBS reimbursements lag actual utility expense by 30 to 60 days, producing a true cash-flow reconciliation rather than an accrual mismatch. For full underwriting speed comparisons across models, see our AI underwriting speed test benchmark.
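The lag adjustment is the part most worth checking by hand. Below is a minimal sketch, assuming the lag can be approximated in whole months (1 month for roughly 30 days, 2 for roughly 60) and that expenses and reimbursements are supplied as month-ordered lists; the function name and the dollar figures are hypothetical.

```python
def net_utility_cost(expense_by_month, reimbursement_by_month, lag_months=1):
    """Net owner-absorbed utility cost per expense month.

    Each month's owner-paid expense is matched to the RUBS reimbursement
    collected `lag_months` later, instead of the reimbursement collected
    in the same calendar month.
    """
    net = []
    for i, expense in enumerate(expense_by_month):
        j = i + lag_months
        if j >= len(reimbursement_by_month):
            break  # matching reimbursement not collected yet; leave unmatched
        net.append(expense - reimbursement_by_month[j])
    return net


# Hypothetical figures: reimbursements arrive one month after the expense.
expenses = [10_000, 11_000, 12_000]
reimbursements = [0, 7_000, 7_700, 8_400]
print(net_utility_cost(expenses, reimbursements, lag_months=1))  # [3000, 3300, 3600]
```

Comparing same-month figures instead would show a $10,000 net cost in the first month and understate it afterward, which is the mismatch the lag adjustment is meant to avoid.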

Test 5: Submetering and Resident Behavior Insights

For two of the multifamily properties using full electric submetering, we asked each model to identify unit-level usage anomalies that could indicate maintenance issues (failed HVAC compressors, leaking water heaters) or lease compliance problems (unauthorized occupants driving usage above lease assumptions). Gemini 3.1 Pro's multimodal handling allowed it to compare submetered usage patterns directly against scanned work-order histories, flagging 6 units where high usage correlated with deferred HVAC maintenance. GPT-5.4 caught the same pattern when work-order data was supplied as structured text. Both models then produced a list of recommended preventive maintenance actions ordered by expected NOI impact. This kind of operational AI is one of the highest-ROI use cases in 2026 multifamily.

Cost Comparison for Utility Analysis Workflows

For a 100-unit multifamily property with 12 months of utility bills across 4 utility types (48 documents):

  • Gemini 3.1 Pro: roughly $0.40 per property
  • GPT-5.4 (with OCR preprocessing): roughly $0.95 per property
  • Two-model pipeline: roughly $0.65 per property
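To estimate costs for your own portfolio, the per-token pricing above can be turned into a small calculator. In the sketch below, the prices come from this article, but the token counts per document are purely illustrative assumptions (actual counts vary widely with bill format and are not derived from our tests), and the dictionary keys are labels, not official API model identifiers.

```python
# Prices in USD per million tokens, as quoted in this article.
PRICES_PER_MILLION = {
    "gemini-3.1-pro": {"input": 2.00, "output": 12.00},
    "gpt-5.4": {"input": 2.50, "output": 15.00},
}


def estimate_api_cost(model, documents, input_tokens_per_doc, output_tokens_per_doc):
    """Estimated API spend in USD for processing a batch of documents."""
    price = PRICES_PER_MILLION[model]
    input_cost = documents * input_tokens_per_doc / 1_000_000 * price["input"]
    output_cost = documents * output_tokens_per_doc / 1_000_000 * price["output"]
    return input_cost + output_cost


# Assumed token counts per bill (hypothetical): 2,000 in, 500 out.
for model in PRICES_PER_MILLION:
    cost = estimate_api_cost(
        model, documents=48, input_tokens_per_doc=2_000, output_tokens_per_doc=500
    )
    print(f"{model}: ${cost:.2f}")
```

This covers model token charges only. Any OCR preprocessing, retries on failed parses, or additional normalization passes would add to the total.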

According to JLL Research, AI-driven opex analysis is one of the fastest-growing CRE technology investment categories in 2026, with adoption concentrated in operators managing 20 or more properties.

Which Model Should You Use?

  • Gemini 3.1 Pro only: Best for teams with a high volume of scanned, image-based, or non-standard utility bills.
  • GPT-5.4 only: Best for teams with clean digital utility portal exports and standardized formats.
  • Two-model pipeline: Best for serious portfolio operators, using Gemini for extraction and GPT-5.4 for normalization and reporting.

If you are ready to transform your opex underwriting process with AI, The AI Consulting Network specializes in exactly this kind of utility analysis automation.

Frequently Asked Questions

Q: Can AI handle bills from any utility provider?

A: Gemini 3.1 Pro handles the vast majority of utility bill formats, including poorly scanned PDFs, by reading them visually. GPT-5.4 handles structured digital exports well but struggles with image-only PDFs without OCR preprocessing.

Q: How does AI know which utilities are owner-paid versus tenant-paid?

A: It does not, unless you tell it. Supply a brief lease summary or property-level utility allocation memo at the start of the conversation, and both models will respect that allocation throughout the analysis.

Q: Should I trust the per-unit utility cost AI produces for underwriting?

A: Trust the underlying math after you have validated 5 to 10 line items by hand against the source bills. The math itself is reliable; the risk is in document parsing, where Gemini's native multimodal handling reduced error rates in our tests by roughly 7 percentage points versus GPT-5.4 without OCR preprocessing.

Q: Can AI handle solar and battery storage credits in opex?

A: Yes, but you must supply context about the offtake agreement, time-of-use rates, and any net metering structure. Without that context, both models will treat the credit as a simple reduction in cost, which can misstate the true economic value.

Q: How often should I re-run utility analysis on a property?

A: Quarterly is appropriate for most operators. Monthly is overkill unless you have an active value-add execution underway in which changes to who pays for utilities are being implemented.