June 13, 2025

How max_binlog_cache_size Mismatch Broke Our Cluster

A critical case study where mismatched max_binlog_cache_size values between primary and replica caused replication failure, with actionable solutions.

Incident Overview

Our DMP monitoring platform raised a replication failure alert: the replica's SQL thread had stopped. Diagnostic commands revealed:

SHOW SLAVE STATUS\G
-- Error: Worker 1 failed executing transaction '44bbb836-...'
-- Error_code: 1197 (max_binlog_cache_size exceeded)

SELECT * FROM performance_schema.replication_applier_status_by_worker;
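To narrow that output to the workers that actually failed, the query can select only the error columns. This is a sketch that assumes MySQL 5.7 or later, where these columns are present in the table:

```sql
-- Show only applier workers that have recorded an error
SELECT CHANNEL_NAME,
       WORKER_ID,
       LAST_ERROR_NUMBER,
       LAST_ERROR_MESSAGE,
       LAST_ERROR_TIMESTAMP
FROM performance_schema.replication_applier_status_by_worker
WHERE LAST_ERROR_NUMBER <> 0;
```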


Critical Finding:
Primary's max_binlog_cache_size: 10GB
Replica's max_binlog_cache_size: 10MB
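
The two values can be read with a query like the following, run once on the primary and once on the replica. The column aliases and the megabyte conversion are added here only for readability:

```sql
-- Run on each node and compare the results side by side
SELECT @@hostname                                   AS node,
       @@GLOBAL.max_binlog_cache_size               AS max_binlog_cache_size_bytes,
       @@GLOBAL.max_binlog_cache_size / 1024 / 1024 AS max_binlog_cache_size_mb,
       @@GLOBAL.binlog_cache_size                   AS binlog_cache_size_bytes;
```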

Technical Deep Dive

Key Parameters

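max_binlog_cache_size: Upper limit on the binlog cache that a single transaction may use; a transaction that needs more fails with error 1197
binlog_cache_size: Size of the in-memory binlog cache allocated per client session; a transaction that outgrows it spills to a temporary file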

Why It Failed:
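A multi-statement transaction fit within the primary's 10GB limit and committed there. When the replica applied the same transaction and wrote it to its own binary log, it needed more than the replica's 10MB limit, so the applier worker failed with error 1197.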

Resolution Steps

1. Immediate Fix:

SET GLOBAL max_binlog_cache_size = 10240000000; -- Match the primary's 10GB setting
START SLAVE; -- Restarted applier threads pick up the new value
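
A value changed with SET GLOBAL does not survive a server restart. Below is a sketch of making the change durable, assuming MySQL 8.0 or later, where SET PERSIST is available. On earlier versions the usual alternative is to set the same value under [mysqld] in the option file.

```sql
-- Assumes MySQL 8.0+: applies the value at runtime and saves it to mysqld-auto.cnf
SET PERSIST max_binlog_cache_size = 10240000000;

-- Confirm the running value
SELECT @@GLOBAL.max_binlog_cache_size;

-- Confirm replication has resumed: Slave_SQL_Running should report Yes
SHOW SLAVE STATUS\G
```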


2. Prevention Checklist:

Audit parameter consistency cluster-wide
Monitor binlog usage trends
Set alerts for replication error 1197
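
The first two checklist items can be approximated with plain SQL. This is a minimal sketch, assuming the statements are run on every node and the results are compared by whatever tooling collects them:

```sql
-- Audit: binlog cache settings to compare across nodes
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM performance_schema.global_variables
WHERE VARIABLE_NAME IN ('max_binlog_cache_size', 'binlog_cache_size');

-- Monitor: Binlog_cache_use counts transactions that used the binlog cache;
-- Binlog_cache_disk_use counts those that outgrew binlog_cache_size
-- and spilled to a temporary file
SHOW GLOBAL STATUS LIKE 'Binlog_cache%';
```

A Binlog_cache_disk_use value that climbs relative to Binlog_cache_use suggests transactions are getting larger and may be approaching max_binlog_cache_size on the node with the lowest limit.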

Key Takeaways

Consistency Matters: Always verify parameter parity across replication nodes
Monitor Proactively: Track binlog growth for large transactions
Dynamic Adjustment: Know which parameters can be changed online

Critical Warning:
Mismatched binlog settings can silently break replication during large transactions!
