Coding diary

Short logs from day-to-day engineering work: things learned, bugs tracked down, small wins, and practical notes worth keeping.

Mar 5, 2026

Learning loop check-in

Reviewed the last 18 months of projects and writing to identify reliability skill gaps.

Feb 28, 2026

Observability gap incident replay

Replayed old incident logs and found one missing business-level metric.

Feb 2, 2026

AI-assisted infra review boundaries

Defined where AI review helps and where human operational judgment must remain final.

Jan 9, 2026

Kubernetes resource budget review

Default resource requests were outdated for two services after traffic growth.

Dec 3, 2025

Postmortem writing rubric

Drafted a simple rubric: timeline clarity, impact clarity, remediation quality.

Oct 26, 2025

Terraform module boundary refactor

Reduced hidden coupling between infra modules to make changes safer.

Sep 17, 2025

Runbook gap found during outage sim

Outage simulation revealed missing rollback preconditions in runbook docs.

Aug 2, 2025

Alert fatigue priority reset

Introduced severity + ownership tags so escalation paths are clearer.

Jun 21, 2025

Go CLI for on-call checks

Built a small CLI to validate service dependencies before high-risk deploys.

May 9, 2025

API rate limit policy adjustment

Tightened noisy endpoint limits and added safer burst controls.
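
The entry does not say which limiter was used. As an illustration only, a token bucket is one common way to cap sustained rate while still allowing a bounded burst. The class name and the `rate_per_s` and `burst` parameters below are hypothetical, not the actual policy.

```python
class TokenBucket:
    """Minimal token bucket: `rate_per_s` tokens refill per second,
    capped at `burst`. Hypothetical sketch, not the real policy."""

    def __init__(self, rate_per_s, burst, now=0.0):
        self.rate_per_s = rate_per_s
        self.burst = burst
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.last = now

    def allow(self, now):
        # Refill for the time elapsed since the last call, never above `burst`.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_per_s)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Lowering `burst` tightens how many back-to-back requests a noisy client can send, while `rate_per_s` sets the long-run ceiling. Passing `now` explicitly keeps the sketch easy to test without a real clock.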

Apr 4, 2025

CI/CD rollback drill

Ran a rollback rehearsal to validate deployment recovery under time pressure.

Mar 18, 2025

Dashboard signal cleanup

Removed vanity graphs and kept only operationally actionable metrics.

Feb 27, 2025

Incident timeline format template

Created a reusable timeline template so postmortems start with stronger factual structure.

Feb 10, 2025

Kubernetes probe audit

Found services using liveness probes as readiness checks.

Jan 24, 2025

Role-based admin guardrail

Reduced high-risk admin actions by splitting privileges by workflow stage.

Jan 3, 2025

Background job timeout review

Jobs kept retrying past the window in which their result was still relevant to the business.
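
A rough sketch of the idea, assuming exponential backoff and a per-job relevance window in seconds. The function name, parameters, and the injectable `clock` and `sleep` are hypothetical and only there to make the behavior easy to check.

```python
import time


def retry_until_deadline(task, relevance_window_s, base_delay_s=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Retry `task` with exponential backoff, but give up once the next
    attempt would start after the relevance window has closed."""
    deadline = clock() + relevance_window_s
    delay = base_delay_s
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            # A retry that starts after the deadline has no business value.
            if clock() + delay > deadline:
                raise TimeoutError(f"gave up after {attempts} attempts")
            sleep(delay)
            delay *= 2
```

The point of the deadline check is that the retry budget comes from how long the result stays useful, not from a fixed attempt count.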

Dec 12, 2024

CI cache invalidation lesson

A stale dependency cache produced misleading green builds.
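
The entry does not record the fix. One common guard, sketched here as an assumption, is to derive the cache key from the lockfile contents and the toolchain version so that any dependency change misses the old cache. The function name and the `toolchain` and `prefix` parameters are hypothetical.

```python
import hashlib


def cache_key(lockfile_bytes: bytes, toolchain: str, prefix: str = "deps") -> str:
    """Build a cache key that changes whenever the lockfile or the
    toolchain version changes."""
    digest = hashlib.sha256()
    digest.update(toolchain.encode("utf-8"))
    digest.update(b"\0")  # separator so the two inputs cannot run together
    digest.update(lockfile_bytes)
    return f"{prefix}-{digest.hexdigest()[:16]}"
```

With a key like this, a stale cache is never restored for a changed dependency set, so a green build reflects the dependencies actually declared.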

Nov 28, 2024

AWS IAM mistake that taught me scope

An overly broad policy worked but was operationally unsafe; moved to tighter role scoping.

Nov 5, 2024

Loki log structure cleanup

Structured JSON logs made incident triage much faster than free-form text logs.
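
A minimal sketch of what one-JSON-object-per-line logging can look like with Python's standard `logging` module. The field names `service` and `request_id` are illustrative assumptions, not the schema actually used.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        for key in ("service", "request_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload, sort_keys=True)
```

Attach it with `handler.setFormatter(JsonFormatter())` and log with `logger.error("payment failed", extra={"service": "billing"})`. During triage, filtering on a field is then a key lookup rather than a regex over free text.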

Oct 19, 2024

Deploy checklist v1

Started a pre-deploy checklist to reduce rushed release mistakes.

Oct 1, 2024

Payment retry edge case

A duplicate webhook re-ran an already completed payment path; idempotency key handling fixed it.
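
A toy sketch of idempotency key handling, assuming each webhook carries a unique key and that results are kept in a lookup store. The in-memory dict, class name, and method name are hypothetical; a real system would use a durable store and an atomic check-and-set.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentProcessor:
    """In-memory illustration only."""
    processed: dict = field(default_factory=dict)  # idempotency key -> result
    charges: list = field(default_factory=list)    # side effects that must not repeat

    def handle_webhook(self, idempotency_key: str, amount_cents: int) -> str:
        # A repeated key means the event was already handled: return the
        # stored result instead of running the payment path again.
        if idempotency_key in self.processed:
            return self.processed[idempotency_key]
        self.charges.append(amount_cents)
        result = f"charged:{amount_cents}"
        self.processed[idempotency_key] = result
        return result
```

Delivering the same webhook twice returns the same result and leaves exactly one entry in `charges`, which is the property the duplicate delivery had violated.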

Sep 14, 2024

First week with Prometheus alerting

Learned quickly that too many low-quality alerts are as risky as no alerts.