observability tools
Infrastructure Monitoring with Mark Carter
At Google, the job of a site reliability engineer involves building tools that automate infrastructure operations. If a server crashes, automation is already in place to create a replacement.
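To make the idea of automatic replacement concrete, here is a minimal sketch of a reconcile step in Python. It is not Google's tooling: the `Server` type, the `reconcile` function, and the `make_replacement` helper are all hypothetical names invented for illustration, and "creating a server" is simulated by building an object in memory rather than calling any real provisioning API.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, List


@dataclass
class Server:
    """Hypothetical stand-in for a machine in a fleet."""
    name: str
    healthy: bool = True


def reconcile(fleet: List[Server], make_server: Callable[[], Server]) -> List[Server]:
    """Return a fleet in which every unhealthy server has been replaced."""
    result = []
    for server in fleet:
        if server.healthy:
            result.append(server)
        else:
            # Assumption: a crashed server is discarded and a fresh one takes its slot.
            result.append(make_server())
    return result


_ids = count(1)


def make_replacement() -> Server:
    """Simulate provisioning by returning a new healthy Server object."""
    return Server(name=f"replacement-{next(_ids)}")


fleet = [Server("web-1"), Server("web-2", healthy=False), Server("web-3")]
fleet = reconcile(fleet, make_replacement)
print([s.name for s in fleet])  # ['web-1', 'replacement-1', 'web-3']
```

In a real system a step like this would typically run on a schedule or in response to health-check failures, with the provisioning call going to whatever infrastructure API the organization uses.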