© 2021 Strange Loop
With a microservice architecture, one request can go through hundreds of network hops. No single engineer can know the entire path of a request and all the services it went through. How can engineers infer how the system behaves? Metrics? Logging? These tools have their place, but neither inherently constructs the journey of an entire request. What if we want to optimize the overall request latency? Or figure out how many additional hops the system will make when we add a new API call? I am here to talk about how distributed tracing tells a story about your system. I will go over how you can see the entire picture of what your system looks like and, with this data, investigate and triage systemic issues and make impactful, data-driven performance optimizations. I will go over what tracing does well and what it isn't meant for. I will also go over how we went about tracing at Lyft and the lessons we learned from our adoption process.
Lita is a senior software engineer at Lyft. She works on the Networking team, building out the tracing infrastructure and client network visibility at Lyft. Before that, she worked on building out the API infrastructure using Protocol Buffers, creating systems that would generate code and bring type safety to Lyft's polyglot microservice architecture. She has also written application software, decomposing SMS/phone logic out of the Lyft monolith into a microservice and adding features to Lyft Line.