BOSTON – Etsy is one of the Web’s biggest marketplaces. Its developers may be one of the Web’s busiest teams.
The online store for vintage and homemade goods proudly pushes code to production upwards of 50 times a day. And, according to Kenneth Lee, senior product security engineer, its developers do so with confidence they’re not going to break the site.
Lee explained during a talk Tuesday afternoon at Source Boston how Etsy has embraced a number of DevOps principles, in particular the marriage of development and monitoring processes, in order to push bug fixes, patches and feature enhancements.
Etsy relies on what it calls Feature Flags, code wrappers that allow security engineers to easily find particular functionality in the code tree, fix it if necessary, and roll it out incrementally to specific segments of Etsy users while determining how it will impact site availability and performance.
“We use them in development, QA and production,” Lee said. “Having code that uses feature flags gives you the ability, from an application security perspective, to easily find where interesting code is being utilized. When new functionality is ramped up to the website and we need to find it, it takes five seconds of grepping to find where it’s being used.”
Particular changes can be rolled out slowly and to certain users, such as to only one percent or 10 percent of buyers or sellers. Adding Feature Flags to old, legacy code also gives security engineers the ability to add logging tags that were previously left off.
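To make the percentage-based rollout concrete, here is a minimal sketch in Python. It is not Etsy's implementation; the flag name, the `FLAGS` table and the hashing scheme are all assumptions chosen for illustration. It shows one common way to expose a change to a stable slice of users.

```python
import hashlib

# Hypothetical flag table: flag name -> percentage of users who get the change.
FLAGS = {"new_checkout_fix": 10}

def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100

def is_enabled(flag: str, user_id: str) -> bool:
    """True if the user falls inside the flag's current rollout percentage.

    Unknown flags default to 0 percent, so they are off for everyone.
    """
    return bucket(flag, user_id) < FLAGS.get(flag, 0)
```

Because each user's bucket is derived from a hash rather than a random draw, a user who sees the change at 10 percent still sees it when the flag is raised to 50 percent, and searching the code for the flag name finds every place the change is used.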
“You need to be on top of your logging game to take advantage of Feature Flags,” Lee said. “With old features with no logging in place, when you have to write a fix, you can add logging lines so you’ll have that awareness for future alerting and logging purposes.
“We always deploy with confidence,” Lee said. “With Feature Flags, we’re never forced into a scenario where it’s all or nothing when pushing out a security fix. Feature Flags give you the flexibility to make a decision of whether to ramp it up to five percent or 50 percent of users to see if anything breaks.”
The team also wrote a Web-based tool for its developers called Supergrep, which calls out any lines that could be anomalous as they’re logged. Developers can see these unusual log patterns pop up as changes are made.
“Supergrep gives developers context. By having context, developers can filter out noise in things you expect to see in logs that’s OK versus what’s not OK,” Lee said.
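The idea of filtering expected noise out of a log stream can be sketched in a few lines of Python. This is not Supergrep's code; the patterns and the function name are hypothetical, and a real tool would read from live log files rather than a list.

```python
import re

# Hypothetical patterns for log lines the team expects and considers harmless.
EXPECTED = [re.compile(r"\bINFO\b"), re.compile(r"healthcheck")]

def unusual_lines(lines):
    """Yield only the log lines that match none of the expected patterns."""
    for line in lines:
        if not any(pattern.search(line) for pattern in EXPECTED):
            yield line
```

Anything that survives the filter is a candidate for a developer's attention. The list of expected patterns is where the context Lee describes would be encoded.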
The ability to keep evaluating a patch as it is rolled out incrementally is crucial because it also helps with deployments of high-priority patches. For example, Lee said, a vulnerability may be rated severe, but if it has not been exploited, there’s time for additional evaluation of logs to determine whether any activity on the network is taking advantage of it.
“It’s a powerful thing to say we can fix it today or wait until Monday at 9 a.m.,” Lee said. “If we write a patch, with Feature Flags, we can push out code and that doesn’t mean it’s on. By having a slow ramp up approach, you get the best of both worlds and ramp up slowly so you don’t take down the whole site.”
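The slow ramp-up Lee describes could be driven by a small policy like the one sketched below. The ramp schedule, the error-rate threshold and the function name are assumptions for illustration, not details from the talk.

```python
# Hypothetical ramp schedule, in percent of users. 0 means the code is
# deployed but switched off.
RAMP_STEPS = [0, 1, 5, 10, 50, 100]

def next_percentage(current: int, error_rate: float,
                    max_error_rate: float = 0.01) -> int:
    """Return the next rollout percentage for a flag.

    If the observed error rate exceeds the threshold, drop back to 0 so the
    change is turned off without a new deployment. Otherwise move to the
    next step in the schedule, staying at the top once it is reached.
    """
    if error_rate > max_error_rate:
        return 0
    higher = [step for step in RAMP_STEPS if step > current]
    return higher[0] if higher else current
```

Starting at 0 reflects the point that pushing code does not mean it is on. Each later step exposes more users only after the previous step has run without the error rate crossing the threshold.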