Inside Engineering: Adapting Agile for Data Science R&D Teams

7 min read

February 6, 2019

As we created and continue to improve Invoca’s award-winning Signal AI conversation analytics solution, we found that we had to adapt our previous agile scrum approach to accommodate a new kind of R&D so we could deploy and scale new AI capabilities. In this three-part blog series, we’ll show you how we adapted scrumban-based agile processes for data science R&D, how to implement and expand data science pipelines, and how to hire smarter for data science teams. Over the next three weeks, you’ll see how we accomplished this at Invoca and learn some tactics that you can apply to your data science organization.

Finding the Right Agile Structure

Data science requires a different approach to agile than a standard-issue implementation because the process has pockets of high risk of failure: you’re working on groundbreaking features your customers did not even realize were possible. This is quite different from the usual case, where customers request a feature and product managers scope it; because that feature is usually a short step from the current capability, the risk is relatively low.

As our team has grown, we’ve tried numerous variations on the agile software team structure and process. We’ve used sprints, cycles, and other release strategies, but once we hit the point where daily code deploys became the norm, we found much of that structure was unnecessary. Now, our scrum teams primarily take on epics, which contain a set of stories, the sum of which encapsulates a shippable piece of either definable customer value or essential platform improvements. This allows us to focus more on the health and cadence of each epic and less on the artificial rhythm of the agile process.

How Invoca Does Agile for Data Science

On the implementation team, we have daily 15-minute stand-up meetings to track progress and ascertain whether an epic is on track. The essential information in this stand-up is each team member’s status for yesterday and today, whether they are blocked on something, and whether anyone needs help.

For the data science team, it’s a little different. We use 15-minute stand-ups for data science as well, and we cover the same essentials. However, the JIRA boards (which we use for tracking project “tickets”) are quite different in structure:

Compare this, for the implementation team I manage:

To what we use for the data science R&D team:

Our research team performs a combination of blue-sky research and customer-request-driven research. Typically, the customer-request-driven research is shorter in scope, and those tickets move through the first four columns on our scrumban board, then move to closed when the results are communicated back to the customer.

Blue-sky research goes through the first four stages as well, but when it hits “results ready,” it’s time for a team discussion with the product manager about whether the results, resources, and reliability of the new capability warrant prepping it for release into production. If it’s a green light, then we have a fast and efficient pipeline for doing so. (We’ll cover pipelines in the next post.)
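To make the two ticket paths concrete, here is a minimal Python sketch of the flow described above. It is an illustration only, not our actual JIRA configuration: the column names other than “results ready” and “closed” are hypothetical placeholders, as is the `prep_for_production` stage name, and the choice to close a blue-sky ticket that does not get a green light is an assumption.

```python
from enum import Enum


class Kind(Enum):
    CUSTOMER_REQUEST = "customer_request"
    BLUE_SKY = "blue_sky"


# Hypothetical column names: only "results ready" and "closed" appear in the post.
RESEARCH_COLUMNS = ["backlog", "in_progress", "in_review", "results_ready"]


def next_column(kind, column, green_light=None):
    """Return the column a research ticket moves to next.

    Both kinds of ticket walk through the four research columns in order.
    They diverge only once they reach "results_ready".
    """
    if column not in RESEARCH_COLUMNS:
        raise ValueError(f"unknown or terminal column: {column!r}")

    idx = RESEARCH_COLUMNS.index(column)
    if idx < len(RESEARCH_COLUMNS) - 1:
        return RESEARCH_COLUMNS[idx + 1]

    # The ticket is at "results_ready".
    if kind is Kind.CUSTOMER_REQUEST:
        # Results have been communicated back to the customer.
        return "closed"

    # Blue-sky work needs the team/product-manager decision first.
    if green_light is None:
        raise ValueError("blue-sky tickets need a team decision at results_ready")
    # Assumption: work that is not green-lit is simply closed.
    return "prep_for_production" if green_light else "closed"
```

For example, `next_column(Kind.BLUE_SKY, "results_ready", green_light=True)` returns `"prep_for_production"`, while a customer-request ticket in the same column goes straight to `"closed"`.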

R&D is at its best and most fulfilling when it gets turned into usable product. Of course, some R&D efforts will not get productized, but ideally that rate can be minimized. We also strive to create a work environment on our teams where folks are excited to come into work every day. I try to minimize the amount of time spent “eating sand,” which I find to be an effective catch-all phrase for activities that impede developer or data scientist happiness. The question, then, is how to minimize sand consumption while maximizing output and innovation.

To accomplish those noble goals, we decided on the following team structure and agile process:

Teams share a product manager. Sharing a product manager ensures that the distance between our customers’ needs and our data science team’s research is as short as possible.
Research is directed into areas where our customers have asked for new capabilities. Customer feedback continues to be utilized during prototyping within R&D.
Data scientists prototype in Python, using a hand-off process with the developer team to get the prototype Python code ready for production, and into it, efficiently.
Because the two teams share both a product manager and an engineering manager, their priorities are easily aligned, and the loop between ideas, prototype, customer feedback, and implementation stays fast and seamless.
We also created an official data science R&D scrum team, and it works in a “tight loop” process with a developer-driven implementation scrum team.

The Three Keys to Winning with Agile in Data Science

Weekly Reviews: Hold a weekly data science review with all teams, where members of the data science team present the results of what they’re working on. While the manager still gets daily updates, the weekly review meeting is perfect for making sure there’s not too much drift between the concepts the R&D team is working on and the scope of awareness the dev team is working with.

Take a Deeper Dive: There’s no time to get off topic in stand-ups, but you still need time to talk, reflect, and share with the team. To accomplish this, hold a weekly data science deep dive just for the R&D team, where you can get into complex concepts and discoveries. For instance, we may choose to drill in on the characteristics of a particular distribution type such as Laplace, Power Law, or Dirichlet, and why one of them might be the preferred choice as a component in a hierarchical or ensemble Bayesian model. Conversely, we may dive into the particulars of a specific neural network type and the vagaries of its activation functions. The weekly cross-team review audience doesn’t need quite this level of depth, but for the growth of the R&D team members, both for mentoring and learning, it’s essential.
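As a flavor of where a deep dive on distribution choice might start, here is a small Python sketch comparing a Laplace distribution with a Gaussian of the same variance. It is an illustration, not material from one of our sessions; the function names and the matched-variance setup are assumptions chosen for the example. The point it shows is that the Laplace puts far more density on extreme values, which is one reason it can be attractive when a model component has to tolerate outliers.

```python
import math


def normal_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def laplace_logpdf(x, mu=0.0, b=1.0):
    """Log-density of a Laplace distribution with location mu and scale b."""
    return -abs(x - mu) / b - math.log(2.0 * b)


def matched_laplace_scale(sigma):
    """Laplace scale b with the same variance as a Gaussian (variance = 2 * b**2)."""
    return sigma / math.sqrt(2.0)


if __name__ == "__main__":
    sigma = 1.0
    b = matched_laplace_scale(sigma)
    # Compare the two log-densities near the center and far out in the tail.
    for x in (0.0, 1.0, 3.0, 5.0):
        print(x, normal_logpdf(x, sigma=sigma), laplace_logpdf(x, b=b))
```

Near the center the two are comparable, but the Gaussian log-density falls off quadratically while the Laplace falls off only linearly, so by five standard deviations the gap is large.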

Set the Tone: This is a little more subtle, but possibly the most important concept in this post. If you want innovation and creativity to thrive, you must create the right environment. And that includes making space to fail, or to allow a team member to clear their plate for a week straight and just focus on a single topic. Fostering and demonstrating flexibility, openness, and curiosity as a manager enables your team to do amazing things.

An adjusted agile R&D process combined with excellent hiring practices, well-conceived data access, and an efficient data science code release pipeline can put your team on the fast track to delivering innovative machine-learning-powered software. Stay tuned for more on the last two points!

Like what you see? Check out our career openings!
