Failure Notifications
The following are recommended approaches for notifying internal and external stakeholders of failures.
Considerations for Failure Notifications
It’s important to ensure stakeholders are properly notified of failure events. There are some foundational best practices, but who receives which notifications, and when, depends on several considerations. These are described below.
Customer Experience Design
The first and most important factor in determining who is notified of which failures is the customer experience that you intend to provide.
This includes factors like:
- How white-glove is your customer service model?
- How technical is your average customer? How much do you expose technical concerns to them?
- How customized is each customer? Is customization done through basic configuration or via a process more like software development?
- What role does your support team play when interacting with a customer? What about your implementation/onboarding team?
As a general rule, simpler, lower-priced, and/or more self-service SaaS offerings should err on the side of alerting customers to failures in simple, aggregated terms.
These customers are likely neither interested in nor capable of working through the particular details of what might be a complex integration on the backend. This customer cohort likely shares identical or very minimally configured integrations, which makes it practical to simplify their notifications by making them universal.
In some cases, the right answer may be that this cohort of customers never receives failure notifications regarding an integration and all notifications are kept internal. This would mean the operations team is responsible for finding and addressing issues, while communication to the customer is manual, likely through a support or account management channel.
If the customer experience expectation is that customers implement highly configured or bespoke integrations, these rules apply less globally. Expectations should be set in the software license agreements, EULAs, and other contractual agreements with the customer. These may define a similar white-glove approach, despite the complex integration. They may push some or all responsibility or ownership to the customer.
In the latter case, the customer must have more visibility into integration failures so they can act on them. In both cases, failure notifications are probably at least somewhat unique per customer implementation.
Defining your desired customer experience is also important for designing internal alerting and corresponding escalation processes.
What information will the people who interact with customers require? How much interaction do customers expect? Are those people expected to support customers proactively or only reactively?
Ensuring that internal stakeholders receive the right failure notifications is important when supporting these objectives. (It’s likely internal stakeholders should be provided more detailed information than external stakeholders.)
Contractual Obligations and SLAs
Consider the contractual obligations you have with your customers, including service level agreements. It’s likely that your notification strategy will be a key requirement for executing on those obligations.
For software companies that aren’t integration companies at their core, contracts may not be specific about obligations to the customer as they relate to integrations. In those cases, the contractual obligations must be interpreted more broadly to understand how they apply to integrations.
In these cases, it’s best to lean toward more failure notifications, and more training on those notifications, for the internal stakeholders who interact most closely with customers. Likewise, it’s best to lean toward sending customers only minimal, aggregate-level notifications directly, which helps set proper expectations for what might be unspoken obligations.
These scenarios vary from team to team and product to product, so consider them guiding principles more than a prescriptive approach.
Operations Team Organization and Processes
You should also consider how your teams are organized around building products and supporting customers on those products.
Is your operations team separate from your development team? Does your support and/or operations team have a tiered structure for escalating issues to senior engineers? What level of triage do you expect customers or partners to perform before escalation?
These factors contribute to deciding who should receive what notification. For stakeholders with direct customer contact, this overlaps with the requirements for your desired customer experience.
Notifying Internal Stakeholders
The following principles describe how notifying internal stakeholders of failures differs from notifying external stakeholders.
- In general, internal stakeholders require as much information as possible about a failure and the format in which that information gets presented is less important.
- Internal stakeholders can be provided details that assume or communicate internal knowledge of your product architecture.
- Internal stakeholders must absolutely be concerned about infrastructure and application failures.
- The extent to which internal stakeholders are concerned about endpoint or data failures depends on obligations to the customer.
- For an internal view of the health of the integration portfolio, internal stakeholders need both broad, aggregated information and details of specific failures at specific data points.
- Internal stakeholders are often notified by ticketing workflows and other processes that are adjacent to any monitoring that is occurring.
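
As a rough illustration of the principles above, the following minimal sketch decides whether a failure should be raised to internal stakeholders. Every name in it (FailureKind, Failure, should_notify_internal, and the set of customers with contractual obligations) is a hypothetical assumption made for this example and not part of any particular product or monitoring tool.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    """Hypothetical failure categories, named after the ones discussed above."""
    INFRASTRUCTURE = "infrastructure"
    APPLICATION = "application"
    ENDPOINT = "endpoint"
    DATA = "data"


@dataclass(frozen=True)
class Failure:
    kind: FailureKind
    customer_id: str
    detail: str  # may include internal architecture details


# Kinds that internal stakeholders should always hear about.
ALWAYS_INTERNAL = {FailureKind.INFRASTRUCTURE, FailureKind.APPLICATION}


def should_notify_internal(failure: Failure, obligated_customers: set) -> bool:
    """Return True if internal stakeholders should be notified.

    Assumption: endpoint and data failures only matter internally when
    there is an obligation to the affected customer.
    """
    if failure.kind in ALWAYS_INTERNAL:
        return True
    return failure.customer_id in obligated_customers
```

In practice, the result of a check like this would probably feed a ticketing workflow rather than send a message directly, and the obligation lookup would come from wherever contract terms are tracked.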
Notifying External Stakeholders
The following principles describe how notifying external stakeholders of failures differs from notifying internal stakeholders.
- External stakeholders must be notified of integration failures to a degree and through a channel that matches their customer experience. That could include a high level of detail or very little detail. It could all happen behind the scenes.
- External stakeholders aren’t primarily concerned with the health of an integration. They are concerned with how its health impacts their business operations.
- You should be cautious of exposing too much information about the internal architecture of your application or integration framework when notifying external stakeholders.
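
One way to picture the principles above is a function that builds the customer-facing message from a failure record. This is only a sketch under stated assumptions: the IntegrationFailure fields, the three detail levels ("none", "summary", "detailed"), and the message wording are all hypothetical and would be driven by your own customer experience design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntegrationFailure:
    integration_name: str
    business_impact: str   # phrased in terms of the customer's operations
    internal_detail: str   # hosts, stack traces, etc.; never sent externally


def external_message(failure: IntegrationFailure, detail_level: str) -> Optional[str]:
    """Build the customer-facing text, or None if nothing should be sent.

    detail_level is a hypothetical per-customer setting:
    "none", "summary", or "detailed".
    """
    if detail_level == "none":
        # Everything happens behind the scenes; only internal teams are told.
        return None
    if detail_level == "summary":
        return (
            f"We detected an issue with your {failure.integration_name} "
            "integration and are working to resolve it."
        )
    if detail_level == "detailed":
        return (
            f"Your {failure.integration_name} integration encountered a "
            f"problem. Impact: {failure.business_impact}"
        )
    raise ValueError(f"Unknown detail level: {detail_level!r}")
```

Note that internal_detail is never read when building the message, so architecture details cannot leak into external text even at the most detailed level.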