AWS has added the LagStatus metric to its Outposts offering to improve network observability for both first-generation and second-generation Outposts racks. The metric lets users monitor the operational status of link aggregation groups (LAGs), which are critical for maintaining connectivity between AWS infrastructure and on-premises networks.
Understanding Link Aggregation
Link aggregation combines multiple physical Ethernet connections into a single logical link, known as a LAG. This increases the available bandwidth and provides redundancy: if one member link fails, traffic continues over the remaining links. In AWS Outposts, LAG connections are established between Outpost network devices (ONDs) and customer network devices (CNDs).
New Metric for Enhanced Visibility
The LagStatus metric, now available in Amazon CloudWatch, reports the operational status of each LAG connection as either up (1) or down (0). Its dimensions include OutpostId and LagId, so a non-operational LAG can be traced quickly to a specific Outpost and LAG. This makes the metric a direct signal of the health of hybrid infrastructure connectivity.
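As a minimal sketch of how the up (1) / down (0) values could drive an alarm, the function below builds the arguments for a CloudWatch `PutMetricAlarm` request that fires when LagStatus drops below 1. The metric name and dimensions come from the description above; the `AWS/Outposts` namespace (the one used by existing Outposts metrics), the alarm naming scheme, the period, and the missing-data handling are assumptions to adjust for your environment.

```python
def lag_status_alarm_params(outpost_id: str, lag_id: str) -> dict:
    """Build keyword arguments for a CloudWatch PutMetricAlarm request
    that goes into ALARM when the LAG reports down (0)."""
    return {
        # Hypothetical naming scheme, chosen only for this example.
        "AlarmName": f"outposts-lag-down-{outpost_id}-{lag_id}",
        # Assumption: LagStatus is published alongside other Outposts metrics.
        "Namespace": "AWS/Outposts",
        "MetricName": "LagStatus",
        "Dimensions": [
            {"Name": "OutpostId", "Value": outpost_id},
            {"Name": "LagId", "Value": lag_id},
        ],
        # Minimum catches any down (0) datapoint within the period.
        "Statistic": "Minimum",
        "Period": 300,  # seconds; an assumed evaluation window
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "LessThanThreshold",
        # Assumption: treat missing data as a problem rather than as healthy.
        "TreatMissingData": "breaching",
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. The Outpost and LAG identifiers must be the real values reported in your account.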
Combining Metrics for Actionable Insights
LagStatus reports on the LAG itself, so it is most effective when used alongside the existing VifConnectionStatus and VifBgpSessionState metrics, which report on the virtual interfaces (VIFs) and their BGP sessions. Reading the three together narrows down where a connectivity problem lies and speeds up troubleshooting.
Setting Up Monitoring and Alerts
To monitor these metrics together, users can create CloudWatch composite alarms. A composite alarm combines the states of individual metric alarms, so an alarm is first defined for each metric and then referenced in the composite alarm's rule. Both kinds of alarm can be configured via the console, the CLI, or AWS CloudFormation. Following best practices, IAM permissions should be restricted to the minimum actions needed to create the alarms.
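The setup above can be sketched as follows, under stated assumptions: one helper builds a composite alarm rule from the names of existing metric alarms, another builds the arguments for a `PutCompositeAlarm` request, and a policy document shows what a minimal IAM grant for alarm creation might look like. The alarm names, the `outposts-` name prefix in the resource ARN, and the choice of `OR` as the combining operator are illustrative, not taken from the AWS announcement.

```python
def composite_rule(alarm_names: list[str]) -> str:
    """Join child alarms into a rule that fires if ANY of them is in ALARM."""
    if not alarm_names:
        raise ValueError("at least one child alarm name is required")
    # Names containing double quotes would need escaping; not handled here.
    return " OR ".join(f'ALARM("{name}")' for name in alarm_names)


def composite_alarm_params(name: str, child_alarm_names: list[str]) -> dict:
    """Build keyword arguments for a CloudWatch PutCompositeAlarm request."""
    return {
        "AlarmName": name,
        "AlarmRule": composite_rule(child_alarm_names),
        "ActionsEnabled": True,
    }


# A least-privilege sketch: only the two actions needed to create the alarms.
# The resource pattern assumes every alarm name starts with "outposts-".
ALARM_CREATION_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricAlarm",
                "cloudwatch:PutCompositeAlarm",
            ],
            "Resource": "arn:aws:cloudwatch:*:*:alarm:outposts-*",
        }
    ],
}
```

For example, `composite_alarm_params("outposts-connectivity", ["outposts-lag-down", "outposts-bgp-down"])` yields a rule that alarms when either child alarm does. Using `AND` instead would alarm only when both fail at once, which suits a different alerting goal.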
Connectivity Scenarios and Troubleshooting
The following table outlines potential connectivity issues and how they can be identified using the new LagStatus metric along with existing metrics:
| Issue | Identifying metrics |
|---|---|
| LAG is UP, but VIF BGP status is DOWN | LagStatus, VifBgpSessionState |
| LAG is DOWN | LagStatus |
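The two rows of the table can be expressed as a small decision function. This is an illustrative sketch only: it takes the LagStatus value (1 or 0) and a boolean for whether the BGP session is up, and leaves the mapping from raw VifBgpSessionState values to that boolean to the caller, since the value encoding is not given here.

```python
def classify_connectivity(lag_status: int, bgp_session_up: bool) -> str:
    """Map metric readings to the scenarios listed in the table."""
    if lag_status not in (0, 1):
        raise ValueError("LagStatus is expected to be 0 (down) or 1 (up)")
    if lag_status == 0:
        # With the LAG down, the state of anything running over it is secondary.
        return "LAG is DOWN"
    if not bgp_session_up:
        return "LAG is UP, but VIF BGP status is DOWN"
    return "No issue indicated by these metrics"
```

Checking the LAG first reflects the layering: a BGP session cannot stay up over a LAG that is down, so the LAG is the place to start when both look unhealthy.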
Conclusion and Next Steps
The LagStatus metric improves the observability of AWS Outposts racks by showing directly whether each LAG connection is operational. It is available at no additional cost in all commercial AWS Regions and the AWS GovCloud (US) Regions where Outposts racks are deployed.
For further details on Outposts rack networking patterns, refer to the Networking section of the Outposts High Availability Design and Architecture Considerations whitepaper. Users are encouraged to reach out to their AWS account team for more information on enhancing observability for Outposts.