Infrastructure attacks (Resource, State, and Network attacks) are at the core of Gremlin’s functionality. These attacks stress your infrastructure, exposing the application weaknesses and bugs that lead to incidents, outages, and a poor user experience. We’ve recently made upgrades to our CPU, disk, and memory attacks to provide more configurability, improve reliability, and enhance ease of use. In this post, we’ll cover the improvements along with some use cases for these Gremlin attacks, enabling you to ship more reliable code.

As always, before running an attack, select the minimum number of hosts or containers and options necessary for you to learn without causing harm. It’s important to carefully consider the blast radius of any attack you’re running.

With Gremlin’s most recent release, we’ve added the ability to easily impact all of the available cores on your targets at once when using the CPU attack. The previous implementation required you to know or guess how many cores you have for each target. To better emulate real-world usage, we’ve also added a feature to specify the percent of CPU capacity you would like to utilize per host.

CPU Use Case 1: Verify Your Monitoring and Alerting

An important part of running and maintaining reliable infrastructure is ensuring teams are aware when something is out of the ordinary. Observability into the state of your infrastructure, and being alerted when it’s outside of normal operations, are basic requirements in maintaining your Service Level Objectives (SLOs). To test your monitoring and alerting functionality and policies, run CPU attacks to stress your infrastructure and services. Observe the impact in your monitoring dashboards, and confirm that alerts arrive from your paging platforms as expected.

Set the CPU % utilization to a level above your alerting threshold, and set the attack length long enough to trigger the alert. Monitor the attack on your dashboard and ensure an alert is triggered.

Huge spikes in traffic from link aggregation sites or holiday shopping can lead to your website or service being hugged to death. Autoscaling helps to combat this by recognizing high CPU load on your service and spinning up extra instances to handle the spikes in traffic. The scaling happens automatically based on policies you set, and if the load drops, it can save you money by shutting down the instances you no longer need. This scaling up and down leads to a seamless experience for your engineers and end users.

To verify that autoscaling works, create a CPU attack using a CPU utilization percentage and attack duration above your autoscaling policy thresholds.
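
The threshold advice above, for both alerting and autoscaling, can be sketched as a small pre-flight check you run before launching an attack. This is a minimal illustration and not part of Gremlin’s API: `ThresholdPolicy`, `CpuAttackPlan`, `should_trigger`, and the `metric_delay_seconds` parameter are hypothetical names, and the model assumes the policy fires once CPU utilization stays above a percentage for a sustained period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdPolicy:
    """Hypothetical alerting or autoscaling rule (not a Gremlin object)."""
    cpu_percent: float      # rule fires when utilization stays above this level
    sustained_seconds: int  # how long utilization must stay above it


@dataclass(frozen=True)
class CpuAttackPlan:
    """Hypothetical attack settings: target CPU percent and attack length."""
    cpu_percent: float
    length_seconds: int


def should_trigger(plan: CpuAttackPlan, policy: ThresholdPolicy,
                   metric_delay_seconds: int = 0) -> bool:
    """Return True if the planned attack is expected to trip the policy.

    Assumes the attack alone drives utilization to plan.cpu_percent and that
    metrics may lag by metric_delay_seconds before the rule can evaluate them.
    """
    above_threshold = plan.cpu_percent > policy.cpu_percent
    required_seconds = policy.sustained_seconds + metric_delay_seconds
    long_enough = plan.length_seconds >= required_seconds
    return above_threshold and long_enough


if __name__ == "__main__":
    # Example rule: fire when CPU stays above 80% for 5 minutes.
    alert = ThresholdPolicy(cpu_percent=80, sustained_seconds=300)

    too_short = CpuAttackPlan(cpu_percent=90, length_seconds=120)
    sufficient = CpuAttackPlan(cpu_percent=90, length_seconds=360)

    print(should_trigger(too_short, alert, metric_delay_seconds=60))
    print(should_trigger(sufficient, alert, metric_delay_seconds=60))
```

If the check returns `False`, raise the utilization percentage or lengthen the attack before running it. Otherwise a missing alert or scale-out event tells you nothing about whether the policy itself works.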