AppDynamics is an application performance management solution that helps solve some of the biggest challenges with distributed or multi-tier applications - application monitoring, troubleshooting, and performance analysis. Having spent a fair bit of time as a field engineer with Microsoft, I am well aware of some of these challenges. Oftentimes you get to a customer site and the culprit turns out to be a missing database index that has been slowing everything down. Other times you might notice that problems within the network are slowing entire suites of applications. In other cases, a simple code change can introduce a severe performance problem, and finding the source can be a daunting challenge when the application is distributed and you don't know where to start.
Cloud apps on Windows Azure are usually Multi-Tiered Apps
Windows Azure makes it easy to write cloud-hosted applications. By design, these applications tend to be highly distributed with multiple tiers. Troubleshooting poor performance means knowing exactly which node or tier is failing. Distributed applications do not have just one place where things can fail. Everything is suspect, including every single node on every tier - from the operating system, to middleware, to third-party products, hardware, network configuration, and your own software logic. Each node typically performs its own I/O, has its own Ethernet cards, and contains various running processes and threads.
As the complexity of applications grows, finding the slow pieces becomes increasingly difficult. It involves much more than just looking at log files for the OS or for the web server. Logging is only useful if engineers had the forethought to log the error condition.
Cloud computing typically adds a layer of abstraction to your applications. For starters, you don’t have physical access to the hardware. This prevents you from physically interacting with specific pieces of the architecture. It also limits your conceptual understanding of all the moving pieces - you will need to create the diagram yourself. Many cloud-based applications leverage third-party services, such as identity or caching. These pieces often get overlooked when troubleshooting, because they’re not a big part of an application’s code base.
Application Performance Monitoring (APM) tools step up to solve the problem by looking at the inner workings of your cloud-based applications. These tools can see the code executing, the entry and exit calls to the application, and the transactions flowing through and across multiple application components.
High Level Perspective is needed
What is really needed is a visual overview of the entire application. An application flow map allows you to understand all of the dependencies of a distributed application. The ability to drill down into connection points and individual nodes is critical. Oftentimes there are customers that are being affected, so speedy resolution is crucial. Having a bird’s eye view of all the moving pieces is what leads to efficient troubleshooting and faster resolution of production problems.
A better approach is to actually anticipate problems before they occur. That is, the goal should be to move toward a proactive monitoring approach, as opposed to a reactive one. This means finding and fixing performance problems before they become critical. A great place to start is building an application flow map, which shows all the pieces in the application and how the data travels among those pieces. It also means building a baseline of expected performance given specific load thresholds, and then monitoring the performance over time. This includes measuring browser response time, which is ultimately the way your customers will evaluate the performance of your cloud-based application. If performance objectives are not met, it is important for network administrators, developers, and business stakeholders to get some numbers and facts early on and head off any future problems. AppDynamics can raise alerts whenever performance deviates from the baseline.
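To make the idea of a baseline concrete, here is a minimal Python sketch of baseline-and-deviation checking. It is not how AppDynamics computes its baselines. The function names, the sample response times, and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def build_baseline(samples_ms):
    """Summarize historical response times (ms) as a mean and standard deviation."""
    return {"mean": mean(samples_ms), "stdev": stdev(samples_ms)}


def deviates(baseline, observed_ms, threshold=3.0):
    """Return True when an observation is more than `threshold` standard
    deviations above the baseline mean (hypothetical alerting rule)."""
    return observed_ms > baseline["mean"] + threshold * baseline["stdev"]


# Made-up browser response times collected under normal load.
history = [120, 130, 125, 118, 127, 122, 131, 124]
baseline = build_baseline(history)

for observed in (128, 210):
    status = "ALERT" if deviates(baseline, observed) else "ok"
    print(f"{observed} ms -> {status}")
```

A real monitoring tool would typically keep separate baselines per time of day and per load level, which is the "given specific load thresholds" part of the paragraph above.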
An Example - 3 Tiers (Web Front End, Service Bus, Background Process/Worker Role)
So what I did to get started was walk through one of the Azure labs. The lab that I wanted to test was this one:
The purpose of the lab is to set up a multi-tier application that leverages Windows Azure Service Bus queues. As most architects know, queues make it possible to architect decoupled applications. Queues are powerful because they allow client applications to submit messages at a high rate, one that may exceed the ability of the backend server to process them. As the queue size begins to grow, more Windows Azure worker roles can be added to increase scale and therefore process messages in a timely fashion. This is also a core building block for publish-subscribe patterns.
3 Tier Diagram
The way Figure 1 works is simple. Client applications connect to the web role and send messages. The web role takes these messages and then places them into the queue, which runs as a Windows Azure Service Bus application. The worker role can be thought of as a background process. It checks the queue in the service bus and reads messages from the queue. If the queue starts getting too large, then we need to scale the worker roles to be able to handle the volume of messages that are getting placed into the service bus queue.
Note: The Web Role acts as a proxy for other apps and also includes a web page front end. This means that the web role is really 2 tiers in this example.
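The scaling decision described above can be sketched as a small calculation. This is a rough illustration in Python, not an Azure autoscale API. The function name, the per-worker processing rate, the target drain time, and the worker limits are all hypothetical values.

```python
import math


def workers_needed(queue_length, per_worker_rate, target_drain_seconds,
                   min_workers=1, max_workers=20):
    """Estimate how many worker roles are needed to drain the queue in time.

    queue_length: messages currently waiting in the queue
    per_worker_rate: messages one worker processes per second (assumed constant)
    target_drain_seconds: how quickly we want the backlog cleared
    """
    capacity_per_worker = per_worker_rate * target_drain_seconds
    required = math.ceil(queue_length / capacity_per_worker)
    # Clamp to the allowed range so we never scale to zero or without bound.
    return max(min_workers, min(max_workers, required))


# Hypothetical numbers: each worker handles 50 messages/second and
# we want any backlog cleared within 60 seconds.
for backlog in (0, 12_000, 1_000_000):
    print(backlog, "messages ->", workers_needed(backlog, 50, 60), "workers")
```

The clamp matters in practice: an empty queue should still leave one worker polling, and an upper bound protects against a runaway bill when the backlog spikes.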
Figure 1 - Conceptual Diagram of Multi-Tiered Application
Where is the bottleneck?
In the diagram above, notice that the web role can act as a proxy for client applications that wish to submit a message to the worker roles. However, if this application starts to perform badly, troubleshooting can be a challenge. Identifying the performance bottleneck can be tricky. For example, it might be happening as the web role submits messages to the service bus. Or it could be that the worker role cannot read quickly enough from the service bus due to the throughput and size of the messages being submitted. Moreover, imagine that a database is involved, further obscuring the exact source of errant application behavior.
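Without an APM tool, the manual way to answer this question is to time each hop yourself. The Python sketch below shows the idea under simplifying assumptions: the tier names are made up, and the calls to time.sleep stand in for the real work of enqueuing, dequeuing, and writing to a database.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock time per tier, in seconds.
timings = {}


@contextmanager
def timed(tier):
    """Record how long the wrapped block takes and add it to `timings`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[tier] = timings.get(tier, 0.0) + time.perf_counter() - start


def slowest_tier(recorded):
    """Return the tier name with the largest accumulated time."""
    return max(recorded, key=recorded.get)


# Simulated work for each hop (placeholders for real calls).
with timed("web_role_enqueue"):
    time.sleep(0.01)
with timed("worker_role_dequeue"):
    time.sleep(0.05)
with timed("database_write"):
    time.sleep(0.02)

for tier, seconds in timings.items():
    print(f"{tier}: {seconds * 1000:.1f} ms")
print("slowest:", slowest_tier(timings))
```

The drawback is the same one mentioned earlier for logging: this only works for the hops someone had the forethought to instrument.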
The ideal world
In the ideal world, a developer would be able to publish their application to the cloud and see each of these tiers as a node in a diagram on a web portal. Well, that’s exactly what AppDynamics lets you do.
AppDynamics automatically provides a useful diagram
Notice that the view provided by AppDynamics (Figure 2) is almost identical to the conceptual diagram. The beauty of all this is that it is done automatically by AppDynamics when you publish your application. As your application grows or scales, AppDynamics will maintain the diagram for you. AppDynamics will also allow you to automatically capture benchmarks to be used for future diagnostics.
The view in Figure 2 is what can be seen at the AppDynamics portal. This is not a diagram that a developer or architect constructed. This is a diagram built automatically by AppDynamics using the agents and framework elements that are part of AppDynamics. When you deploy an Azure Project, this MSI executes: dotNetAgentSetup64.msi.
Figure 2 - AppDynamics View - Automatically Generated