Archive and Analysis with Amazon S3 and Glacier: Introduction
Logging is an essential part of any system. It lets you understand
what is going on in your system and serves as a vital source for
debugging. Many systems use logging primarily to let developers debug
issues in the production environment. But there are systems where
logging becomes an essential component for understanding the following:
- User Behavior - understanding user behavior patterns, such as which areas of the system are being used by the user
- Feature Adoption - evaluating new feature adoption by tracking how a new feature is being used by the users. Do they vanish after a particular step in a particular flow? Do people from a specific geography use it during a specific time of the day?
- Click-through analysis - let's say you are placing relevant ads across different pages of your website. You would like to know how many users clicked them, their demographic breakdown and so on
- System performance
- Any abnormal behavior in certain areas in the system - a particular step in a workflow resulting in error/exception conditions
- Analyzing performance of different areas in the system - such as finding out whether a particular screen takes more time to load because of a long-running query. Should we optimize the database? Should we introduce a caching layer?
- Underperformance of the system - the system could be spending more resources on logging than on actively serving requests
- Huge log files - log files generally grow very fast, especially when inappropriate log levels are used, such as the "debug" level for all log statements
- Inadequate data - if the log contains only the developer's debug information, there is not much analysis that can be performed on it
- Local Storage - how do you efficiently store the log files on the local server without running out of disk space, especially when log files tend to grow?
- Central Log Storage - how do you centrally store log files so that they can be used later for analysis?
- Dynamic Server Environment - how do you make sure you collect and store all the log files in a dynamic server environment where servers are provisioned and de-provisioned on demand depending upon load?
- Multi source - how do you handle log files from different sources, such as your web servers, search servers, Content Distribution Network logs, etc.?
- Cost effective - as your application grows, so do your log files. How do you store the log files in the most cost-effective manner without burning a lot of cash?
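
To make the "huge log files" and "local storage" points above concrete, here is a minimal sketch in Python using the standard library's logging.handlers.RotatingFileHandler. It is only an illustration and not the setup used in this article: the logger name, the file name app.log, the tiny size limit and the sample message are all assumptions chosen so the effect is easy to see.

```python
import logging
import logging.handlers
import os
import tempfile


def build_logger(log_path, max_bytes=1024, backup_count=3, level=logging.INFO):
    """Return a logger that writes to log_path and caps local disk usage.

    All names and limits here are illustrative assumptions.
    """
    logger = logging.getLogger("app.rotation_demo")  # hypothetical logger name
    logger.setLevel(level)       # records below this level are dropped
    logger.propagate = False     # keep the demo output out of the root logger

    # Remove handlers left over from a previous call so lines are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()

    # Roll over before the file reaches max_bytes; keep at most backup_count
    # older files (app.log.1, app.log.2, ...), deleting the oldest one.
    handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def demo():
    """Write many records and return (directory, sorted file names)."""
    log_dir = tempfile.mkdtemp()
    logger = build_logger(os.path.join(log_dir, "app.log"))
    for i in range(200):
        logger.debug("cache lookup %d", i)  # filtered out at INFO level
        logger.info("user=%d action=checkout step=payment", i)
    for handler in logger.handlers:
        handler.flush()
    return log_dir, sorted(os.listdir(log_dir))


if __name__ == "__main__":
    directory, files = demo()
    print(directory, files)
```

With these assumed settings the directory never holds more than the active file plus three backups, and the "debug" records never reach disk. Rotation only bounds local disk usage, though; older records are discarded, which is exactly why central and cost-effective storage is still needed.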




