Logging in AppScale
Logging is one of those things that programmers take for granted. We rely on logging to bring important debugging and error info to the forefront, and that's no different in web applications. One particularly nice thing about working at the platform-as-a-service layer is that the platform controls your runtime, allowing it to automatically collect your logs and display them to you. In AppScale, our upcoming 1.7.0 release takes this a step further, letting you retrieve both application-level and system-level logs via either a web or a command-line interface. This post details how AppScale implements this support for arbitrary Google App Engine applications, what the user-facing interfaces look like, and how other services can leverage this support.
Let's start off by looking at how we can store and retrieve logs. Ideally, we need the following:
- A replicated database, with the ability to query based on the name of the service that stores logs or the hostname of the machine they came from.
- A caching layer in front of the database, to speed up repeated accesses to this data.
- A simple web programming framework, to enable users and applications to access the data.
- Authentication, to ensure that only the right users can see the data.
By no small coincidence, these are all things that the Google App Engine APIs offer! We can use NDB to store and retrieve our logs, which automatically leverages the Memcache API to speed up accessing our log data. Programming Python 2.7 apps on Google App Engine is a breeze with webapp2, and we can use the Google App Engine Users API if we need authentication when viewing logs. We thus use webapp2 to expose a single route in our AppDashboard, at "/logs/upload", that accepts POST requests to store log data. That handler performs the following:
def post(self):
  """ Saves log records to the Datastore for later viewing. """
  encoded_data = self.request.body
  data = json.loads(encoded_data)
  service_name = data['service_name']
  host = data['host']
  log_lines = data['logs']

  # First, check to see if this service has been registered.
  service = LoggedService.get_by_id(service_name)
  if service is None:
    service = LoggedService(id = service_name)
    service.hosts = [host]
    service.put()
  else:
    if host not in service.hosts:
      service.hosts.append(host)
      service.put()

  # Next, add in each log line as an AppLogLine
  for log_line_dict in log_lines:
    the_time = int(log_line_dict['timestamp'])
    reversed_time = (2**34 - the_time) * 1000000
    key_name = service_name + host + str(reversed_time)
    log_line = RequestLogLine.get_by_id(id = key_name)
    if not log_line:
      log_line = RequestLogLine(id = key_name)
      log_line.service_name = service_name
      log_line.host = host

    app_log_line = AppLogLine()
    app_log_line.message = log_line_dict['message']
    app_log_line.level = log_line_dict['level']
    app_log_line.timestamp = datetime.datetime.fromtimestamp(the_time)
    app_log_line.put()

    log_line.app_logs.append(app_log_line)
    log_line.put()
Here, our POST request expects a JSON-encoded body that contains the name of the service posting one or more logs (e.g., "guestbook", "appcontroller"), the hostname of the machine posting the logs, and the actual logs themselves. We then store these logs with a key that decreases as time increases, so that we can do a "SELECT *" query to get our logs sorted from newest to oldest (pretty handy when viewing logs). So that's how we store logs in the Datastore - let's look at how our modified AppServers send them to the AppDashboard.
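To see why the reversed key yields newest-first results from a plain ascending scan, here is a small standalone sketch of the key scheme used in the handler above (the host and timestamps are made up for illustration):

```python
# Sketch: why (2**34 - timestamp) * 1000000 yields newest-first key order.
# The key construction mirrors the /logs/upload handler above; the host
# and epoch timestamps are invented for illustration.

def log_key(service_name, host, timestamp):
    """Build a Datastore key name that sorts newest-first."""
    reversed_time = (2 ** 34 - int(timestamp)) * 1000000
    return service_name + host + str(reversed_time)

older = log_key('guestbook', '10.0.0.1', 1350000000)
newer = log_key('guestbook', '10.0.0.1', 1350000060)

# A newer timestamp produces a smaller reversed value, and since both
# reversed values have the same number of digits, the newer log's key
# also sorts first as a string in an ascending scan.
assert newer < older
```

The subtraction from 2**34 flips the sort order, and multiplying by one million leaves room to disambiguate sub-second entries if needed.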
The Python AppServer makes it pretty simple to send all the logs for a given request all at once. In logservice.py, we can see that all the logs for a request get batched up together, so we just send them off to our AppDashboard as follows:
def _flush(self):
  """Internal version of flush() with no locking."""
  logs = self.parse_logs()
  appid = os.environ['APPLICATION_ID']
  if appid in ['apichecker', 'appscaledashboard']:
    return

  formatted_logs = [{'timestamp' : log[0] / 1e6, 'level' : log[1],
    'message' : log[2]} for log in logs]

  payload = json.dumps({
    'service_name' : appid,
    'host' : self.get_my_public_ip(),
    'logs' : formatted_logs
  })

  conn = httplib.HTTPSConnection(self.get_login_ip() + ":443")
  headers = {'Content-Type' : 'application/json'}
  conn.request('POST', '/logs/upload', payload, headers)
  response = conn.getresponse()

  self._clear()
So here, we skip sending the logs if we're in the API Checker app (which gets started before the AppDashboard and thus can't send logs to it) or the AppDashboard itself (which would create an infinite loop if it tried sending logs to itself, because that process causes more logs to get generated). We then pull out the log level, message, and timestamp for each log line and package them up in the format described above. We ship that off to the AppDashboard and call it a day! This results in a single HTTPS connection per request, unless the buffer the logs are stored in gets too large (in which case it's one HTTPS connection per flush).
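For reference, the JSON body that _flush() ships (and that the /logs/upload handler decodes on the other end) looks roughly like this; the service name, host, and log values here are invented for illustration:

```python
import json

# A sketch of the JSON body _flush() POSTs to /logs/upload. The service
# name, host, and log contents are made up; the field names match the
# code above.
payload = json.dumps({
    'service_name': 'guestbook',
    'host': '192.168.10.5',
    'logs': [
        {'timestamp': 1350000000.5, 'level': 2,
         'message': 'Saved a new Greeting.'},
    ],
})

# The AppDashboard handler recovers the same structure with json.loads:
data = json.loads(payload)
assert data['service_name'] == 'guestbook'
assert data['logs'][0]['level'] == 2
```

Note that the timestamp travels as epoch seconds (the Python AppServer divides its microsecond timestamps by 1e6 before sending), and the handler truncates it back to whole seconds when building the key.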
Shipping off logs in the Java AppServer happens in much the same way as in the Python AppServer, with the main difference being that the Java AppServer doesn't buffer up all the log lines for a single web request. That means we have to send them off one at a time, which doesn't perform as nicely as the Python version, but is not the end of the world. The corresponding code lives in LocalLogService.java and looks quite similar.