Skip to content

Return an explicit 500 if there's an exception generating metrics#85

Closed
FauxFaux wants to merge 1 commit intoprometheus:masterfrom
FauxFaux:master
Closed

Return an explicit 500 if there's an exception generating metrics#85
FauxFaux wants to merge 1 commit intoprometheus:masterfrom
FauxFaux:master

Conversation

@FauxFaux
Copy link
Copy Markdown

The current behaviour observed in #84 is pretty bad. This implements the simplest possible fix: if there's any exception of any kind generating metrics, log it as we are now, but change the response code to a 500.

send_error is used, which dumps a random bit of html into the response, which doesn't seem like a problem. send_response(500) could be used instead, to get a 500 with a blank body.

class MetricsHandler(BaseHTTPRequestHandler):
def do_GET(self):
try:
content = generate_latest(core.REGISTRY)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest removing the try and letting the exception propagate up.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That causes an empty body to be returned, not a 500. I felt that a 500 was more useful than just closing the tcp connection (although I suspect Prometheus would handle the case in the same way).

     def do_GET(self):
-        try:
-            content = generate_latest(core.REGISTRY)
-        except:
-            logging.exception('error generating response')
-            self.send_error(500, 'error generating metrics')
-            self.end_headers()
-            return
-
+        content = generate_latest(core.REGISTRY)
         self.send_response(200)
% curl -v http://localhost:23456/
*   Trying ::1...
* connect to ::1 port 23456 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 23456 (#0)
> GET / HTTP/1.1
> Host: localhost:23456
> User-Agent: curl/7.47.0
> Accept: */*
> 
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prometheus would handle it the same. If we're going to explicitly return a 500 we should also include the full exception in the response.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many security policies discourage returning the stack-trace (or any details about the actual problem) in the response from 500 errors by default. Maybe you could argue that people typically authenticate these metrics endpoints, so it's not a problem? I don't, so would rather the stack-trace wasn't returned. If you decide to put it in, it will just be stripped by our load balancers anyway, so I don't mind either way.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would generally expect metrics endpoints not to be exposed to the outside world, and even then for the information you can gleam from the errors to be minimal.

@brian-brazil
Copy link
Copy Markdown
Contributor

Obsoleted by #119

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants