Summary
I've been playing with skew for a week or so now, and whilst it does what I want at a functional level, something is causing it to handle blocking IO very poorly (not at all, in fact).
The tests I've been running are to scan my own AWS account for S3 buckets. Skew works perfectly for finding the 3 buckets I expected it to, but then never returns.
At first I thought I was being impatient, so I left it running overnight and through the morning. The function still never returned. I understand that scanning all of AWS is a potentially non-trivial task, but more than 16 hours? Something must be up.
Investigation
The first thing I noticed while trying to work out what was going on was the traceback from a keyboard interrupt (when the interrupt worked) while running in the console:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ares/.local/lib/python3.8/site-packages/skew/arn/__init__.py", line 323, in __iter__
for scheme in self.scheme.enumerate(context, **self.kwargs):
File "/home/ares/.local/lib/python3.8/site-packages/skew/arn/__init__.py", line 247, in enumerate
for provider in self._arn.provider.enumerate(
File "/home/ares/.local/lib/python3.8/site-packages/skew/arn/__init__.py", line 232, in enumerate
for service in self._arn.service.enumerate(
File "/home/ares/.local/lib/python3.8/site-packages/skew/arn/__init__.py", line 217, in enumerate
for region in self._arn.region.enumerate(
File "/home/ares/.local/lib/python3.8/site-packages/skew/arn/__init__.py", line 198, in enumerate
for account in self._arn.account.enumerate(
File "/home/ares/.local/lib/python3.8/site-packages/skew/arn/__init__.py", line 145, in enumerate
for resource in self._arn.resource.enumerate(
File "/home/ares/.local/lib/python3.8/site-packages/skew/arn/__init__.py", line 127, in enumerate
resources.extend(resource_cls.enumerate(
File "/home/ares/.local/lib/python3.8/site-packages/skew/resources/aws/s3.py", line 27, in enumerate
resources = super(Bucket, cls).enumerate(arn, region, account,
File "/home/ares/.local/lib/python3.8/site-packages/skew/resources/resource.py", line 54, in enumerate
data = client.call(enum_op, query=path, **kwargs)
File "/home/ares/.local/lib/python3.8/site-packages/skew/awsclient.py", line 127, in call
data = op(**kwargs)
File "/home/ares/.local/lib/python3.8/site-packages/botocore/client.py", line 386, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/ares/.local/lib/python3.8/site-packages/botocore/client.py", line 691, in _make_api_call
http, parsed_response = self._make_request(
File "/home/ares/.local/lib/python3.8/site-packages/botocore/client.py", line 711, in _make_request
return self._endpoint.make_request(operation_model, request_dict)
File "/home/ares/.local/lib/python3.8/site-packages/botocore/endpoint.py", line 102, in make_request
return self._send_request(request_dict, operation_model)
File "/home/ares/.local/lib/python3.8/site-packages/botocore/endpoint.py", line 134, in _send_request
success_response, exception = self._get_response(
File "/home/ares/.local/lib/python3.8/site-packages/botocore/endpoint.py", line 166, in _get_response
success_response, exception = self._do_get_response(
File "/home/ares/.local/lib/python3.8/site-packages/botocore/endpoint.py", line 200, in _do_get_response
http_response = self._send(request)
File "/home/ares/.local/lib/python3.8/site-packages/botocore/endpoint.py", line 269, in _send
return self.http_session.send(request)
File "/home/ares/.local/lib/python3.8/site-packages/botocore/httpsession.py", line 344, in send
urllib_response = conn.urlopen(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
httplib_response = self._make_request(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 421, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 416, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.8/http/client.py", line 1344, in getresponse
response.begin()
File "/usr/lib/python3.8/http/client.py", line 307, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.8/http/client.py", line 268, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.8/socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "/usr/lib/python3.8/ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "/usr/lib/python3.8/ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
KeyboardInterrupt
>>>
Diving into the codebase, skew wraps boto3 (botocore in the traceback above), which in turn uses (to no massive surprise) urllib3 to manage its requests to AWS.
So cool, it's just urllib3 under the hood and it's hanging on a read. Well, urllib3 doesn't have a default timeout for requests, but you can certainly set one. So I used socket.setdefaulttimeout(60) to set the timeout to 60 seconds (pretty fair imo), and suddenly skew (or more specifically boto3) doesn't work at all. Even the original buckets that I expect to see listed are not returned. What makes this slightly more confusing is that it doesn't actually raise a timeout exception; I'm not even convinced the requests are made.
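For reference, here is a minimal standard-library sketch of what a read timeout on a single socket looks like when the peer accepts the connection and then stays silent, which is the state the traceback above ends in. It says nothing about how skew or boto3 configure their own sockets; note only that socket.setdefaulttimeout affects sockets created after the call that are not given an explicit timeout. The helper names start_silent_server and read_with_timeout are made up for this illustration.

```python
import socket
import threading


def start_silent_server():
    """Listen on an ephemeral local port, accept one connection, never reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    accepted = []  # hold the accepted socket so the peer stays open but silent

    def accept_once():
        conn, _ = server.accept()
        accepted.append(conn)

    threading.Thread(target=accept_once, daemon=True).start()
    return server, accepted


def read_with_timeout(port, seconds):
    """Return the bytes received, or "timed out" if nothing arrives in time."""
    # The timeout passed here applies to the connect and to later reads.
    with socket.create_connection(("127.0.0.1", port), timeout=seconds) as client:
        try:
            return client.recv(1024)
        except socket.timeout:
            return "timed out"


if __name__ == "__main__":
    server, _accepted = start_silent_server()
    print(read_with_timeout(server.getsockname()[1], 0.5))
    server.close()
```

Without the timeout argument the recv call would block indefinitely, just like the ssl read at the bottom of the traceback.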
With the timeout approach not working, I thought: what if I can just force the thread to exit? So, using various forms of inspiration from this thread, I tried forcing skew to exit early.
Context Manager
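This context manager didn't work at all; skew just didn't stop.
import signal
from contextlib import contextmanager

@contextmanager
def timeout(duration):
    def timeout_handler(signum, frame):
        raise BlockingIOError(f'Function timed out after {duration} seconds')
    # Ask the OS to deliver SIGALRM to the main thread after `duration` seconds
    signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(duration)
    yield
    signal.alarm(0)  # cancel the alarm once the wrapped block finishes
Thread Decorator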
This thread decorator saw better results, in that quit_function is actually called (the log.info message appears), but again skew doesn't exit gracefully and just hangs.
import logging
import threading
import _thread as thread  # Python 3 name for the Python 2 `thread` module

log = logging.getLogger(__name__)  # assumed: a standard library logger

def quit_function(fn_name):
    log.info(f'{fn_name} took too long')
    thread.interrupt_main()  # raises KeyboardInterrupt

def exit_after(s):
    """
    use as decorator to exit process if
    function takes longer than s seconds
    """
    def outer(fn):
        def inner(*args, **kwargs):
            # Fire quit_function from a timer thread after s seconds
            timer = threading.Timer(s, quit_function, args=[fn.__name__])
            timer.start()
            try:
                result = fn(*args, **kwargs)
            finally:
                timer.cancel()
            return result
        return inner
    return outer
I'm honestly running out of ideas as to where to go looking for this one. My use case is to run skew on a serverless function endpoint that I can just call, but these have well-defined timeouts that I just can't seem to get skew to obey.
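A likely reason the decorator logs its message and then still hangs: interrupt_main only simulates a signal arriving, and the resulting KeyboardInterrupt is raised when the main thread next runs Python code, so a main thread parked inside a blocking socket read does not see it until that read returns. One possible way around this, sketched below under assumptions and not tested against skew, is to turn the arrangement around: run the blocking call in a daemon thread and have the caller stop waiting after a deadline. The name call_with_deadline is hypothetical, and the worker is abandoned rather than killed, so it only frees the caller.

```python
import queue
import threading


def call_with_deadline(fn, seconds, *args, **kwargs):
    """Run fn in a daemon thread; raise TimeoutError if no result in `seconds`."""
    results = queue.Queue(maxsize=1)

    def worker():
        try:
            results.put(("ok", fn(*args, **kwargs)))
        except BaseException as exc:  # hand any failure back to the caller
            results.put(("error", exc))

    # A daemon thread does not keep the interpreter alive at exit.
    threading.Thread(target=worker, daemon=True).start()
    try:
        status, value = results.get(timeout=seconds)
    except queue.Empty:
        # The worker is not stopped; it keeps running until it finishes
        # or the process exits.
        raise TimeoutError(f"{fn.__name__} exceeded {seconds} seconds") from None
    if status == "error":
        raise value
    return value
```

In a serverless handler the idea would be to wrap the whole scan, including the iteration over its results, in a single function and pass that function to call_with_deadline with a deadline shorter than the platform's own limit.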
Any ideas?