Updated URLScan for archive URLs. Fixes #84 #94

RajuKoushik · 2017-08-23T19:40:36Z

Signed-off-by: rajukoushik <g.rajukoushik@gmail.com>

JonoYang

Sorry for the delay. I made a few comments regarding your code. Thanks for the contribution!

JonoYang · 2017-09-27T01:15:13Z

scanapp/tasks.py

+    Create and save a file at `path` present at `url` using `scan_id` and bare `path` and
+    `file_name` and apply the scan.
+    """
+    r = requests.get(url)


Why is path repeated here?

JonoYang · 2017-09-27T01:21:44Z

scanapp/tasks.py

+    url_parse = urlparse(url)
+    os.chdir(path)
+
+    if r.status_code == 200:


The extractcode module in scancode already handles extraction for different archive types, which may be helpful here. For example: https://github.com/nexB/scancode-toolkit/blob/develop/src/extractcode/extract.py#L101
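To illustrate the idea of delegating extraction to existing code rather than handling each archive type by hand, here is a minimal sketch that uses the standard library's `shutil.unpack_archive` as a stand-in. It is not the extractcode API linked above; the function name `extract_archive` and both parameters are hypothetical, and Python 3 is assumed.

```python
import os
import shutil


def extract_archive(archive_path, target_dir):
    """Extract archive_path into target_dir and return target_dir.

    Hypothetical helper: a stdlib stand-in, not extractcode.
    """
    os.makedirs(target_dir, exist_ok=True)
    # unpack_archive picks the format from the file extension
    # (zip, tar, gztar, ...) and raises if the format is unknown.
    # Passing target_dir explicitly avoids the need for os.chdir().
    shutil.unpack_archive(archive_path, target_dir)
    return target_dir
```

Archives fetched from arbitrary URLs are untrusted input, so real code would also need to guard against entries that extract outside `target_dir`.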

JonoYang · 2017-09-27T01:23:19Z

scanapp/views.py

+                scan_directory = None
+                scan_id = create_scan_id(user, url, scan_directory, scan_start_time)
+                current_scan = Scan.objects.get(pk=scan_id)
+                path = '/'.join([path, '{}'.format(current_scan.pk)])


Use os.path.join() instead of '/'.join() so that paths are joined consistently across platforms.
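A minimal sketch of that suggestion follows. The helper name `scan_path` is hypothetical; `base_path` and `scan_pk` stand in for `path` and `current_scan.pk` from the diff.

```python
import os


def scan_path(base_path, scan_pk):
    """Return the per-scan directory under base_path (hypothetical helper)."""
    # os.path.join inserts the platform's separator and avoids a doubled
    # separator when base_path already ends with one.
    return os.path.join(base_path, str(scan_pk))
```

In the view this would replace the `'/'.join([...])` line with something like `path = scan_path(path, current_scan.pk)`.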

JonoYang · 2017-09-27T01:48:13Z

scanapp/views.py

+                for i in allowed_exts:
+                    if url_parse.path.endswith(i):
+                        is_zip_url = True
+            finally:


Why is try-finally used?
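If nothing inside the `try` block can raise, the check could probably be written without `try`/`finally` at all. A minimal sketch, assuming Python 3 (`urllib.parse`) and an illustrative extension list, since the diff does not show the contents of `allowed_exts`; the function name `is_archive_url` is hypothetical:

```python
from urllib.parse import urlparse

# Assumed values for illustration; the diff does not show allowed_exts.
ALLOWED_EXTS = ('.zip', '.tar', '.tar.gz', '.tgz')


def is_archive_url(url, allowed_exts=ALLOWED_EXTS):
    """Return True if the URL path ends with one of allowed_exts."""
    # str.endswith accepts a tuple, so no explicit loop or flag is needed.
    # Only the path component is checked, so query strings are ignored.
    return urlparse(url).path.lower().endswith(tuple(allowed_exts))
```

With this, `is_zip_url = is_archive_url(url)` would replace the loop and the flag initialisation.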

JonoYang · 2017-09-27T02:05:44Z

scanapp/views.py

+            is_zip_url = False
+
+            try:
+                for i in allowed_exts:


We may already have some code that identifies whether or not a file is an archive. I will ask @pombredanne.
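For comparison, a content-based check using only the standard library might look like the sketch below. This is not the scancode code mentioned above, just an illustration of inspecting the file rather than its extension; the function name `looks_like_archive` is hypothetical, and only zip and tar formats are covered.

```python
import tarfile
import zipfile


def looks_like_archive(file_path):
    """Return True if file_path is a zip or tar archive, judged by content."""
    # Both checks read the file itself rather than trusting the extension.
    return zipfile.is_zipfile(file_path) or tarfile.is_tarfile(file_path)
```

Unlike the extension check on the URL, this needs the file to be downloaded first.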

Updated URLScan for archive URLs. Fixes aboutcode-org#84

26b48a1

Signed-off-by: rajukoushik <g.rajukoushik@gmail.com>

JonoYang suggested changes Sep 27, 2017

singh1114 mentioned this pull request Jan 31, 2018

Extending the scancode API #95

Open
