Conversation
illicitonion
left a comment
There was a problem hiding this comment.
Looking great! I've only looked at go-cat so far, but will review more soon!
illicitonion
left a comment
There was a problem hiding this comment.
Added review comments for go-ls - also looking good, congrats! I didn't repeat comments which were also in go-cat, but several of them apply to both programs :)
I'll try to get to the others soon, but in the mean time, please do apply the fixes to all of your exercises where you think they apply!
illicitonion
left a comment
There was a problem hiding this comment.
This is looking really good, congratulations! (Particularly, the tests are looking really clean and readable!)
illicitonion
left a comment
There was a problem hiding this comment.
This is looking really great, nicely put together!
Codecov ReportBase: 63.21% // Head: 63.74% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #1 +/- ##
==========================================
+ Coverage 63.21% 63.74% +0.53%
==========================================
Files 3 6 +3
Lines 193 331 +138
==========================================
+ Hits 122 211 +89
- Misses 56 99 +43
- Partials 15 21 +6
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
illicitonion
left a comment
There was a problem hiding this comment.
Nice finishing off!
One question separate from the individual line comments:
One of the extension requirements was "Don't let the same image URL be uploaded twice" - what do you think the rationale behind this is? I can see that you've currently implemented this as a uniqueness constraint on the URL field in the database... What's this achieving? If two users try to upload the same image with different titles or different alt texts, why are we preventing them from doing so? Are there cases where we may want to allow this? What do you think users are likely to do if they hit this?
| return i, nil | ||
| } | ||
|
|
||
| func ValidateAltText(title, altText string) error { |
There was a problem hiding this comment.
Can you talk through your thinking behind what makes "good" alt text? What are you testing here? What would you ideally test? What's easy to test? What's hard to test? What are the risks of going too far in either direction (either being too strict, or too lax)?
There was a problem hiding this comment.
So what I tried to do is to check for word similarity and allow if there's at least 1 word in the alt text that is similar to a word in the title.
I think it's very light checking which will maybe work 50% of the time.
What would we ideally test? Ideally we should test if title and alt text are related, if they can be used interchangeably.
What's easy to test? Word equality is easy, word similarity is harder but it's still easy
What's hard to test? Relation between words or text, meaning of words or text. Text can be similar or even equal but mean completely different things.
Being too strict could result in our solution rejecting valid option as it might not understand what humans do. A Solution that's too strict (possibly involving ML could also be computationally expensive).
Being too lax could result in images having irrelevant alt texts
There was a problem hiding this comment.
Ideally we should test if title and alt text are related, if they can be used interchangeably.
I'm not sure this is actually our ideal. While the title and alt-text probably are related, I can imagine them having very different forms.
I can imagine a title being quite brief and punchy: "The cutest cat" or "Cat makes a surprising friend", and the alt-text being much more verbose and descriptive: "A cat climbing up a tree hugging a smiling squirrel which has wrapped its tail around the cat".
These two fields are performing really different functions. There's not necessarily a correspondence between the word "cutest" or "surprising" in the title and any word in the alt-text - they have different purposes.
I'd say our ideal alt text is "an accurate and descriptive description of the contents of the image". But this is an incredibly difficult problem to solve! (In fact, if it was easy to solve, we wouldn't need alt-text, a computer could just generate it for us).
Keying off of the title makes a lot of sense as an easy source of reference data - good thinking to look there! But because it's just a vague reference source, and not an oracle of good data, I'd lean in your implementation towards being very not-strict, and tending towards allowing maybe-bad alt-text because it's hard to test for.
| }) | ||
| } | ||
|
|
||
| func TestValidateImage(t *testing.T) { |
There was a problem hiding this comment.
What are the pros and cons of using images from the internet for this kind of test? What could you do instead?
There was a problem hiding this comment.
the main problem is that the image is not going to be there forever, and we don't have control over availability. There are also other factors that could contribute to failing test such as lack of internet connection or firewall blocking the connection.
Pros: We don't need to write additional code to spin up a server and serve the image. We are also testing what the app will actually be doing - making an http request over the internet and not locally.
There was a problem hiding this comment.
One thing I'd definitely recommend is splitting off the HTTP parts of the analysis from the operating-on-bytes part of the analysis.
Things like identifying the format of the image, and the dimensions, you can trivially test with just checked in image files without needing HTTP at all.
Then the HTTP parts - I'd recommend trying to implement the smallest, simplest version of this test you could which doesn't hit the public internet, and comparing the two solutions. I agree that the code to spin up a server is additional, but I'd recommend looking at both and seeing how much additional, and what you think you lose in terms of test fidelity by testing locally.
There was a problem hiding this comment.
Thanks, that makes perfect sense and I have now added a few separated tests.
So I went with a uniqueness constraint as I wanted to try something different.. I think. Normally you can query the database and check if there's a record with the same URL and inform the user, but I thought this adds an additional read operation that can be avoided with a constraint. They both achieve the same result, but database constraint is slightly more difficult to change in the future and also it prevents us from adding different business logic, i.e. if the user should be promped to confirm they want to upload a new image even though url exists. If two different users can upload the same URL twice, then the database constraint would have to be updated so userID+URL is unique, but this again raises the complexity as you will have to apply migrations instead of just releasing new code. |
Makes sense - good reasoning! |
No description provided.