-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use md5 hash in file name for cache busting instead of the timestamp query parameter #12
Comments
Hi, thank you for the question. Do you have an example when it would be useful and what issues do we have now with the current approach? I think keeping the original file name is more user friendly approach than just generation some random string. To address your question, you can define a version name function which allows you to pick the filename by yourself. |
Certain browsers don't take the query string into account. Also certain proxies will ignore them when it comes to caching. In both cases that means the cache hit never happens. Using query strings for versioning files has been openly not recommended for over 10 years. If you Google around for the topic you'll find dozens or more in depth answers on why not. That's why Phoenix, Rails, Django and other frameworks all use md5 hashes for their cache busting strategy. A think a reasonable work flow would be to hash the file on upload and then add the md5 hash as part of an attribute that gets saved to the database for that file so that it can be retrieved at run time. I didn't think the implementation through 100% but I don't know if there's any other way? The md5 hash would need to be stored in the database, otherwise there would be no way to know it later. It's to my understanding we're doing a DB lookup at run time as is to look at the updated time stamp, so it shouldn't cause any change in run time performance. Can you give a code example on how defining a version name function will work in practice? For example, here's my current implementation for handling a user photo. defmodule Hello.Files.File do
defmacro __using__(types) do
quote do
use Waffle.Definition
use Waffle.Ecto.Definition
def __storage, do: Waffle.Storage.Local
def default_url(_version, _scope), do: "/images/defaults/anonymous.png"
def expanded_dir(path) do
Application.fetch_env!(:waffle, :storage_dir) <> path
end
def validate({file, _}) do
Enum.member?(unquote(types), file_type(file))
end
def mime_type(file) do
case file_type(file) do
:jpg -> "image/jpg"
:jpeg -> "image/jpg"
:png -> "image/png"
end
end
defp file_type(file) do
file.file_name
|> Path.extname()
|> String.replace(".", "")
|> String.downcase()
|> String.to_atom()
end
defp hashed(id), do: Hello.Hashid.encode(id)
defp source(scope), do: scope.__meta__.source
end
end
end defmodule Hello.Files.Profile do
use Hello.Files.File, [:jpg, :jpeg, :png]
@versions [:large, :medium, :small, :tiny]
def storage_dir(_version, {_file, scope}),
do: expanded_dir("/photos/#{source(scope)}/#{hashed(scope.id)}")
def filename(version, _), do: "photo-#{version}"
def transform(version, _scope) do
{:convert, "-strip -resize #{dimensions(version)} -format png", :png}
end
defp dimensions(:large), do: "600x600"
defp dimensions(:medium), do: "300x300"
defp dimensions(:small), do: "150x150"
defp dimensions(:tiny), do: "40x40"
end This creates a |
Is this something still up for discussion? If it won't be added to the library itself, what would the steps look like to implement something like the above for a specific project? I've kind of stepped away from the project I was working on for ~6 months, but I'm thinking about resuming it and this feature is kind of critical to the app. Not really sure how to move forward. |
I think the most simple way to implement this is to define custom filename
for uploaded files.
You can generate random string or take md5 hash of file's content.
Then you can completely omit the version attribute in url because the
filename will be different on each upload.
Define `filename` function inside your uploader
```elixir
def filename(version, {file, scope}) do
md5_name = get_md5(file.path)
"#{md5_name}_#{version}"
end
```
Probably, we should modify `waffle_ecto` to allow removing version from
generated urls.
|
Hi,
Have you given any thoughts on using an md5 hash to uniquely identify versions? This is a best practice when it comes to cache busting vs using query parameters.
I know this introduces a fair bit more complexity but it seems like a good long term solution. Phoenix Digest has all of the code related to pulling this off as well.
The text was updated successfully, but these errors were encountered: