-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: cobalt v10 api - parse filename | fix #27 #28
Conversation
filename in headers "content-disposition"
WalkthroughThe pull request introduces significant changes across several modules, primarily focusing on the file downloading functionality. A new module, Changes
Possibly related PRs
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (11)
lib/small_sdk/cobalt.ex (3)
19-25
: Refactor duplicated error message.The error message "Can't get download url using Cobalt API" is duplicated in both error cases. Consider extracting it to a module attribute for better maintainability.
defmodule SmallSdk.Cobalt do require Logger + @download_url_error "Can't get download url using Cobalt API" # ... rest of the code ... {:ok, _} -> Logger.warning("cobalt response: #{inspect(res)}") - {:error, "Can't get download url using Cobalt API"} + {:error, @download_url_error} {:error, msg} -> Logger.error("cobalt error: #{msg}") - {:error, "Can't get download url using Cobalt API"} + {:error, @download_url_error}
72-72
: Ensure consistent error handling.The error logging here follows the same pattern of logging the entire response body. Once the logging strategy is updated in other functions, ensure this function is updated consistently.
Line range hint
1-76
: Consider enhancing error handling and separation of concerns.The module could benefit from:
- Separating API interaction logic from response processing into distinct modules.
- Adding retry mechanisms for transient network failures.
- Implementing circuit breakers for API calls to handle downstream failures gracefully.
These improvements would make the system more resilient to failures and easier to maintain.
lib/save_it/google_drive.ex (3)
3-7
: Consider using English for documentation consistency.While the TODOs provide valuable context about future improvements, using non-English comments might affect code maintainability in an open-source context. Consider translating these items to English.
Here's a suggested translation:
- TODO:重要: - 改善 - - [ ] 1. 如果没有配置,直接 skip - - [ ] 移动至 small_sdk + TODO: Important: + Improvements: + - [ ] 1. Skip if not configured + - [ ] Move to small_sdk
132-139
: Enhance error logging with operation context.While the error handling is solid, the generic message "Failed at Google Drive" could be more helpful for debugging. Consider including the operation name in the error messages.
Here's a suggested improvement:
- Logger.warning("Failed at Google Drive, status: #{status}, body: #{inspect(body)}") + Logger.warning("Google Drive operation failed, status: #{status}, body: #{inspect(body)}", operation: :handle_response) - Logger.error("Failed at Google Drive, reason: #{inspect(reason)}") + Logger.error("Google Drive operation failed", operation: :handle_response, reason: inspect(reason))
Add configuration validation to prevent runtime errors
The review comment is valid. The current implementation has several issues with configuration handling:
FileHelper.get_google_access_token
andFileHelper.get_google_drive_folder_id
returnnil
when configurations are missing, but thisnil
value is used directly in API calls which will cause runtime errors.- The module lacks upfront validation of required configurations before attempting operations.
Suggested changes:
- Add configuration validation at the start of each public function
- Return early with meaningful error when access token or folder ID is missing
- Consider implementing a
validate_config/1
helper functionExample locations needing attention:
list_files/1
: Line 121upload_file_content/3
: Line 141upload_files/2
: Line 147upload_file/2
: Line 182🔗 Analysis chain
Line range hint
1-141
: Verify configuration handling before operations.Given the TODO about skipping when not configured, it would be beneficial to add explicit configuration checks before performing operations. This could prevent runtime errors when Google Drive credentials are missing.
Let's check how configuration is handled across the codebase:
Let me check the content of the configuration files and the FileHelper implementation to better understand the configuration handling:
Let me check the complete implementation of FileHelper to see how it handles missing configurations:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for configuration handling patterns echo "Searching for configuration checks..." rg "get_google_access_token|get_google_drive_folder_id" -A 3 echo "Searching for related configuration files..." fd "config.exs|runtime.exs"Length of output: 2218
Script:
#!/bin/bash # Check configuration files content echo "Content of config/config.exs:" cat config/config.exs echo -e "\nContent of config/runtime.exs:" cat config/runtime.exs echo -e "\nChecking for error handling in FileHelper:" rg -A 10 "defmodule.*FileHelper"Length of output: 1599
Script:
#!/bin/bash # Get the complete FileHelper implementation echo "Complete FileHelper implementation:" cat lib/save_it/file_helper.ex # Check for any error handling in the main module echo -e "\nChecking error handling in GoogleDrive module:" rg "handle_response|error" lib/save_it/google_drive.ex -A 5Length of output: 4568
lib/save_it/bot.ex (2)
Line range hint
274-302
: Maintain consistent error handling with download_files.For consistency with the previous suggestion, enhance the error handling here as well.
Apply similar error handling improvements:
- case WebDownloader.download_file(download_url) do - {:ok, file_name, file_content} -> + case WebDownloader.download_file(download_url) do + {:ok, file_name, file_content} -> # ... existing success handling ... - _ -> + {:error, reason} -> + Logger.error("Failed to download file: #{inspect(reason)}") update_message( chat.id, progress_message.message_id, "💔 Failed downloading file." ) end
303-304
: Improve error handling for Cobalt download URL failures.The current implementation discards specific error information, which could make debugging harder.
Consider logging the specific error:
- {:error, _} -> + {:error, reason} -> + Logger.error("Failed to get download URL: #{inspect(reason)}") update_message(chat.id, progress_message.message_id, "💔 Failed to get download URL.")lib/small_sdk/web_downloader.ex (3)
4-4
: Address the 'FIXME' to finalize the return valueThere's a
# FIXME
comment indicating the need to adjust the return value ofdownload_files/1
to{:ok, file_name, file_content} | {:error, reason}
. It's important to ensure that the function returns the correct format as specified.Do you need assistance implementing this change or creating a GitHub issue to track it?
19-19
: Handle Stream data indownload_file/1
functionThere's a
# TODO
comment indicating that Stream data handling needs to be implemented. To ensure the function can handle streaming responses, consider updating the implementation accordingly.Do you need help implementing Stream data handling or creating a GitHub issue to track this task?
26-26
: Standardize error messages without emojisThe error messages include a heartbroken emoji
💔
, which might not be appropriate for all consumers of this module, especially in logs or APIs consumed by other services.Consider removing the emoji for professional and consistent error handling.
Apply this diff:
- {:error, "💔 Downloaded an empty file"} + {:error, "Downloaded an empty file"} - {:error, "💔 Failed to download file"} + {:error, "Failed to download file"}Also applies to: 44-44, 49-49
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
lib/save_it/bot.ex
(4 hunks)lib/save_it/file_helper.ex
(0 hunks)lib/save_it/google_drive.ex
(2 hunks)lib/small_sdk/cobalt.ex
(3 hunks)lib/small_sdk/typesense.ex
(1 hunks)lib/small_sdk/web_downloader.ex
(1 hunks)
💤 Files with no reviewable changes (1)
- lib/save_it/file_helper.ex
✅ Files skipped from review due to trivial changes (1)
- lib/small_sdk/typesense.ex
🔇 Additional comments (5)
lib/small_sdk/cobalt.ex (2)
57-57
: Review error message content.
The error message includes the entire response body which might expose sensitive information. Consider logging only necessary fields or redacting sensitive data.
20-20
: Review logging of sensitive information.
The warning log includes the entire response object which might contain sensitive data. Consider logging only necessary information or redacting sensitive fields.
✅ Verification successful
Let me gather more context about the Cobalt response structure and logging patterns.
Let me check the Cobalt module's structure and response handling to understand what kind of data it processes.
Let me check one final aspect about the Cobalt response handling and logging patterns.
Based on the complete context, I can now provide the final response:
Logging of response is appropriate in this context
The warning log of the Cobalt response occurs only in an error path when the response doesn't match the expected format ({:ok, %{"url" => _}}
or {:ok, %{"status" => "picker", ...}}
). This is appropriate because:
- It helps debug unexpected response formats from the Cobalt API
- The Cobalt API is used for URL processing and doesn't handle sensitive data (only processes public URLs)
- The logging pattern is consistent with other error handling in the codebase where responses are logged for debugging
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other instances of response logging to ensure consistent handling
rg -g '*.{ex,exs}' 'inspect\(.*res.*\)'
Length of output: 1038
Script:
#!/bin/bash
# Look for the Cobalt response structure and handling
ast-grep --pattern 'def handle_response($$$)' lib/small_sdk/cobalt.ex
# Check for any sensitive data handling patterns
rg -g '*.{ex,exs}' -B2 -A2 'sensitive|secret|password|token|key'
Length of output: 11550
Script:
#!/bin/bash
# Look for the Cobalt module's request and response handling
rg -g 'cobalt.ex' -B5 -A5 'def (get|post|put|delete|request)'
# Check for any struct or type definitions that might indicate response structure
ast-grep --pattern 'defstruct' lib/small_sdk/cobalt.ex
Length of output: 640
Script:
#!/bin/bash
# Look for the complete Cobalt module content to understand the response structure
cat lib/small_sdk/cobalt.ex
# Check for any response logging patterns in the application
rg -g '*.{ex,exs}' -B2 -A2 'Logger\.(warning|error|info|debug)'
Length of output: 10303
lib/save_it/google_drive.ex (1)
124-131
: LGTM! Clean and focused response handling.
The simplified pattern matching improves code clarity while maintaining proper handling of both list and single-item responses.
lib/save_it/bot.ex (2)
12-12
: LGTM: New WebDownloader alias.
The alias is correctly placed with other SmallSdk aliases, supporting the transition to the new download implementation.
Line range hint 236-273
: Consider enhancing error handling for better debugging.
While the implementation is functionally correct, the error handling could be improved to provide more context about download failures.
Consider modifying the error handling to log the specific error:
- case WebDownloader.download_files(download_urls) do
- {:ok, files} ->
+ case WebDownloader.download_files(download_urls) do
+ {:ok, files} ->
# ... existing success handling ...
- _ ->
+ {:error, reason} ->
+ Logger.error("Failed to download files: #{inspect(reason)}")
update_message(
chat.id,
progress_message.message_id,
"💔 Failed downloading file."
)
end
Additionally, let's verify the WebDownloader implementation:
def download_file(url) do | ||
Logger.info("download_file started, url: #{url}") | ||
|
||
case Req.get(url) do | ||
{:ok, %{status: status, body: ""}} -> | ||
Logger.warning("Downloaded an empty file, status: #{status}") | ||
{:error, "💔 Downloaded an empty file"} | ||
|
||
{:ok, %{status: status, body: body, headers: headers}} -> | ||
case status do | ||
status when status in 200..209 -> | ||
filename = | ||
cond do | ||
String.contains?(url, "/tunnel") -> | ||
parse_filename(url, :content_disposition, headers) | ||
|
||
true -> | ||
parse_filename(url, :content_type, headers) | ||
end | ||
|
||
{:ok, filename, body} | ||
|
||
_ -> | ||
Logger.error("download_file failed, status: #{status}, body: #{inspect(body)}") | ||
{:error, "💔 Failed to download file"} | ||
end | ||
|
||
{:error, reason} -> | ||
Logger.error("download_file failed, reason: #{inspect(reason)}") | ||
{:error, "💔 Failed to download file"} | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validate URLs to prevent potential security risks
The function download_file/1
accepts URLs and performs HTTP GET requests without validation. This could expose the system to security risks such as Server-Side Request Forgery (SSRF).
Consider adding validation to ensure that URLs are safe and conform to expected patterns, or limit requests to trusted domains.
defp parse_filename(url, :content_type, headers) do | ||
ext = | ||
headers | ||
|> Map.get("content-type") | ||
|> List.first() | ||
|> String.split("/") | ||
|> List.last() | ||
|
||
gen_file_name(url) <> "." <> ext | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling for missing or malformed 'content-type' headers
The function assumes that the 'content-type' header exists and is properly formatted. If the header is missing or has an unexpected format, it could lead to runtime errors.
Consider adding safeguards to handle such cases safely:
ext =
headers
|> Enum.find_value(fn {key, value} -> if key == "content-type", do: value end)
|> case do
nil -> "bin"
content_type ->
content_type
|> String.split("/")
|> List.last()
end
Ensure correct extraction of 'content-type' from headers
In parse_filename/3
when handling :content_type
, the headers
are being accessed using Map.get/2
, but headers
is likely a list of tuples, not a map. Using Map.get/2
on a list will not retrieve the header correctly.
To fix this issue, consider using List.keyfind/3
or Enum.find/2
to access headers properly.
Apply this diff:
- headers
- |> Map.get("content-type")
+ headers
+ |> Enum.find_value(fn {key, value} -> if key == "content-type", do: value end)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
defp parse_filename(url, :content_type, headers) do | |
ext = | |
headers | |
|> Map.get("content-type") | |
|> List.first() | |
|> String.split("/") | |
|> List.last() | |
gen_file_name(url) <> "." <> ext | |
end | |
defp parse_filename(url, :content_type, headers) do | |
ext = | |
headers | |
|> Enum.find_value(fn {key, value} -> if key == "content-type", do: value end) | |
|> List.first() | |
|> String.split("/") | |
|> List.last() | |
gen_file_name(url) <> "." <> ext | |
end |
defp parse_filename(_url, :content_disposition, headers) do | ||
filename = | ||
headers | ||
|> Map.get("content-disposition") | ||
|> List.first() | ||
|> String.split(";") | ||
|> Enum.find(fn x -> String.contains?(x, "filename") end) | ||
|> String.split("=") | ||
|> List.last() | ||
|> String.trim() | ||
|> String.replace("\"", "") | ||
|
||
filename | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Handle potential errors when parsing 'content-disposition' header
The function assumes that the 'content-disposition' header exists and follows a specific format. If the header is missing or malformed, it could cause runtime errors.
Add checks or default values to handle cases where the filename cannot be extracted:
filename =
headers
|> Enum.find_value(fn {key, value} -> if key == "content-disposition", do: value end)
|> case do
nil -> gen_file_name(url)
content_disposition ->
content_disposition
|> String.split(";")
|> Enum.find(fn x -> String.contains?(x, "filename") end)
|> case do
nil -> gen_file_name(url)
filename_part ->
filename_part
|> String.split("=")
|> List.last()
|> String.trim("\"")
end
end
Correct the retrieval of 'content-disposition' header
In parse_filename/3
when handling :content_disposition
, headers
is being accessed with Map.get/2
, but headers
is likely a list of tuples. This will not work as intended.
Update the code to properly extract the 'content-disposition' header:
- headers
- |> Map.get("content-disposition")
+ headers
+ |> Enum.find_value(fn {key, value} -> if key == "content-disposition", do: value end)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
defp parse_filename(_url, :content_disposition, headers) do | |
filename = | |
headers | |
|> Map.get("content-disposition") | |
|> List.first() | |
|> String.split(";") | |
|> Enum.find(fn x -> String.contains?(x, "filename") end) | |
|> String.split("=") | |
|> List.last() | |
|> String.trim() | |
|> String.replace("\"", "") | |
filename | |
end | |
defp parse_filename(_url, :content_disposition, headers) do | |
filename = | |
headers | |
|> Enum.find_value(fn {key, value} -> if key == "content-disposition", do: value end) | |
|> List.first() | |
|> String.split(";") | |
|> Enum.find(fn x -> String.contains?(x, "filename") end) | |
|> String.split("=") | |
|> List.last() | |
|> String.trim() | |
|> String.replace("\"", "") | |
filename | |
end |
def download_files(urls) do | ||
Logger.info("download_files started, urls: #{inspect(urls)}") | ||
|
||
res = | ||
urls | ||
|> Enum.map(&download_file/1) | ||
|> Enum.reduce_while([], fn | ||
{:ok, filename, file_content}, acc -> {:cont, [{filename, file_content} | acc]} | ||
{:error, reason}, _ -> {:halt, {:error, reason}} | ||
end) | ||
|
||
{:ok, res} | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure download_files/1
returns consistent success and error tuples
Currently, download_files/1
may return {:ok, {:error, reason}}
when an error occurs during file downloads, which can be confusing. The function should return {:error, reason}
in case of an error to maintain consistent return types.
Apply this diff to adjust the return value:
def download_files(urls) do
Logger.info("download_files started, urls: #{inspect(urls)}")
res =
urls
|> Enum.map(&download_file/1)
|> Enum.reduce_while([], fn
{:ok, filename, file_content}, acc -> {:cont, [{filename, file_content} | acc]}
{:error, reason}, _ -> {:halt, {:error, reason}}
end)
- {:ok, res}
+ case res do
+ {:error, reason} -> {:error, reason}
+ files -> {:ok, files}
+ end
end
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def download_files(urls) do | |
Logger.info("download_files started, urls: #{inspect(urls)}") | |
res = | |
urls | |
|> Enum.map(&download_file/1) | |
|> Enum.reduce_while([], fn | |
{:ok, filename, file_content}, acc -> {:cont, [{filename, file_content} | acc]} | |
{:error, reason}, _ -> {:halt, {:error, reason}} | |
end) | |
{:ok, res} | |
end | |
def download_files(urls) do | |
Logger.info("download_files started, urls: #{inspect(urls)}") | |
res = | |
urls | |
|> Enum.map(&download_file/1) | |
|> Enum.reduce_while([], fn | |
{:ok, filename, file_content}, acc -> {:cont, [{filename, file_content} | acc]} | |
{:error, reason}, _ -> {:halt, {:error, reason}} | |
end) | |
case res do | |
{:error, reason} -> {:error, reason} | |
files -> {:ok, files} | |
end | |
end |
Summary by CodeRabbit
Release Notes
New Features
WebDownloader
module for improved file downloading capabilities.Bug Fixes
Documentation
GoogleDrive
module, including additional TODO items for future enhancements.These changes enhance the overall functionality and reliability of file downloading within the application.