Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Gzip_io.string_lwt #25

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions dune
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@
(preprocess
(per_module
((pps lwt_ppx)
gzip_io
httpev
logstash
lwt_flag
Expand Down
25 changes: 25 additions & 0 deletions gzip_io.ml
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,31 @@ let string s =
IO.nwrite out (Bytes.unsafe_of_string s); (* IO wrong type *)
IO.close_out out

let string_lwt ?(chunk_size = 3000) ?(yield = Lwt.pause) s =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've realized that Gzip_stream imposes a 1024-byte buffer size, so we should not give the illusion of control.
i.e. can use chunk_size = 4096 or whatever internally, so that there's no problem if buffer_size were changed in Gzip_stream, but not allow to pass ?chunk_size since it'll normally be ineffective.

let out = output (IO.output_string ()) in
let buff = Buffer.create chunk_size in
let len = String.length s in
let rec loop i =
if i >= len then (
(* Final flush of the buffer if there's any residue *)
if Buffer.length buff > 0 then IO.nwrite out (Buffer.to_bytes buff);
Lwt.return_unit)
else begin
let c = s.[i] in
Buffer.add_char buff c;
if Buffer.length buff < chunk_size then loop (i + 1)
else (
(* Buffer is full, write and clear it *)
IO.nwrite out (Buffer.to_bytes buff);
Buffer.clear buff;
(* Yield after processing a chunk *)
let%lwt () = yield () in
loop (i + 1))
end
in
let%lwt () = loop 0 in
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to process the input string char by char (slow!), accumulate the data to be compressed in a buffer, then allocate a string corresponding to the data (ouch!). It should be possible to write substrings (not allocated, but with offset + size) from the original string to the IO.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I updated the code in ba084d4 (#25)

Lwt.return @@ IO.close_out out

let to_string s =
let inp = input (IO.input_string s) in
let out = IO.output_string () in
Expand Down
8 changes: 5 additions & 3 deletions httpev.ml
Original file line number Diff line number Diff line change
Expand Up @@ -928,12 +928,14 @@ let send_reply c cout reply =
end
in
(* possibly apply encoding *)
let (hdrs,body) =
let%lwt (hdrs,body) =
(* TODO do not apply encoding to application/gzip *)
(* TODO gzip + chunked? *)
match body, code, c.req with
| `Body s, `Ok, Ready { encoding=Gzip; _ } when String.length s > 128 -> ("Content-Encoding", "gzip")::hdrs, `Body (Gzip_io.string s)
| _ -> hdrs, body
| `Body s, `Ok, Ready { encoding=Gzip; _ } when String.length s > 128 ->
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We allocate a 1024-byte buffer in Gzip_stream, so maybe a higher threshold makes sense (it feels a bit silly to go ahead and allocate 1K for only 129 bytes of response).

let%lwt body = Gzip_io.string_lwt s in
Lwt.return (("Content-Encoding", "gzip")::hdrs, `Body body)
| _ -> Lwt.return (hdrs, body)
in
let hdrs = match body with
| `Body s -> ("Content-Length", string_of_int (String.length s)) :: hdrs
Expand Down