Check encode type for download task #269

DisaPadla · 2024-10-20T09:20:38Z

Fixes #268

core/download.ts

github-actions · 2024-10-20T12:54:53Z

Think about code testing.
Think about making types more precise. Can you better explain data relations by type?

ai · 2024-10-20T18:20:19Z

core/test/download.test.ts

+
+  expectRequest('https://example.com').andRespond(
+    200,
+    'Hi',


This test will pass even if we don't have the encoding feature.

What do you think about using CP1251 and putting the broken text (CP1251 parsed as UTF-8) here?

If this makes the review too long, I can do it myself.

I will check later.

Hm, if I just copy-paste the broken text (like �), it doesn't work correctly. I'll continue researching.

Try binary format for symbol

Another way is to take JS code for UTF-8→CP1251 conversion and put:

to1251('тест')

Sorry, what do you mean by "Try binary format for symbol"?

String.fromCodePoint
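
For the test discussed above to fail without encoding support, the response body has to carry real CP1251 bytes rather than pasted text. Below is a minimal sketch of the `to1251` idea, assuming only ASCII and the basic Russian alphabet need to be covered. The body of `to1251` and the `bytesToBinaryString` helper are hypothetical, not code from this PR.

```typescript
// Hypothetical helper: encode a string as CP1251 (windows-1251) bytes.
// Covers only ASCII, А..я (U+0410..U+044F) and Ё/ё; anything else throws.
function to1251(text: string): Uint8Array {
  let bytes = new Uint8Array(text.length)
  for (let i = 0; i < text.length; i++) {
    let code = text.charCodeAt(i)
    if (code < 0x80) {
      bytes[i] = code // ASCII is identical in CP1251
    } else if (code >= 0x0410 && code <= 0x044f) {
      bytes[i] = code - 0x0410 + 0xc0 // А..я map to 0xC0..0xFF
    } else if (code === 0x0401) {
      bytes[i] = 0xa8 // Ё
    } else if (code === 0x0451) {
      bytes[i] = 0xb8 // ё
    } else {
      throw new Error(`Unsupported character: ${text.charAt(i)}`)
    }
  }
  return bytes
}

// "Binary format": one character per byte, built with String.fromCodePoint.
function bytesToBinaryString(bytes: Uint8Array): string {
  let result = ''
  bytes.forEach(byte => {
    result += String.fromCodePoint(byte)
  })
  return result
}

let cp1251Body = to1251('тест') // bytes 0xF2 0xE5 0xF1 0xF2
let binaryBody = bytesToBinaryString(cp1251Body) // 'òåñò'
```

Whether the request mock in the test accepts a byte array or such a string as the body is not shown in this thread, so the last step may need adapting.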

ai · 2024-10-20T18:21:18Z

core/download.ts

+function detectEncodeType(response: Partial<Response>): string {
+  let headers = response.headers ?? new Headers()
+  let contentType = headers.get('content-type')?.toLowerCase() ?? ''
+  return contentType.match(/charset=([a-zA-Z0-9-]+)/)?.[1] ?? 'utf-8'


Suggested change

-  return contentType.match(/charset=([a-zA-Z0-9-]+)/)?.[1] ?? 'utf-8'
+  return contentType.match(/charset=(\w+)/)?.[1] ?? 'utf-8'

Can we do it like this?

I don't think so. In this case we would get utf instead of utf-8.

You're right. What about [\w-]+?

We also need _

Here is the full list of variants
https://www.iana.org/assignments/character-sets/character-sets.xhtml

Maybe it's better to allow every character except whitespace? Something like this:
.match(/charset=([^\s]+)/)

.match(/;\s*charset=([^\s;]+)/)

Technically, the header could look like Content-Type: text/html;charset=utf-8;boundary=ExampleBoundaryString

oh, you are right. thx
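
The difference between the patterns discussed above can be checked directly. This is a small sketch that assumes the header value is passed as a plain string. The `parseCharset` name and its string-based signature are illustrative, not the PR's final code.

```typescript
// Hypothetical stand-alone variant of the charset parser from this thread.
function parseCharset(contentType: string): string {
  let match = contentType.toLowerCase().match(/;\s*charset=([^\s;]+)/)
  return match?.[1] ?? 'utf-8'
}

// The \w+ suggestion stops at the hyphen.
let truncated = 'text/html; charset=utf-8'.match(/charset=(\w+)/)?.[1] // 'utf'

// The final pattern stops at whitespace or the next ';' parameter.
let withBoundary = parseCharset(
  'text/html;charset=utf-8;boundary=ExampleBoundaryString'
) // 'utf-8'
let cyrillic = parseCharset('text/html; charset=Windows-1251') // 'windows-1251'
let fallback = parseCharset('text/html') // 'utf-8'
```

One limitation of this pattern: a quoted value such as charset="utf-8" would be captured with its quotes, so those would need to be stripped separately if such headers are expected.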

DisaPadla added 2 commits October 20, 2024 10:55

check encode type for download task

eaa1ff3

add unit test for detect encode type

6571deb

ai reviewed Oct 20, 2024

use regex for detecting encode type

6bbc6d6

github-actions bot temporarily deployed to preview-269 October 20, 2024 12:55 Destroyed

run formatter

d8fd553

github-actions bot temporarily deployed to preview-269 October 20, 2024 13:31 Destroyed

run formatter

e078af4

github-actions bot temporarily deployed to preview-269 October 20, 2024 14:34 Destroyed

ai reviewed Oct 20, 2024

ai reviewed Oct 20, 2024

rename detectEncodeType to parseEncodeType

a236ddd

github-actions bot deployed to preview-269 October 21, 2024 16:33 View deployment

improve regex for parseEncodeType

d85ecb8

ai force-pushed the main branch from c6d5e41 to 109f826 Compare December 5, 2024 15:28
