Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid HTTP request in some cases on replaceDocuments #207

Closed
donhardman opened this issue Apr 17, 2024 · 14 comments
Closed

Invalid HTTP request in some cases on replaceDocuments #207

donhardman opened this issue Apr 17, 2024 · 14 comments
Assignees

Comments

@donhardman
Copy link
Contributor

donhardman commented Apr 17, 2024

As we were updating the docs with recalculated embeddings in the Manticore GitHub search demo, we encountered an issue where replaceDocuments was producing an "Invalid HTTP method" error. We should investigate this and try to understand the reason behind it.

@donhardman donhardman self-assigned this Apr 17, 2024
@donhardman donhardman changed the title Invalid bulk request in some cases on replaceDocuments Invalid HTTP request in some cases on replaceDocuments Apr 17, 2024
@donhardman
Copy link
Contributor Author

The issue stems from the logic when we accumulate headers continuously and reuse curl connections set up previously. This means that all headers from the old connection merge into the new one. If we made an application/json request before and then a bulk request with application/x-ndjson, it results in two header lines being sent. However, the daemon parses only the first line, leading to the "Invalid HTTP method" error.

        $headers = $connection->getHeaders();
        $headers[] = sprintf('Content-Type: %s', $request->getContentType());

To fix this, we need to ensure that on long-running scripts, we don't infinitely extend headers with the same type. We should consider using a key-value map instead.

@donhardman donhardman assigned Nick-S-2018 and unassigned donhardman Apr 18, 2024
@donhardman
Copy link
Contributor Author

I have discovered that while dumping curl info with a curl raw request by adding

curl_setopt($conn, CURLINFO_HEADER_OUT, true);
...
var_dump(curl_getopt($conn));

I discovered that we send proper headers, while still getting a 400 error from the daemon, which is weird:

{
  "url": "http://manticore:9308/bulk",
  "content_type": "application/json; charset=UTF-8",
  "http_code": 400,
  "header_size": 241,
  "request_size": 150,
  "filetime": -1,
  "ssl_verify_result": 0,
  "redirect_count": 0,
  "total_time": 0.063445,
  "namelookup_time": 0.000052,
  "connect_time": 0.000052,
  "pretransfer_time": 0.000067,
  "size_upload": 729799,
  "size_download": 31,
  "speed_download": 492,
  "speed_upload": 11584111,
  "download_content_length": 31,
  "upload_content_length": 729799,
  "starttransfer_time": 0.000068,
  "redirect_time": 0,
  "redirect_url": "",
  "primary_ip": "192.168.128.2",
  "certinfo": [],
  "primary_port": 9308,
  "local_ip": "",
  "local_port": -1,
  "http_version": 2,
  "protocol": 1,
  "ssl_verifyresult": 0,
  "scheme": "HTTP",
  "appconnect_time_us": 0,
  "connect_time_us": 52,
  "namelookup_time_us": 52,
  "pretransfer_time_us": 67,
  "redirect_time_us": 0,
  "starttransfer_time_us": 68,
  "total_time_us": 63445,
  "request_header": "POST /bulk HTTP/1.1\r\nHost: manticore:9308\r\nAccept: */*\r\nAccept-Encoding: deflate, gzip\r\nContent-Type: application/x-ndjson\r\nContent-Length: 729799\r\n\r\n",
  "effective_method": "POST"
}

I tried to write a script that reproduces it with dumped queries into a file:

#!/usr/bin/env bash

endpoint="http://manticore:9308"
requests_prefix="/tmp/data.request."

# Function to send a request and print the result
send_request() {
	local url=${endpoint}/bulk
	curl --silent --show-error --write-out "%{http_code}" \
		--data-binary @"$1" -H 'Content-type: application/x-ndjson' \
		--output /dev/null "${url}"
}

concurrency=$(ls "$requests_prefix"* | wc -l | xargs)
# Loop indefinitely
while true; do
	# Array to store PIDs of curl processes
	pids=()

	# Check if we've reached the maximum number of parallel requests
	while [ ${#pids[@]} -ge ${concurrency} ]; do
		# Wait for any child process to exit
		wait -n
		# Remove exited child process from the array
		pids=("${pids[@]}")
	done

	# Send the request in detached mode and store the PID
	for f in "$requests_prefix"*; do
		send_request "${f}" &
		pids+=($!)
	done

	# Wait for all child processes to exit
	for pid in "${pids[@]}"; do
		wait "${pid}"
	done
done

But unfortunately, it does not fail at all.

@donhardman
Copy link
Contributor Author

After multiple test runs, it seems like the following line helps stabilize the issue:

curl_setopt($conn, CURLOPT_FRESH_CONNECT, true);

@donhardman
Copy link
Contributor Author

As we discussed, we should implement PHP code that will attempt to replicate it.

@donhardman
Copy link
Contributor Author

When running the following script, in some cases, EVEN when the request data is valid, the daemon returns an invalid HTTP request error.

It happens during concurrency.

Sometimes, there is one more error that appears – CURL error occurred: Received HTTP/0.9 when not allowed.

No clue about the reason, but this is what we should investigate.

200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
400 - ERR: {"url":"http:\/\/manticore:9308\/bulk","content_type":"application\/json; charset=UTF-8","http_code":400,"header_size":241,"request_size":180,"filetime":-1,"ssl_verify_result":0,"redirect_count":0,"total_time":0.011459,"namelookup_time":9.6e-5,"connect_time":9.6e-5,"pretransfer_time":0.000195,"size_upload":66,"size_download":25,"speed_download":2272,"speed_upload":6000,"download_content_length":25,"upload_content_length":66,"starttransfer_time":0.01143,"redirect_time":0,"redirect_url":"","primary_ip":"192.168.128.3","certinfo":[],"primary_port":9308,"local_ip":"","local_port":-1,"http_version":2,"protocol":1,"ssl_verifyresult":0,"scheme":"HTTP","appconnect_time_us":0,"connect_time_us":96,"namelookup_time_us":96,"pretransfer_time_us":195,"redirect_time_us":0,"starttransfer_time_us":11430,"total_time_us":11459,"request_header":"POST \/bulk HTTP\/1.1\r\nHost: manticore:9308\r\nAccept: *\/*\r\nContent-Type: application\/x-ndjson\r\nContent-Length: 66\r\n\r\n","effective_method":"POST"}

The table is simple:

create table test(title string);
And create a `run` script with `chmod +x run` after and put the contents with the changed host to manticore:
#!/usr/bin/env php
<?php
$host = 'http://manticore:9308';
$bulkRequest = '{"replace":{"index":"test","id":1,"doc":{"title":"Hello World"}}}';
$concurrency = 24;

while (true) {
	$pids = [];

	// Firs send the request
	$endpoint = "$host/sql";
	$body = "query=SELECT * FROM test LIMIT 20";
	$contentType = "application/json";
	sendRequest($endpoint, $body, $contentType);

	$contentType = 'application/x-ndjson';
	for ($i = 0; $i < $concurrency; $i++) {
		$pid = pcntl_fork();
		if ($pid === 0) {
			usleep(random_int(500000, 1000000));
			$endpoint = "$host/bulk";
			$exit = sendRequest($endpoint, $bulkRequest, $contentType);
			exit($exit);
		} else {
			// Parent process
			$pids[] = $pid;
		}
	}

	foreach ($pids as $pid) {
		pcntl_waitpid($pid, $status, WNOHANG);
	}
}

function sendRequest($endpoint, $body, $contentType) {
	static $ch;
	if (!isset($ch)) {
		$ch = curl_init();
	}
	curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
	curl_setopt($ch, CURLOPT_URL,  $endpoint);
	curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
	curl_setopt($ch, CURLOPT_HTTPHEADER, ["Content-Type: $contentType"]);
	curl_setopt($ch, CURLINFO_HEADER_OUT, true);
	curl_exec($ch);
	$responseCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
	$error = curl_errno($ch);
	//curl_close($ch);

	echo "$responseCode - "
		. ($responseCode === 200 ? 'OK' : 'ERR: ' . json_encode(curl_getinfo($ch)))
		. "\n";
	flush();
	if ($error > 0) {
		$errorText = curl_error($ch);
		echo "CURL error occured: $errorText\n";
		curl_close($ch);
		$ch = null;
	}
	return $responseCode === 200 ? 0 : 1;
}

Once we run it with ./run, we can see that sometimes errors happen.

Example of usage and results:

$ ./run | grep '^400'
400 - ERR: {"url":"http:\/\/manticore:9308\/bulk","content_type":"application\/json; charset=UTF-8","http_code":400,"header_size":241,"request_size":180,"filetime":-1,"ssl_verify_result":0,"redirect_count":0,"total_time":0.00083,"namelookup_time":8.0e-5,"connect_time":8.0e-5,"pretransfer_time":0.000121,"size_upload":66,"size_download":25,"speed_download":25000,"speed_upload":66000,"download_content_length":25,"upload_content_length":66,"starttransfer_time":0.000804,"redirect_time":0,"redirect_url":"","primary_ip":"192.168.128.3","certinfo":[],"primary_port":9308,"local_ip":"","local_port":-1,"http_version":2,"protocol":1,"ssl_verifyresult":0,"scheme":"HTTP","appconnect_time_us":0,"connect_time_us":80,"namelookup_time_us":80,"pretransfer_time_us":121,"redirect_time_us":0,"starttransfer_time_us":804,"total_time_us":830,"request_header":"POST \/bulk HTTP\/1.1\r\nHost: manticore:9308\r\nAccept: *\/*\r\nContent-Type: application\/x-ndjson\r\nContent-Length: 66\r\n\r\n","effective_method":"POST"}

@donhardman donhardman assigned tomatolog and unassigned Nick-S-2018 Apr 29, 2024
@tomatolog
Copy link

seems error in the script if I remove static $ch; and use local variable scope - all works fine

$ch = curl_init();

I not sure how static $ch; works with fork but from my point it has issue there as with local ch issue gone

@tomatolog
Copy link

tomatolog commented Apr 29, 2024

found the issue report about the curl and php pcntl_fork php/php-src#10633 that seems similar to this case.
The discussion users think that curl shares state during the fork and this way affects the requests and responses

@tomatolog tomatolog assigned donhardman and unassigned tomatolog Apr 29, 2024
@donhardman
Copy link
Contributor Author

I think it's a good idea to get rid of the static cURL instance and instead keep it per object instance:

https://github.com/manticoresoftware/manticoresearch-php/blob/master/src/Manticoresearch/Transport/Http.php#L27

@donhardman donhardman assigned Nick-S-2018 and unassigned donhardman Apr 30, 2024
@Nick-S-2018
Copy link
Collaborator

@donhardman Do we really need to get rid of that? I mean those persistent connections were declared as one of the client's 'features'. If we don't want them to be used for a specific application, we can just disable them through the client's config.

@donhardman
Copy link
Contributor Author

We should keep it, but in a proper way – use a non-static variable that sticks to the class. That way, when we don't create multiple instances of the Client, it will still work as expected. However, when we do create instances, we're probably doing it on purpose, so we expect not to keep the original state from some random instance used before and not in the current namespace (like a fork).

@donhardman
Copy link
Contributor Author

The case here – the code with forking is implemented correctly.

I did not use a client created from the parent process, and instead created a new Client instance inside the forked process. However, since we have a static variable, we are using it from the parent process, which causes side effects.

To resolve this issue properly, we should keep the same approach but use an instance variable instead of a static variable.

@sanikolaev
Copy link
Collaborator

@donhardman
Copy link
Contributor Author

Looks good! I've double-checked everything, and it all looks great. We can close this issue once the pull request is merged. 👍

@Nick-S-2018
Copy link
Collaborator

Done in #209

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants