content-length on HEAD object #805

Closed
opened 2024-04-11 19:19:07 +00:00 by sebadob · 6 comments

Hey,

I just stumbled upon a difference from Minio, which I have been using until now, in the behavior of HEAD requests for an object.
When I do a HEAD, Minio sets the content-length header to the actual file / object size in bytes, while Garage seems to set the value to the length of the actual HEAD request itself (I guess).
I used HEAD to fetch the sizes of my objects on S3. This logic of course breaks when Garage returns a different value.

I am not sure which is the correct implementation. I am currently writing a minimal, fast S3 bucket CRUD client with connection pooling, optimized to serve lots of files frequently. Unfortunately, the existing Rust implementations all have their flaws.
I only noticed it because some of my test cases started failing when I switched to Garage.

I am currently running

Garage version: git:v0.9.3-36-g95eb8808 [features: lmdb, sqlite, kubernetes-discovery, metrics, bundled-libs]

Edit:

I just created a new container image for v1.0.0 and tested with it, which returns the same results.

Owner

Does this work correctly on older versions such as 0.9.0 and 0.8.6?

The code should return the size of the object as saved in the object's metadata, but maybe that information is wrong, although that would be strange, as it should cause asserts or errors in other places.

Author

> Does this work correctly on older versions such as 0.9.0 and 0.8.6?

I can do some tests later. So far I tested only 0.9.3 and the stable v1.0.0.

I added some integration tests for the library: where Minio returns exactly 128_000 in the content_length for my 128 kB file, I got something like 221 back from Garage.

I will report back once I have tested against 0.9.0 or 0.8.6. I just have to build container images for this, because I am using it inside K8s and I did not find a way to interact with the cluster easily without having access to the garage binary inside the container. Maybe I missed something there, but I basically build my own images with almalinux:minimal so that I can alias garage='kubectl -n garage exec -it garage-0 -- ./garage'.

But I will do tests and give feedback.
I could also extract the raw answer from Garage from the test case.

Author

I was actually able to solve it.

I dug deeper into the problem, and it was partly my fault as well.
I did not check the return code; I only extracted the headers and saw that the content-length was not correct.

What I actually got back from Garage was an HTTP 400 with an empty body and the following headers:

{"content-length": "214", "content-type": "application/xml", "date": "Fri, 12 Apr 2024 12:06:30 GMT"}
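For anyone hitting the same thing, here is a minimal sketch of the check that was missing on the client side: only trust content-length as the object size when the HEAD actually succeeded. The function name `object_size_from_head` and the plain `status` / `headers` inputs are made up for illustration and are not tied to any particular HTTP or S3 library.

```python
def object_size_from_head(status: int, headers: dict[str, str]) -> int:
    """Return the object size reported by a HEAD response.

    Raises RuntimeError for non-200 responses, because in that case
    content-length describes the error document, not the object.
    """
    if status != 200:
        raise RuntimeError(
            f"HEAD failed with HTTP {status}; "
            "content-length does not describe the object"
        )
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return int(normalized["content-length"])
```

With the headers shown above and a status of 400, this raises instead of silently reporting 214 as the file size.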

The reason for this was actually pretty tricky. I only found it because I found problems with DeleteObject and GetObjectRange as well, which returned the following error:

<?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRequest</Code><Message>Bad request: signed header `content-length` is not present</Message><Resource>/test/test_data_128000</Resource><Region>home</Region></Error>

The content-length was present. I always added it to the signed headers by default, even when there was no content and then content-length was actually 0. This worked fine with Minio, but Garage returned an error that the header would not be there, even when it was. I don't know the reason for this.

As soon as I added excluded the content-length: 0 for:

  • DeleteObject
  • GetObjectRange
  • HeadObject

the problems all went away.
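A rough sketch of that workaround, assuming a client that collects its headers in a dict before signing: drop content-length from the set of headers to sign whenever the request has no body. The helper names `headers_to_sign` and `signed_headers_value` are hypothetical; only the SigV4 rules used here (lowercase names, trimmed values, names sorted and joined with `;`) come from the signing scheme itself.

```python
def headers_to_sign(headers: dict[str, str], body: bytes = b"") -> dict[str, str]:
    """Select and normalize the headers that go into a SigV4 signature.

    For bodiless requests (HeadObject, DeleteObject, ranged GetObject)
    content-length is left out, which worked against both Minio and Garage.
    """
    # SigV4 wants lowercase names and values with collapsed whitespace.
    selected = {name.lower(): " ".join(value.split()) for name, value in headers.items()}
    if not body:
        selected.pop("content-length", None)
    # Canonical headers are ordered by header name.
    return dict(sorted(selected.items()))


def signed_headers_value(selected: dict[str, str]) -> str:
    """Build the SignedHeaders value: sorted lowercase names joined by ';'."""
    return ";".join(sorted(selected))
```

The same header has to be left off the request that is actually sent, otherwise the signed set and the transmitted set drift apart again.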

There is a difference in API behavior to Minio though, which I am using as a reference here, but this was partly my fault, sorry about the issue in that case.

I know that a few currently existing rust s3 crates add the content-length: 0 as well, because I re-used parts of the signature algorithm from existing crates, but with the exclusion it works for Minio and Garage at the same time, so I will just leave it out for compatibility.
I don't know if you want to investigate this difference further, but I guess the issue can be closed.

Edit:

There is just one additional thing I noticed in this case.
Even when I add content-length: 0 while doing a GetObjectRange, I receive an HTTP 400 but actually with the body I would only expect from an HTTP 206.


Hello @sebadob -- I am stuck on the same issue as you, but because I've been trying to use the ExAws.S3 library for Elixir with garage, I am at a loss at how to work around this.

"As soon as I added excluded the content-length: 0" -> should I add or exclude this header parameter for garage?

I'm using bruno (postman alternative) to make a HEAD request to the bucket for a known existing key. I'm using AWS Sig V4 authentication settings and provide my Access Key ID, my Secret Access Key, and the "garage" bucket.

alt-svc	h3=":443"; ma=2592000
content-length	210
content-type	application/xml
date	Mon, 24 Jun 2024 09:34:51 GMT

These are the response headers.

Interestingly, generating a presigned URL for a GET request works. However, generating a presigned URL for a HEAD request results in this:

<Error>
<Code>AccessDenied</Code>
<Message>Forbidden: Invalid signature</Message>
<Resource>/Ubooquity-2.1.2.zip</Resource>
<Region>garage</Region>
</Error>

There must be a mismatch between what the signature signs (which includes the headers) and what the actual headers are. Any ideas?


Here's something new I found out when enabling debug logs:

2024-06-24T09:45:49.080652Z DEBUG garage_api::generic_server: Endpoint: HeadObject
2024-06-24T09:45:49.080719Z  INFO garage_api::generic_server: Response: error 400 Bad Request, Bad request: signed header `content-length` is not present

Also, using awscli works:

$ aws s3api head-object --bucket pragmata-dev --key Ubooquity-2.1.2.zip

This is what I get in the garage logs:

{
    "accept-encoding": "identity",
    "authorization": "AWS4-HMAC-SHA256 Credential=XXXXX/20240624/garage/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=XXXXX",
    "host": "XXXXX",
    "user-agent": "aws-cli/2.9.19 Python/3.11.2 Linux/6.7.4-overbring source/x86_64.debian.12 prompt/off command/s3api.head-object",
    "x-amz-content-sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "x-amz-date": "20240624T095532Z",
    "x-forwarded-for": "XXXXX",
    "x-forwarded-host": "s3.XXXXX.com",
    "x-forwarded-proto": "https"
}

Here, content-length is entirely missing.
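To compare dumps like this one more systematically, a small diagnostic can list every name from SignedHeaders that does not show up among the received headers. This is only a sketch that mirrors the wording of the error message; it is not taken from Garage's source, and `missing_signed_headers` is a made-up helper name. It also recomputes the SHA-256 of an empty payload, which is the x-amz-content-sha256 value in the dump above.

```python
import hashlib
import re

# SHA-256 of an empty body, as sent in x-amz-content-sha256 for bodiless requests.
EMPTY_PAYLOAD_SHA256 = hashlib.sha256(b"").hexdigest()


def missing_signed_headers(headers: dict[str, str]) -> list[str]:
    """Return the names listed in SignedHeaders that are absent from the request."""
    normalized = {name.lower(): value for name, value in headers.items()}
    match = re.search(r"SignedHeaders=([^,\s]+)", normalized.get("authorization", ""))
    if match is None:
        raise ValueError("authorization header has no SignedHeaders component")
    return [name for name in match.group(1).split(";") if name not in normalized]
```

For the aws-cli request above the result is an empty list. For a request signed with content-length that arrives without that header, it is `["content-length"]`.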

Edit: I fixed it by overriding ExAws's code for head_object so that it does not add the content-length header and therefore does not include it in the SignedHeaders list of the authorization header.

Can we conclude from this that the spec is that content-length should not be included in the headers for HeadObject?

[Related PR on `ex_aws` on Github](https://github.com/ex-aws/ex_aws/pull/1067)