Support for PostObject #222

Merged
lx merged 14 commits from trinity-1686a/garage:post-object into main 2022-02-21 22:02:31 +00:00

Add support for PostObject

Todo/done:

  • routing PostObject properly
  • parsing multipart body
  • validating signature
  • validating policy
  • validating content length
  • actually saving data
Add support for [PostObject](https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPOST.html) Todo/done: - [x] routing PostObject properly - [x] parsing multipart body - [x] validating signature - [x] validating policy - [x] validating content length - [x] actually saving data
trinity-1686a force-pushed post-object from 569f7496ae to 73d234c893 2022-02-05 12:48:10 +00:00 Compare
trinity-1686a force-pushed post-object from 30e8e885a5 to 6a42cd027e 2022-02-05 12:52:49 +00:00 Compare
trinity-1686a force-pushed post-object from 6a42cd027e to 853e18bd44 2022-02-05 13:01:00 +00:00 Compare
trinity-1686a force-pushed post-object from 15fdbb48e2 to 11bb514b34 2022-02-05 23:13:35 +00:00 Compare
trinity-1686a reviewed 2022-02-07 18:51:55 +00:00
@ -0,0 +217,4 @@
.status(StatusCode::OK)
.body(Body::empty())?,
"201" => {
// TODO body should be an XML document, not sure which yet
Author
Owner

the actual response should be

<PostResponse>
    <Location>https://bucketname.garage.tld/key</Location>
    <Bucket>bucketname</Bucket>
    <Key>key</Key>
    <ETag>"0123456789abcdef0123456789abcdef"</ETag>
</PostResponse>

with corresponding etag and location http headers (these headers are also here for 200 and 204, but not the body)

When using success_action_redirect, etag is set as usual, and location is set to ${success_action_redirect}?bucket=bucketname&key=key&etag=%220123456789abcdef0123456789abcdef%22

the actual response should be ```xml <PostResponse> <Location>https://bucketname.garage.tld/key</Location> <Bucket>bucketname</Bucket> <Key>key</Key> <ETag>"0123456789abcdef0123456789abcdef"</ETag> </PostResponse> ``` with corresponding `etag` and `location` http headers (these headers are also here for 200 and 204, but not the body) When using `success_action_redirect`, `etag` is set as usual, and location is set to `${success_action_redirect}?bucket=bucketname&key=key&etag=%220123456789abcdef0123456789abcdef%22`
trinity-1686a force-pushed post-object from 4f52708111 to f551062456 2022-02-07 18:55:11 +00:00 Compare
trinity-1686a changed title from WIP: Support for PostObject to Support for PostObject 2022-02-07 18:55:25 +00:00
lx reviewed 2022-02-08 22:11:04 +00:00
@ -0,0 +43,4 @@
let mut multipart = Multipart::with_constraints(body, boundary, constraints);
let mut params = HeaderMap::new();
while let Some(field) = multipart.next_field().await? {
Owner

I think we can avoid putting almost the entire code of this function in the while loop (and remove 1 indentation level almost everywhere) by doing something like this:

let file = loop {
    let field = match multipart.next_field().await? {
        None => return Err(no file field found),
        Some(x) => x,
    };
    let name = ...;
    if name == "file" {
         break field;
    }
    // here handle header field adding it to the headermap
};
// here do all of the rest of the processing once we have all headers and are now reading the file body

This looks much nicer to me, especially as in the current version we have a for inside the while, which looks a bit like a nested loop but is in fact not at all.

I think we can avoid putting almost the entire code of this function in the `while` loop (and remove 1 indentation level almost everywhere) by doing something like this: ```rust let file = loop { let field = match multipart.next_field().await? { None => return Err(no file field found), Some(x) => x, }; let name = ...; if name == "file" { break field; } // here handle header field adding it to the headermap }; // here do all of the rest of the processing once we have all headers and are now reading the file body ``` This looks much nicer to me, especially as in the current version we have a `for` inside the `while`, which looks a bit like a nested loop but is in fact not at all.
trinity-1686a marked this conversation as resolved
@ -0,0 +235,4 @@
.and_then(|h| h.to_str().ok())
.unwrap_or("204");
let builder = Response::builder()
.status(StatusCode::OK)
Owner

Looks like we don't need this .status() as we are calling it in all branches below

Looks like we don't need this `.status()` as we are calling it in all branches below
trinity-1686a marked this conversation as resolved
trinity-1686a force-pushed post-object from e5d1856f17 to 46c769ca00 2022-02-11 23:22:39 +00:00 Compare
trinity-1686a force-pushed post-object from f1ebcad654 to 16cd481d1a 2022-02-12 13:14:23 +00:00 Compare
lx requested changes 2022-02-14 16:53:42 +00:00
lx left a comment
Owner

I feel like there are a few things concerning policy validation that require clarification.

For the unit test I will write it and put it in the comment so that you can just paste it in your code.

I feel like there are a few things concerning policy validation that require clarification. For the unit test I will write it and put it in the comment so that you can just paste it in your code.
@ -0,0 +111,4 @@
if let Some(filename) = field.file_name() {
key.replace("${filename}", filename)
} else {
key.to_owned()
Owner

This looks like it should rather be an error case: if the application builder put ${filename} in the key field but the browser for some reason didn't include the name of the uploaded file, we probably want to reject the request.

This looks like it should rather be an error case: if the application builder put `${filename}` in the key field but the browser for some reason didn't include the name of the uploaded file, we probably want to reject the request.
Author
Owner

that was actually how a previous iteration worked. This behavior was added because it's exactly how AWS behave (not that I can say I agree with that behavior)

that was actually how a previous iteration worked. This behavior was added because it's exactly how AWS behave (not that I can say I agree with that behavior)
Owner

:/

:/
trinity-1686a marked this conversation as resolved
@ -0,0 +141,4 @@
));
}
let mut conditions = decoded_policy.into_conditions()?;
Owner

I feel like we should add unit tests for the policy decoding logic (not just for into_conditions but end-to-end starting with JSON)

I feel like we should add unit tests for the policy decoding logic (not just for into_conditions but end-to-end starting with JSON)
Owner

I'll write a test and post it in the comments of the PR so that you can copy and paste

I'll write a test and post it in the comments of the PR so that you can copy and paste
lx marked this conversation as resolved
@ -0,0 +167,4 @@
"key" => {
let conds = conditions.params.remove("key").ok_or_else(|| {
Error::BadRequest(format!("Key '{}' is not allowed in policy", param_key))
})?;
Owner

Can you point me to the documentation section which says that there must be a policy specified for the key field?

Can you point me to the documentation section which says that there must be a policy specified for the `key` field?
Author
Owner

this is not specified, it's however the behavior of AWS. policy and x-amz-signature are the only two fields I found to not be required in the policy. Even x-amz-credential, which is definitelly required to make a valid v4 signature, must be allowed in policy
this is in fact specified somewhere

Each form field that you specify in a form (except x-amz-signature, file, policy, and field names that have an x-ignore- prefix) must appear in the list of conditions.

Which means I have to add some code to ignore x-ignore-*, others are already ignored

~~this is not specified, it's however the behavior of AWS. `policy` and `x-amz-signature` are the only two fields I found to not be required in the policy. Even `x-amz-credential`, which is definitelly required to make a valid v4 signature, must be allowed in policy~~ this is in fact [specified somewhere](https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-HTTPPOSTConstructPolicy.html#sigv4-PolicyConditions) > Each form field that you specify in a form (except x-amz-signature, file, policy, and field names that have an x-ignore- prefix) must appear in the list of conditions. Which means I have to add some code to ignore `x-ignore-*`, others are already ignored
lx marked this conversation as resolved
@ -0,0 +184,4 @@
_ => {
let conds = conditions.params.remove(&param_key).ok_or_else(|| {
Error::BadRequest(format!("Key '{}' is not allowed in policy", param_key))
})?;
Owner

Same here, can you point me to the doc which says a policy must be given for all fields?

Same here, can you point me to the doc which says a policy must be given for all fields?
Author
Owner

see comment on key

see comment on `key`
trinity-1686a marked this conversation as resolved
@ -0,0 +216,4 @@
.ok_or_bad_request("Invalid content type")?
.unwrap_or_else(|| HeaderValue::from_static("blob"));
params.append(header::CONTENT_TYPE, content_type);
Owner

This definitely looks like it should have been done before the policy check

This definitely looks like it should have been done before the policy check
Author
Owner

turns out AWS ignore this, and only consider content type set in what I called param, not in the field metadata

turns out AWS ignore this, and only consider content type set in what I called param, not in the field metadata
lx marked this conversation as resolved
@ -0,0 +384,4 @@
struct Conditions {
params: HashMap<String, Vec<Operation>>,
content_type: Vec<Operation>,
Owner

Why do we need content_type to be set apart here? If it has different treatment I think that's applied at a later moment when we check the set of headers against the policy, here it could just be added to the map with the rest (the same that is done with key).

Why do we need content_type to be set apart here? If it has different treatment I think that's applied at a later moment when we check the set of headers against the policy, here it could just be added to the map with the rest (the same that is done with `key`).
lx marked this conversation as resolved
Owner

Here are some simple tests for policy decoding (they require adding #[derive(Debug)] to Operation):

#[cfg(test)]
mod tests {
	use super::*;

	#[test]
	fn test_policy_1() {
		let policy_json = br#"
{ "expiration": "2007-12-01T12:00:00.000Z",
  "conditions": [
    {"acl": "public-read" },
    {"bucket": "johnsmith" },
    ["starts-with", "$key", "user/eric/"]
  ]
}	
		"#;
		let policy_2: Policy = serde_json::from_slice(&policy_json[..]).unwrap();
		let mut conditions = policy_2.into_conditions().unwrap();

		assert_eq!(
			conditions.params.remove(&"acl".to_string()),
			Some(vec![Operation::Equal("public-read".into())])
		);
		assert_eq!(
			conditions.params.remove(&"bucket".to_string()),
			Some(vec![Operation::Equal("johnsmith".into())])
		);
		assert_eq!(
			conditions.params.remove(&"key".to_string()),
			Some(vec![Operation::StartsWith("user/eric/".into())])
		);
		assert!(conditions.params.is_empty());
		assert_eq!(conditions.content_length, 0..=u64::MAX);
	}


	#[test]
	fn test_policy_2() {
		let policy_json = br#"
{ "expiration": "2007-12-01T12:00:00.000Z",
  "conditions": [
	[ "eq", "$acl", "public-read" ],
	["starts-with", "$Content-Type", "image/"],
	["starts-with", "$success_action_redirect", ""],
	["content-length-range", 1048576, 10485760]
  ]
}	
		"#;
		let policy_2: Policy = serde_json::from_slice(&policy_json[..]).unwrap();
		let mut conditions = policy_2.into_conditions().unwrap();

		assert_eq!(
			conditions.params.remove(&"acl".to_string()),
			Some(vec![Operation::Equal("public-read".into())])
		);
		assert_eq!(
			conditions.content_type,  // TODO change this if content_type becomes a field like others
			vec![Operation::StartsWith("image/".into())]
		);
		assert_eq!(
			conditions.params.remove(&"success_action_redirect".to_string()),
			Some(vec![Operation::StartsWith("".into())])
		);
		assert!(conditions.params.is_empty());
		assert_eq!(conditions.content_length, 1048576..=10485760);
	}
}
Here are some simple tests for policy decoding (they require adding `#[derive(Debug)]` to `Operation`): ```rust #[cfg(test)] mod tests { use super::*; #[test] fn test_policy_1() { let policy_json = br#" { "expiration": "2007-12-01T12:00:00.000Z", "conditions": [ {"acl": "public-read" }, {"bucket": "johnsmith" }, ["starts-with", "$key", "user/eric/"] ] } "#; let policy_2: Policy = serde_json::from_slice(&policy_json[..]).unwrap(); let mut conditions = policy_2.into_conditions().unwrap(); assert_eq!( conditions.params.remove(&"acl".to_string()), Some(vec![Operation::Equal("public-read".into())]) ); assert_eq!( conditions.params.remove(&"bucket".to_string()), Some(vec![Operation::Equal("johnsmith".into())]) ); assert_eq!( conditions.params.remove(&"key".to_string()), Some(vec![Operation::StartsWith("user/eric/".into())]) ); assert!(conditions.params.is_empty()); assert_eq!(conditions.content_length, 0..=u64::MAX); } #[test] fn test_policy_2() { let policy_json = br#" { "expiration": "2007-12-01T12:00:00.000Z", "conditions": [ [ "eq", "$acl", "public-read" ], ["starts-with", "$Content-Type", "image/"], ["starts-with", "$success_action_redirect", ""], ["content-length-range", 1048576, 10485760] ] } "#; let policy_2: Policy = serde_json::from_slice(&policy_json[..]).unwrap(); let mut conditions = policy_2.into_conditions().unwrap(); assert_eq!( conditions.params.remove(&"acl".to_string()), Some(vec![Operation::Equal("public-read".into())]) ); assert_eq!( conditions.content_type, // TODO change this if content_type becomes a field like others vec![Operation::StartsWith("image/".into())] ); assert_eq!( conditions.params.remove(&"success_action_redirect".to_string()), Some(vec![Operation::StartsWith("".into())]) ); assert!(conditions.params.is_empty()); assert_eq!(conditions.content_length, 1048576..=10485760); } } ```
trinity-1686a force-pushed post-object from fe9cedfdf5 to a19079ba1a 2022-02-16 22:30:51 +00:00 Compare
trinity-1686a force-pushed post-object from a19079ba1a to 0fae509b29 2022-02-16 22:30:59 +00:00 Compare
trinity-1686a force-pushed post-object from 0fae509b29 to 3536903a58 2022-02-16 22:44:14 +00:00 Compare
trinity-1686a force-pushed post-object from 3536903a58 to cea91e7a7d 2022-02-16 22:50:11 +00:00 Compare
trinity-1686a force-pushed post-object from cea91e7a7d to 32864bb8f3 2022-02-16 22:52:37 +00:00 Compare
trinity-1686a requested review from lx 2022-02-16 23:12:06 +00:00
lx approved these changes 2022-02-17 08:07:31 +00:00
lx left a comment
Owner

LGTM

LGTM
lx merged commit f6f8b7f1ad into main 2022-02-21 22:02:31 +00:00
lx referenced this issue from a commit 2022-02-21 22:02:31 +00:00
Sign in to join this conversation.
No description provided.