Multipart Uploads
Multipart uploads can be performed in one of two ways: with the raw request methods, or with the higher-level startMultipartUpload method. The low-level request methods require you to handle the uploadId of the upload and the ETags of the uploaded parts yourself. The higher-level method produces a Resource[F, MultipartUpload] which handles this bookkeeping for you.
Multipart Upload Requests
Multipart uploads can be performed with the raw requests to S3. These are useful when a large file must be uploaded and should be separated into smaller parts for upload. Note that each part must be at least 5 MB in size, with the exception of the final part.
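The part-size rule above can be sketched in plain Scala. This is a minimal illustration and not part of the library: PartSplitter, MinPartSize, and split are hypothetical names, and the 5 MB minimum is read here as 5 * 1024 * 1024 bytes.

```scala
object PartSplitter {
  // Assumption: the 5 MB minimum is taken as 5 * 1024 * 1024 bytes.
  val MinPartSize: Int = 5 * 1024 * 1024

  // Splits `data` into consecutive parts of `partSize` bytes, in upload order.
  // Every part has exactly `partSize` bytes except possibly the last one,
  // which holds whatever remains and may therefore be smaller.
  def split(data: Array[Byte], partSize: Int = MinPartSize): Vector[Array[Byte]] = {
    require(partSize >= MinPartSize, s"partSize must be at least $MinPartSize bytes")
    data.grouped(partSize).toVector
  }
}
```

Each element of the resulting Vector could then be sent as one part, in order.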
All methods described here support additional optional headers that can be included in the request. Please refer to the S3 Documentation for more information on what headers can be applied to different methods.
A multipart upload begins with the createMultipartUpload method. This method produces the uploadId which is needed by other requests.
s3.createMultipartUpload("hello-world-bucket-example", "mp-file-example").flatMap { response =>
  IO { println(s"UploadID: ${response.uploadId}") }
}
// res0: IO[Unit] = IO$551448852
If the multipart upload needs to be cancelled for any reason, this can be done with the abortMultipartUpload method.
s3.abortMultipartUpload(
  "hello-world-bucket-example",
  "mp-file-example",
  uploadId = "943465sdf54sdf654sd321fdf")
// res1: IO[org.http4s.Headers] = IO$766173839
The listMultipartUploads method can be used to see ongoing uploads within a bucket. This also obtains the uploadId for each upload.
def printMP(upload: Uploads): IO[Unit] = IO { println(s"UploadID: ${upload.uploadId}") }

s3.listMultipartUploads("hello-world-bucket-example").flatMap { response =>
  response.uploads.get.traverse(printMP _)
}
// res2: IO[Vector[Unit]] = IO$1347867045
A list of the parts uploaded so far in a multipart upload can be obtained using the listParts method. This returns each part's ETag, which is necessary for completing the upload.
def printPart(part: Parts): IO[Unit] = IO { println(s"Part ETag: ${part.eTag}") }

s3.listParts(
  "hello-world-bucket-example",
  "mp-file-example",
  uploadId = "943465sdf54sdf654sd321fdf")
  .flatMap { response => response.parts.traverse(printPart _) }
// res3: IO[Vector[Unit]] = IO$1895583351
To upload a part, use the uploadPart method. This requires an EntityEncoder to encode the contents of the chunk. It also requires a part number, which is needed to reassemble the parts into the completed file. In S3, part numbers range from 1 to 10,000. The method returns the ETag of the part, which is also necessary for completing the upload.
s3.uploadPart(
  "hello-world-bucket-example",
  "mp-file-example",
  1,
  "943465sdf54sdf654sd321fdf",
  "content-to-be-encoded")
  .flatMap { response => IO { println(s"Part ETag: ${response.eTag}") } }
// res4: IO[Unit] = IO$595187629
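Because part numbers stop at 10,000, the part size has to grow with the file. The following is a rough sketch of that arithmetic in plain Scala. PartPlanning and its members are hypothetical names and not part of the library, and the 5 MB minimum is read here as 5 * 1024 * 1024 bytes.

```scala
object PartPlanning {
  // Assumption: the 5 MB minimum is taken as 5 * 1024 * 1024 bytes.
  val MinPartSize: Long = 5L * 1024 * 1024
  // Part numbers run from 1 to 10,000, so an upload has at most 10,000 parts.
  val MaxParts: Long = 10000L

  // Smallest part size that respects the minimum and keeps the part count within MaxParts.
  def partSizeFor(totalBytes: Long): Long = {
    require(totalBytes >= 0, "totalBytes must not be negative")
    val neededForPartLimit = (totalBytes + MaxParts - 1) / MaxParts // ceiling division
    math.max(MinPartSize, neededForPartLimit)
  }

  // The 1-based part numbers needed to upload totalBytes in parts of partSize bytes.
  def partNumbers(totalBytes: Long, partSize: Long): Range = {
    require(totalBytes >= 0, "totalBytes must not be negative")
    require(partSize >= MinPartSize, s"partSize must be at least $MinPartSize bytes")
    val count = (totalBytes + partSize - 1) / partSize // ceiling division
    require(count <= MaxParts, s"$count parts exceed the limit of $MaxParts")
    1 to count.toInt
  }
}
```

For example, a 12 MB file split into 5 MB parts needs part numbers 1, 2, and 3, with the last part carrying the remaining 2 MB.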
Once all parts are uploaded, the upload can be completed with the completeMultipartUpload method. This method requires a list of the ETags of all parts, in order of assembly.
s3.completeMultipartUpload(
  "hello-world-bucket-example",
  "mp-file-example",
  uploadId = "943465sdf54sdf654sd321fdf",
  parts = List("9320f0j32f0j23f0j382jf", "9320f0j32f0mg59khf32jf"))
// res5: IO[CompleteMultipartUploadResponse] = IO$1497233367
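The individual requests above could be chained into a single flow along the following lines. This is only a sketch under stated assumptions: MultipartClient is a hypothetical, simplified stand-in for the client used in this section (its methods return a plain String or Unit rather than the real response types), uploadAll is a hypothetical helper, and cats-effect is assumed to be on the classpath.

```scala
import cats.effect.IO
import cats.implicits._

object RawMultipartFlow {
  // Hypothetical stand-in that models only the calls shown in this section.
  // Result types are simplified: the uploadId and each ETag are plain Strings.
  trait MultipartClient {
    def createMultipartUpload(bucket: String, key: String): IO[String]
    def uploadPart(bucket: String, key: String, partNumber: Int,
                   uploadId: String, content: String): IO[String]
    def completeMultipartUpload(bucket: String, key: String,
                                uploadId: String, parts: List[String]): IO[Unit]
    def abortMultipartUpload(bucket: String, key: String, uploadId: String): IO[Unit]
  }

  // Creates the upload, sends the chunks one after another as parts 1, 2, 3, ...,
  // then completes the upload with the ETags in that same order.
  // If any step fails, the upload is aborted and the original error is re-raised.
  def uploadAll(s3: MultipartClient, bucket: String, key: String,
                chunks: List[String]): IO[Unit] =
    s3.createMultipartUpload(bucket, key).flatMap { uploadId =>
      val upload: IO[Unit] =
        chunks.zipWithIndex
          .traverse { case (chunk, index) =>
            s3.uploadPart(bucket, key, index + 1, uploadId, chunk)
          }
          .flatMap(eTags => s3.completeMultipartUpload(bucket, key, uploadId, eTags))

      upload.handleErrorWith { error =>
        s3.abortMultipartUpload(bucket, key, uploadId) *> IO.raiseError[Unit](error)
      }
    }
}
```

Note that this sketch reacts to errors only. It does not abort the upload when the surrounding IO is cancelled.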
Higher-Level Multipart Upload Method
To use the higher-level multipart upload, first invoke the startMultipartUpload method. This returns a Resource[F, MultipartUpload].
This Resource includes the sendPart method, which is used to send a file chunk. The Resource automatically keeps track of the returned ETags and generates a part number for each part that is uploaded. When the Resource is released, the multipart upload is completed automatically. Parts must be sent in the order in which they should be reassembled. If the Resource is cancelled or an error occurs, the multipart upload is aborted.
import org.http4s.EntityEncoder

// For docs purposes, send a single part t.
def upload[T](t: T)(implicit enc: EntityEncoder[IO, T]) =
  s3.startMultipartUpload("bucket", "key").use { mp => mp.sendPart(t) }