DEV Community

Tool for fast deletion and emptying of S3 buckets (versioning supported)

Kenta Goto on June 28, 2023

I have released an OSS tool that solves the following problems. The button to empty an S3 bucket is present in the S3 console but not in the CLI ...
Martin Muller 🇩🇪🇧🇷🇵🇹 (@mmuller88), AWS Community Builders

Nice. Thank you so much :)!

karlforster

This tool is amazing. Thanks so much.
Regarding the concurrency number, is there any indication of how high we can go before AWS starts rejecting requests?

I am currently running with a concurrency of 5 to remove 28 million items from a bucket,

but I have another bucket with 50 million items and 1.8 TB of data.

Kenta Goto (@k_goto), AWS Community Builders

Thank you, @karlforster!

Actually, the concurrency number in cls3 controls how many buckets are deleted in parallel. It doesn't control parallelism for the objects within a single bucket.

That's because cls3 deletes objects using the following loop, which alternates between a synchronous List and an asynchronous Delete:

  • 1000-object ListObjectVersions (sync) → 1000-object DeleteObjects (async, immediately moves on) → next 1000-object List (sync) → next 1000-object DeleteObjects (async) → ...

You can't delete an object without listing it first, so this design gives the best performance. Both List and DeleteObjects can handle at most 1000 items per call, and List has to fetch the current page before it can move on to the next one. So within a single bucket, each delete is fired asynchronously and runs in the background of the next List, rather than deletes being launched in parallel.
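The loop described above can be sketched roughly as follows. This is only an illustration in Python, not cls3's actual code: `FakeBucket`, `list_page`, `delete_batch`, and `empty_bucket` are hypothetical names, and the in-memory bucket stands in for the real ListObjectVersions and DeleteObjects calls.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 1000  # List and DeleteObjects both handle at most 1000 items per call


class FakeBucket:
    """Hypothetical in-memory stand-in for an S3 bucket (illustration only)."""

    def __init__(self, keys):
        keys = list(keys)
        self._listing = sorted(keys)   # stable order used for pagination
        self.remaining = set(keys)     # keys that have not been deleted yet
        self._lock = threading.Lock()

    def list_page(self, start):
        """Return one page of keys and the next start index (None on the last page)."""
        page = self._listing[start:start + PAGE_SIZE]
        next_start = start + len(page)
        return page, (next_start if next_start < len(self._listing) else None)

    def delete_batch(self, keys):
        """Delete one batch of keys and return how many were requested."""
        with self._lock:
            self.remaining.difference_update(keys)
        return len(keys)


def empty_bucket(bucket):
    """List synchronously, fire each delete in the background, then list again."""
    pending = []
    # The default pool size is an artifact of this sketch, not a cls3 setting.
    with ThreadPoolExecutor() as pool:
        start = 0
        while start is not None:
            page, start = bucket.list_page(start)  # sync: must finish before the next page
            if page:
                # async: submit the delete and move straight on to the next List
                pending.append(pool.submit(bucket.delete_batch, page))
    return sum(f.result() for f in pending)


if __name__ == "__main__":
    bucket = FakeBucket(f"obj-{i}" for i in range(2500))
    print(empty_bucket(bucket), len(bucket.remaining))  # prints "2500 0"
```

The point of the sketch is that the List call is the only step the loop waits on; the deletes overlap with it instead of with each other by design.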

NOTE: S3's DeleteObjects API is designed to throttle once you exceed roughly 3,500 deletions per second. However, each DeleteObjects call typically finishes while the next List is running, so cls3 doesn't put a cap on the number of in-flight async deletes.
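For a feel of the numbers in this thread, here is a rough back-of-the-envelope calculation in Python. It assumes every object version counts as one deletion toward the roughly 3,500 per second figure and treats that figure as a hard ceiling, which is a simplification; it also ignores the time spent in List calls, so the durations are lower bounds rather than predictions. The function name `estimate` is made up for this example.

```python
import math

PAGE_SIZE = 1000         # max items per List / DeleteObjects call
THROTTLE_PER_SEC = 3500  # approximate deletions per second quoted in the note above


def estimate(objects):
    """Return (list_calls, delete_calls, min_seconds) for `objects` object versions."""
    calls = math.ceil(objects / PAGE_SIZE)
    return calls, calls, objects / THROTTLE_PER_SEC


for n in (28_000_000, 50_000_000):
    lists, deletes, secs = estimate(n)
    print(f"{n:,} objects: {lists:,} List calls, {deletes:,} DeleteObjects calls, "
          f"at least {secs / 3600:.1f} h at the quoted rate")
```

Under these assumptions, 28 million items need 28,000 calls of each kind and at least about 2.2 hours, and 50 million items need 50,000 calls of each kind and at least about 4.0 hours.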

Rahman Badru (@rahmantheman)

love this

Kenta Goto (@k_goto), AWS Community Builders

Thank you! So happy!

Andrew (@andrew_61cd5f1f140a)

Would it be possible to use this with custom endpoint URLs?

Kenta Goto (@k_goto), AWS Community Builders

@andrew_61cd5f1f140a
Hi, custom endpoint URLs are supported as of v0.28.0. You can set one with the -e|--endpointUrl option or with the CLS3_ENDPOINT_URL environment variable.

github.com/go-to-k/cls3/releases/t...

Kenta Goto (@k_goto), AWS Community Builders

Awesome. I will try to do that in the near future!

github.com/go-to-k/cls3/issues/363

Kenta Goto (@k_goto), AWS Community Builders

@mmuller88
Thank you too! Please use it!