The AWS DataSync performance trick that I didn’t know

My current project is a datacenter migration where we’re moving Windows workloads to AWS. One item we had been struggling with was two large file share servers. The shares are multiple terabytes in size, with lots of small files, documents, and subfolders. The challenge was DataSync performance on several of the large file shares. One share in particular is around 5 TB (yikes!). Just scanning that share’s metadata was taking over 48 hours. With a cutover to plan for, that seemed impossible to deal with.

THE SOLUTION

We ended up testing the sync location with two DataSync agents instead of one to speed up the scan/copy process, and the job’s scan time dropped to under 60 minutes 😎.

Using more than one agent per location is noted in the DataSync documentation. From the doc:
“For most workloads, we recommend that you use one AWS DataSync agent for each self-managed location. However, there are exceptions:

  • Some workloads have tens of millions of small files. In these cases, we recommend up to four agents for each location.
  • If your network has limited bandwidth (for example, an agent is on a network link with less than 2.5 Gbps), we recommend four agents for each location.”
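
For anyone who scripts their DataSync setup, here is a minimal sketch of how a two-agent location might be defined with boto3. It assumes the source is an SMB location (we are moving Windows shares, but your location type may differ), and the helper names, hostname, and ARNs are hypothetical placeholders. The four-agent cap in the check is my reading of the guidance quoted above, not a value pulled from our environment.

```python
from typing import Any, Dict, List


def build_smb_location_params(
    server_hostname: str,
    subdirectory: str,
    user: str,
    password: str,
    agent_arns: List[str],
) -> Dict[str, Any]:
    """Build keyword arguments for DataSync's CreateLocationSmb call.

    Hypothetical helper: it only assembles and validates the request,
    so it can be checked without AWS credentials.
    """
    # Assumption: one to four agents per location, per the quoted guidance.
    if not 1 <= len(agent_arns) <= 4:
        raise ValueError("expected between one and four agent ARNs")
    if len(set(agent_arns)) != len(agent_arns):
        raise ValueError("agent ARNs must be unique")
    return {
        "ServerHostname": server_hostname,
        "Subdirectory": subdirectory,
        "User": user,
        "Password": password,
        "AgentArns": list(agent_arns),
    }


def create_smb_location(params: Dict[str, Any]) -> str:
    """Create the location and return its ARN (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper above works without it

    client = boto3.client("datasync")
    response = client.create_location_smb(**params)
    return response["LocationArn"]
```

Passing two agent ARNs in `agent_arns` is the scripted equivalent of what we tested. Only `create_smb_location` talks to AWS, so the parameter-building step can be reviewed or unit tested on its own first.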

We are planning to add two more agents, for a total of four, to hopefully make the metadata scan even faster.
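
Adding agents to an existing location could be scripted along these lines. This is a rough sketch, again assuming an SMB location; the function names are hypothetical, and the default limit of four reflects the documentation quote above rather than anything we have measured.

```python
from typing import List


def merged_agent_arns(
    current: List[str], additional: List[str], limit: int = 4
) -> List[str]:
    """Return the current agents plus any new ones, preserving order."""
    merged = list(current)
    for arn in additional:
        if arn not in merged:  # skip agents that are already attached
            merged.append(arn)
    if len(merged) > limit:
        raise ValueError(f"a location would end up with more than {limit} agents")
    return merged


def add_agents_to_smb_location(location_arn: str, additional: List[str]) -> List[str]:
    """Attach extra agents to an SMB location (needs boto3 and credentials)."""
    import boto3  # imported lazily so merged_agent_arns works without it

    client = boto3.client("datasync")
    # Read the agents already attached, then send back the combined list.
    current = client.describe_location_smb(LocationArn=location_arn)["AgentArns"]
    merged = merged_agent_arns(current, additional)
    client.update_location_smb(LocationArn=location_arn, AgentArns=merged)
    return merged
```

The update call replaces the agent list rather than appending to it, which is why the sketch reads the existing agents first and sends the merged list.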

Happy building,

D
