Slow File Sprays with V7.6 versus V6.4

Environment

LnP Infra : http://10.144.162.8:8010

Description

We are testing file transfers when upgrading from V6 to V7.  We are seeing an extreme slowness when the file transfer is first initiated. It is taking an long time for the system to recognize the number of files it has transfer before the transfer starts.  The more files we transfer at once the longer the transfer takes.  Who can help me in addressing this issues?   Should I put in a Jira?

 

For the test our Linux Source IP 10.172.78.79 was running DAFILSRV from build hpccsystems-platform-community_6.4.40-1.el7.x86_64.rpm.

The box has the same connectivity to both our Bocca DEV and Boca Infra environments, which are actually running on the same hardware in Boca.

The data resides on our file system and contains 263,753 files under 834 directories. (This has been are normal transfer situation for 9 years)

 

You can see from below it is taking 85 times longer to determine the number of documents to transfer from V6 to V7. Once the transfer starts the data rate is the same.

 

INFRA (http://10.144.162.8:8010/) (internal_7.6.22-rc1)

D20200211-073656

~343 minutes to read the 263,753 individual files amongst 834 directories before starting the data transfer.

~34 minutes to transfer the data onto the INFRA 50 way THOR.

Total time taken 6:16:13, Average transfer 1174KB/sec

 

 

DEV (http://10.144.162.1:8010 ) (internal_6.4.26-1)

D20200210-094424

~4 minutes to read the 263,753 individual files amongst 834 directories before starting the data transfer.

~34 minutes to transfer the data onto the DEV 50 way THOR.

Total time taken 38m 29s, Average transfer 1152KB/sec

 

Our Logs from the two transfers

 

DEV LOG

09:44:24.375(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) FileSpray WUID=D20200210-094424

09:44:24.471(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=0 Time Spent=

09:44:54.696(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=0 Time Spent=

09:45:24.875(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=0 Time Spent=

09:45:55.049(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=0 Time Spent=

09:46:26.594(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=0 Time Spent=

09:46:58.081(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=0 Time Spent=

09:47:28.250(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=0 Time Spent=

09:47:59.763(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=0 Time Spent=

09:48:31.285(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=1 Time Spent=

..

10:21:29.029(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=96 Time Spent=

10:21:59.606(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=97 Time Spent=

10:22:29.772(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=started Percent complete=99 Time Spent=

10:22:59.934(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryFS WUID=D20200210-094424 state=finished Percent complete=100 Time Spent=38m 29s

10:23:00.012(02/10) INFO  TRACE : (dvc44403|8343|main|mncr_filesprayer_large_dpf_dev_001) queryAndWaitFS WU=D20200210-094424 spray was 100% complete

 

 

INFRA LOG

07:36:56.245(02/11) INFO  TRACE : (dvc44403|26482|main|mncr_filesprayer_large_infra_dpf_dev_001) FileSpray WUID=D20200211-073656

07:36:56.339(02/11) INFO  TRACE : (dvc44403|26482|main|mncr_filesprayer_large_infra_dpf_dev_001) queryFS WUID=D20200211-073656 state=started Percent complete=0 Time Spent=

07:37:26.594(02/11) INFO  TRACE : (dvc44403|26482|main|mncr_filesprayer_large_infra_dpf_dev_001) queryFS WUID=D20200211-073656 state=started Percent complete=0 Time Spent=..

..

13:19:27.337(02/11) INFO  TRACE : (dvc44403|26482|main|mncr_filesprayer_large_infra_dpf_dev_001) queryFS WUID=D20200211-073656 state=started Percent complete=0 Time Spent=

13:19:57.507(02/11) INFO  TRACE : (dvc44403|26482|main|mncr_filesprayer_large_infra_dpf_dev_001) queryFS WUID=D20200211-073656 state=started Percent complete=2 Time Spent=

..

13:53:08.572(02/11) INFO  TRACE : (dvc44403|26482|main|mncr_filesprayer_large_infra_dpf_dev_001) queryFS WUID=D20200211-073656 state=started Percent complete=99 Time Spent=

13:53:38.739(02/11) INFO  TRACE : (dvc44403|26482|main|mncr_filesprayer_large_infra_dpf_dev_001) queryFS WUID=D20200211-073656 state=finished Percent complete=100 Time Spent=6:16:13

13:53:38.817(02/11) INFO  TRACE : (dvc44403|26482|main|mncr_filesprayer_large_infra_dpf_dev_001) queryAndWaitFS WU=D20200211-073656 spray was 100% complete

 

 

Conclusion

None

Activity

Show:

Jacob Cobbett-Smith February 24, 2020 at 2:55 PM

can you take a look at this?

Jacob Cobbett-Smith February 24, 2020 at 2:54 PM

it looks like the changes in https://hpccsystems.atlassian.net/browse/HPCC-20530 / https://github.com/hpcc-systems/HPCC-Platform/commit/cd8d1c8d378cdda83562189f5d06e7b05a074d7f may have caused this slowdown (introduced in 7.0.0).

I think previously the individual file sizes were captured via the initial wild card directory capture, so were in effect free, whereas after this change, there is a remote call for each individual file to fetch the size (and possibly the timestamp too).

Fixed
Pinned fields
Click on the next to a field label to start pinning.

Details

Components

Assignee

Reporter

Priority

Fix versions

Affects versions

Created February 24, 2020 at 2:41 PM
Updated March 11, 2020 at 11:02 AM
Resolved March 11, 2020 at 11:02 AM