ALGORITHM December 27, 2021

Shuffle strategy for limited resources

Word count: 14k. Reading time: 12 mins.

In the early stages of a Machine Learning or Data Mining process, one of the problems we have to deal with is processing large files, including shuffling a corpus, whose size is usually larger than our limited resources such as memory or storage capacity. Say the file is 30GB while the available memory is 8GB or 16GB; we surely cannot load all of it to...

