At a client we have a huge directory of files. I wanted to list the first few files, but ls -l | head took ages, as it first lists all the files and only then cuts the list down.
After my first attempts in Python failed, I wrote a Perl one-liner to list the first few elements of the huge directory. However, I wanted to see if I could do it with Python in some other way.
using iterdir of pathlib
My original attempt in Python used the iterdir method of pathlib.Path.
import pathlib

path = pathlib.Path("/home/gabor/work/code-maven.com/sites/en/pages/")
count = 0
for thing in path.iterdir():
    count += 1
    print(thing)
    # Stop after the first few entries
    if count > 3:
        break
On the real data it took 47 minutes to run.
using walk of os
The second attempt was to use the walk function of the os module.
import os

path = "/home/gabor/work/code-maven.com/sites/en/pages/"
count = 0
for dirname, dirs, files in os.walk(path):
    for filename in files:
        print(os.path.join(dirname, filename))
        count += 1
        # Stop after the first few files
        if count > 3:
            exit()
I don't know how long this would have taken; I stopped it after a minute.
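In hindsight this is not surprising: os.walk recursively descends into every subdirectory, and for each directory it builds complete lists of subdirectory and file names before yielding anything. Even taking just the first tuple skips the recursion but still reads the entire top-level directory into memory, so on a huge directory it is no faster. A minimal sketch, using the current directory as a stand-in for the real path:

```python
import os

# os.walk yields (dirname, dirs, files) tuples, one per directory.
# Taking only the first tuple avoids recursing into subdirectories,
# but the dirs and files lists for the top level are still built in
# full before anything is yielded.
dirname, dirs, files = next(os.walk("."))
print(dirname, len(dirs), len(files))
```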
using scandir of os
Finally I found the scandir function of the os module. That did the trick:
import os

path = "/home/gabor/work/code-maven.com/sites/en/pages/"
count = 0
with os.scandir(path) as it:
    for entry in it:
        print(entry.name)
        count += 1
        # Stop after the first few entries
        if count > 3:
            exit()