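Looping over files in a directory is a basic ETL task. This tutorial introduces the Python syntax for it, using the os module. If you downloaded the code from GitHub, the repository includes small sample files to work with. The examples look for them in the data/FileLoopExample folder under the directory you run the script from.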
In later lessons, you will see how it is done with live files.
Example #1: Loop Over Everything In Folder
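import os
# Build the sample folder path relative to the current working directory.
script_dir = os.getcwd()
data_directory = 'data'
example_directory = 'FileLoopExample'
path = os.path.join(script_dir, data_directory, example_directory)
# os.listdir returns every entry in the folder, subfolders included.
for filename in os.listdir(path):
    print(filename)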
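Because os.listdir returns subfolder names along with file names, you may want to skip the folders. The following is a minimal sketch of one way to do that with os.path.isfile. The helper name list_files is hypothetical and is not part of the tutorial's repository.

```python
import os


def list_files(path):
    """Return the sorted names of regular files directly inside path."""
    return sorted(
        name
        for name in os.listdir(path)
        # os.listdir gives bare names, so rebuild the full path before testing it.
        if os.path.isfile(os.path.join(path, name))
    )


if __name__ == "__main__":
    # Assumes the same sample folder layout as the examples in this tutorial.
    path = os.path.join(os.getcwd(), 'data', 'FileLoopExample')
    if os.path.isdir(path):
        for filename in list_files(path):
            print(filename)
```

Sorting is optional. It is included here because os.listdir returns entries in arbitrary order.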
Example #2: Loop Over Files With A Specific File Extension
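import os
# Build the sample folder path relative to the current working directory.
script_dir = os.getcwd()
data_directory = 'data'
example_directory = 'FileLoopExample'
path = os.path.join(script_dir, data_directory, example_directory)
for filename in os.listdir(path):
    # endswith is case-sensitive, so this matches '.csv' but not '.CSV'.
    if filename.endswith('.csv'):
        print(filename)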
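The endswith check is case-sensitive, so a file named REPORT.CSV would be skipped. If your source files come with inconsistent casing, or you need more than one extension, one option is to lower-case the name and pass a tuple to endswith. This is a sketch only. The helper name files_with_extension and its default argument are assumptions made for illustration.

```python
import os


def files_with_extension(path, extensions=('.csv',)):
    """Return sorted names in path whose extension matches, ignoring case."""
    # str.endswith accepts a tuple, so several extensions can be tested at once.
    wanted = tuple(ext.lower() for ext in extensions)
    return sorted(
        name for name in os.listdir(path)
        if name.lower().endswith(wanted)
    )


if __name__ == "__main__":
    # Assumes the same sample folder layout as the examples in this tutorial.
    path = os.path.join(os.getcwd(), 'data', 'FileLoopExample')
    if os.path.isdir(path):
        for filename in files_with_extension(path, ('.csv', '.txt')):
            print(filename)
```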
Example #3: Loop Over Files In Subdirectories Recursively
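import os
# Build the sample folder path relative to the current working directory.
script_dir = os.getcwd()
data_directory = 'data'
example_directory = 'FileLoopExample'
path = os.path.join(script_dir, data_directory, example_directory)
# os.walk visits path and every folder beneath it, one folder per iteration.
for subdir, dirs, files in os.walk(path):
    for filename in files:
        print(filename)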
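Example #3 prints only the bare file name, so two files with the same name in different subfolders look identical, and you cannot open either one without knowing its folder. A minimal sketch of how you might get full paths instead is to join each name with the subdir value that os.walk yields. The helper name walk_file_paths is hypothetical.

```python
import os


def walk_file_paths(path):
    """Yield the full path of every file under path, including subfolders."""
    for subdir, dirs, files in os.walk(path):
        for filename in files:
            # subdir is the folder currently being visited by os.walk.
            yield os.path.join(subdir, filename)


if __name__ == "__main__":
    # Assumes the same sample folder layout as the examples in this tutorial.
    path = os.path.join(os.getcwd(), 'data', 'FileLoopExample')
    if os.path.isdir(path):
        for file_path in walk_file_paths(path):
            print(file_path)
```

Writing the helper as a generator means paths are produced one at a time, which helps when the folder tree is large.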
Copyright © 2020, Mass Street Analytics, LLC. All Rights Reserved.