File Operations

load_json(json_file, **kwargs)

Open and load data from a JSON file

reusables.load_json("example.json")
# {u'key_1': u'val_1', u'key_for_dict': {u'sub_dict_key': 8}}
Parameters:
  • json_file – Path to JSON file as string
  • kwargs – Additional arguments for the json.load command
Returns:

Dictionary

list_to_csv(my_list, csv_file)

Save a matrix (list of lists) to a file as a CSV

my_list = [["Name", "Location"],
           ["Chris", "South Pole"],
           ["Harry", "Depth of Winter"],
           ["Bob", "Skull"]]

reusables.list_to_csv(my_list, "example.csv")

example.csv

Name,Location
Chris,South Pole
Harry,Depth of Winter
Bob,Skull

Parameters:
  • my_list – list of lists to save to CSV
  • csv_file – File to save data to
save_json(data, json_file, indent=4, **kwargs)

Takes a dictionary and saves it to a file as JSON

my_dict = {"key_1": "val_1",
           "key_for_dict": {"sub_dict_key": 8}}

reusables.save_json(my_dict, "example.json")

example.json

{
    "key_1": "val_1",
    "key_for_dict": {
        "sub_dict_key": 8
    }
}
Parameters:
  • data – dictionary to save as JSON
  • json_file – Path to save file location as str
  • indent – Number of spaces to indent the JSON file with
  • kwargs – Additional arguments for the json.dump command
csv_to_list(csv_file)

Open and transform a CSV file into a matrix (list of lists).

reusables.csv_to_list("example.csv")
# [['Name', 'Location'],
#  ['Chris', 'South Pole'],
#  ['Harry', 'Depth of Winter'],
#  ['Bob', 'Skull']]
Parameters:
  • csv_file – Path to CSV file as str
Returns:

list

extract(archive_file, path='.', delete_on_success=False, enable_rar=False)

Automatically detect archive type and extract all files to specified path.

import os

os.listdir(".")
# ['test_structure.zip']

reusables.extract("test_structure.zip")

os.listdir(".")
# [ 'test_structure', 'test_structure.zip']
Parameters:
  • archive_file – path to file to extract
  • path – location to extract to
  • delete_on_success – Will delete the original archive if set to True
  • enable_rar – include the rarfile import and extract
Returns:

path to extracted files

archive(files_to_archive, name='archive.zip', archive_type=None, overwrite=False, store=False, depth=None, err_non_exist=True, allow_zip_64=True, **tarfile_kwargs)

Archive a list of files (or files inside a folder); you can choose between:

  • zip
  • tar
  • gz (tar.gz, tgz)
  • bz2 (tar.bz2)
reusables.archive(['reusables', '.travis.yml'],
                      name="my_archive.bz2")
# 'C:\Users\Me\Reusables\my_archive.bz2'
Parameters:
  • files_to_archive – list of files and folders to archive
  • name – path and name of archive file
  • archive_type – auto-detects unless specified
  • overwrite – overwrite if archive exists
  • store – zipfile only, True will not compress files
  • depth – specify max depth for folders
  • err_non_exist – raise error if provided file does not exist
  • allow_zip_64 – must be enabled for zip files larger than 2GB
  • tarfile_kwargs – extra args to pass to tarfile.open
Returns:

path to created archive

config_dict(config_file=None, auto_find=False, verify=True, **cfg_options)

Return configuration options as dictionary. Accepts either a single config file or a list of files. Auto find will search for all .cfg, .config and .ini in the execution directory and package root (unsafe but handy).

reusables.config_dict(os.path.join("test", "data", "test_config.ini"))
# {'General': {'example': 'A regular string'},
#  'Section 2': {'anint': '234',
#                'examplelist': '234,123,234,543',
#                'floatly': '4.4',
#                'my_bool': 'yes'}}
Parameters:
  • config_file – path or paths to the files location
  • auto_find – look for a config type file at this location or below
  • verify – make sure the file exists before trying to read
  • cfg_options – options to pass to the parser
Returns:

dictionary of the config files

config_namespace(config_file=None, auto_find=False, verify=True, **cfg_options)

Return configuration options as a Namespace.

reusables.config_namespace(os.path.join("test", "data",
                                        "test_config.ini"))
# <Namespace: {'General': {'example': 'A regul...>
Parameters:
  • config_file – path or paths to the files location
  • auto_find – look for a config type file at this location or below
  • verify – make sure the file exists before trying to read
  • cfg_options – options to pass to the parser
Returns:

Namespace of the config files

os_tree(directory, enable_scandir=False)

Return a directories contents as a dictionary hierarchy.

reusables.os_tree(".")
# {'doc': {'build': {'doctrees': {},
#                   'html': {'_sources': {}, '_static': {}}},
#         'source': {}},
#  'reusables': {'__pycache__': {}},
#  'test': {'__pycache__': {}, 'data': {}}}
Parameters:
  • directory – path to the directory to create the tree of
  • enable_scandir – on python < 3.5 enable external scandir package
Returns:

dictionary of the directory

check_filename(filename)

Returns a boolean stating whether the filename is safe to use. Note that this does not test against all legally accepted names, only a more restricted set of characters: letters, numbers, spaces, hyphens, underscores and periods.

Parameters:
  • filename – name of a file as a string
Returns:

boolean indicating whether it is a safe file name
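
A quick illustration of the rules above (results follow from the allowed character set, not captured output):

reusables.check_filename("my_report-2.txt")
# True

reusables.check_filename("bad/name*?.txt")
# False
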
count_files(*args, **kwargs)

Returns an integer count of all files found using find_files; all arguments are passed through to it.
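
Since arguments pass through to find_files, you can narrow the count the same way; for example, using the PDF files from the find_files example below (count is illustrative):

reusables.count_files(ext=".pdf")
# 3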

directory_duplicates(directory, hash_type='md5', **kwargs)

Find all duplicate files in a directory. Returns a list of lists, where each inner list contains the paths of identical files.

Parameters:
  • directory – Directory to search
  • hash_type – Type of hash to perform
  • kwargs – Arguments to pass to find_files to narrow file types
Returns:

list of lists of duplicate file paths
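
A sketch of the expected output shape, reusing the empty files from the dup_finder example below (paths are illustrative):

reusables.directory_duplicates("test/data/test_structure")
# [['test/data/test_structure/Files/empty_file_1',
#   'test/data/test_structure/Files/empty_file_2']]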

dup_finder(file_path, directory='.', enable_scandir=False)

Check a directory for duplicates of the specified file. This is meant for a single file only; to check a whole directory for duplicates, use directory_duplicates.

This is designed to be as fast as possible by doing lighter checks before progressing to more extensive ones; in order, they are:

  1. File size
  2. First twenty bytes
  3. Full SHA256 compare
list(reusables.dup_finder(
     r"test_structure\files_2\empty_file"))
# ['C:\Reusables\test\data\fake_dir',
#  'C:\Reusables\test\data\test_structure\Files\empty_file_1',
#  'C:\Reusables\test\data\test_structure\Files\empty_file_2',
#  'C:\Reusables\test\data\test_structure\files_2\empty_file']
Parameters:
  • file_path – Path to file to check for duplicates of
  • directory – Directory to dig recursively into to look for duplicates
  • enable_scandir – on python < 3.5 enable external scandir package
Returns:

generator of paths to duplicate files

file_hash(path, hash_type='md5', block_size=65536, hex_digest=True)

Hash a given file with md5, or any other available algorithm, and return the hex digest. You can run hashlib.algorithms_available to see which are available on your system (unless you have an archaic python version, you poor soul).

This function is designed to be memory efficient by reading the file in blocks.

reusables.file_hash("test_structure.zip")
# '61e387de305201a2c915a4f4277d6663'
Parameters:
  • path – location of the file to hash
  • hash_type – string name of the hash to use
  • block_size – amount of bytes to add to hasher at a time
  • hex_digest – returned as hexdigest, false will return digest
Returns:

file’s hash

find_files(directory='.', ext=None, name=None, match_case=False, disable_glob=False, depth=None, abspath=False, enable_scandir=False)

Walk through a file directory and return an iterator of files that match requirements. Automatically detects whether name contains glob magic characters.

Note: For the examples below, wrapping the call in list() is simply an easy way to show the output; you can use find_files_list to get a list directly.

list(reusables.find_files(name="ex", match_case=True))
# ['C:\example.pdf',
#  'C:\My_exam_score.txt']

list(reusables.find_files(name="*free*"))
# ['C:\my_stuff\Freedom_fight.pdf']

list(reusables.find_files(ext=".pdf"))
# ['C:\Example.pdf',
#  'C:\how_to_program.pdf',
#  'C:\Hunks_and_Chicks.pdf']

list(reusables.find_files(name="*chris*"))
# ['C:\Christmas_card.docx',
#  'C:\chris_stuff.zip']
Parameters:
  • directory – Top location to recursively search for matching files
  • ext – Extensions of the file you are looking for
  • name – Part of the file name
  • match_case – If name or ext must match case exactly
  • disable_glob – Do not treat name as a glob pattern or run the glob magic check
  • depth – How many directories down to search
  • abspath – Return files with their absolute paths
  • enable_scandir – on python < 3.5 enable external scandir package
Returns:

generator of all files in the specified directory

find_files_list(*args, **kwargs)

Returns the find_files generator as a list
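
It simply wraps the find_files generator in list(); reusing the find_files example above:

reusables.find_files_list(ext=".pdf")
# ['C:\Example.pdf',
#  'C:\how_to_program.pdf',
#  'C:\Hunks_and_Chicks.pdf']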

join_here(*paths, **kwargs)

Join any path or paths as a subdirectory of the current file’s directory.

reusables.join_here("Makefile")
# 'C:\Reusables\Makefile'
Parameters:
  • paths – paths to join together
  • kwargs – ‘strict’, do not strip os.sep
  • kwargs – ‘safe’, make them into a safe path if True
Returns:

abspath as string

join_paths(*paths, **kwargs)

Join multiple paths together and return their absolute path. If ‘safe’ is specified, this function will ‘clean’ the path with the ‘safe_path’ function. This will clean root declarations from the path after the first item.

Would like to use ‘safe=False’ instead of ‘**kwargs’, but stupider versions of python (cough, 2.6) don’t allow keyword arguments after ‘*paths’.

Parameters:
  • paths – paths to join together
  • kwargs – ‘safe’, make them into a safe path if True
Returns:

abspath as string
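
An illustrative *nix example of the root cleaning described above (paths are hypothetical):

reusables.join_paths("/home", "user/", "/reusables")
# '/home/user/reusables'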

remove_empty_directories(root_directory, dry_run=False, ignore_errors=True, enable_scandir=False)

Remove all empty folders from a path. Returns list of empty directories.

Parameters:
  • root_directory – base directory to start at
  • dry_run – just return a list of what would be removed
  • ignore_errors – Permissions are a pain; just ignore errors if you are blocked
  • enable_scandir – on python < 3.5 enable external scandir package
Returns:

list of removed directories
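
A dry run only reports what would be removed; the directory name here is hypothetical:

reusables.remove_empty_directories("test/data", dry_run=True)
# ['test/data/empty_dir']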

remove_empty_files(root_directory, dry_run=False, ignore_errors=True, enable_scandir=False)

Remove all empty files from a path. Returns list of the empty files removed.

Parameters:
  • root_directory – base directory to start at
  • dry_run – just return a list of what would be removed
  • ignore_errors – Permissions are a pain; just ignore errors if you are blocked
  • enable_scandir – on python < 3.5 enable external scandir package
Returns:

list of removed files
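
Again, a dry run only lists the candidates; these paths reuse the empty files from the dup_finder example above:

reusables.remove_empty_files("test/data/test_structure", dry_run=True)
# ['test/data/test_structure/Files/empty_file_1',
#  'test/data/test_structure/Files/empty_file_2']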

safe_filename(filename, replacement='_')

Replace unsafe filename characters with underscores. Note that this does not test for “legal” names accepted, but a more restricted set of: Letters, numbers, spaces, hyphens, underscores and periods.

Parameters:
  • filename – name of a file as a string
  • replacement – character to use as a replacement of bad characters
Returns:

safe filename string
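
For instance, applying the character rules above (the output follows from those rules):

reusables.safe_filename("my*file?.txt")
# 'my_file_.txt'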

safe_path(path, replacement='_')

Replace unsafe path characters with underscores. Do NOT use this with existing paths that cannot be modified; this is to help generate new, clean paths.

Supports windows and *nix systems.

Parameters:
  • path – path as a string
  • replacement – character to use in place of bad characters
Returns:

a safer path
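
An illustrative example following the same replacement rules (exact output may vary by platform):

reusables.safe_path("/home/user/bad*dir/f|ile")
# '/home/user/bad_dir/f_ile'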

touch(path)

Native ‘touch’ functionality in python

Parameters:
  • path – path to file to ‘touch’
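
For example, to create an empty file if one does not already exist (the file name is hypothetical):

import os

reusables.touch("new_file.txt")
os.path.exists("new_file.txt")
# True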