Python code obfuscator
Sometimes we might want to know: what is the total size in bytes (or kilobytes or megabytes) of our actual source code in some project? For example, of all the .py files in a directory tree?
Many popular Q&A sites for programmers will suggest using the Unix tool du. Unfortunately, popular Q&A sites for programmers are stuffed with wrong answers that calculate incorrect numbers.
du counts file types other than your source code file type(s), whether or not you wanted to include those.
du counts storage space used by the file system, which usually includes the storage media space used by directory structures themselves, not just your source code.
du tends to round up during its calculations, because by default it reports disk usage in whole blocks.
du has several flags we can use to help change the calculation to what we really want.
The --apparent-size flag gives the length of your file’s contents, without considering the underlying storage’s block size. (The du man page has more information on special cases.)
The -b flag gives an answer in bytes.
The -c flag gives a grand total.
The -a flag is useful for not failing to count some files. (The man page might be unclear, but this flag can sometimes affect whether some subdirectory contents are left out.)
This gets us partway to the answer, letting us total the size of all files in a directory (sub)tree.
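To make the effect of these flags concrete, here is a minimal sketch that measures one small file twice. It assumes GNU du (from coreutils); the temporary directory and the file name tiny.py are made up for illustration.

```shell
#!/bin/bash
# Sketch only: assumes GNU du. The file name and contents are hypothetical.
tmpdir=$(mktemp -d)
printf 'x = 1\n' > "$tmpdir/tiny.py"   # exactly 6 bytes of content

# Default behaviour: disk usage reported in blocks, so the number depends
# on the file system and is usually not the length of the contents.
default_size=$(du "$tmpdir/tiny.py" | cut -f1)

# With --apparent-size and -b: the length of the file's contents in bytes.
apparent_size=$(du --apparent-size -b "$tmpdir/tiny.py" | cut -f1)

echo "default du output: $default_size"
echo "apparent size in bytes: $apparent_size"
rm -r "$tmpdir"
```

The second number should be 6, the length of the file's contents. The first number varies from one file system to another, which is the inaccuracy the flags are meant to remove.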
Below is a heavily commented script that completes our calculations by totalling the size of all contents, and then subtracting the size of all contents that are not the file type(s) that interest us.
#!/bin/bash
#
# Copyright 2016-2017 by Chris Niswander.
#
# Example script calculates the total size (in bytes) of the files
# having (a) specified type(s)/extension(s),
# in the directory tree specified by command line argument.
#
# See also: http://bitboost.com/python-obfuscator/articles_and_resources/how-to-calculate-the-total-size-length-of-your-code-or-codebase-in-bytes-fixing-inaccuracies-problems-in-du--an-easy-way
#
# The --exclude option can be used multiple times in the same command line
# to exclude multiple file types.
# e.g. $ du --apparent-size -ac -b delme --exclude=*.txt --exclude=*.ps
#
# These 'excluded' file types
# will be the only types we actually count the size of,
# when we run du twice and subtract one result from the other.
du_flags_excluding=" --exclude=*.py "
#
echo "measuring size of code in directory structure:" "$1"
echo "file type(s)/extension(s) measured are those 'excluded' from 2nd measurement:" $du_flags_excluding
# Default to the current directory (du's own default) if no argument is given.
dirpath="${1:-.}"
du_flags=" --apparent-size -b -ca "
#
# Total the bytes for all files and directories in argument directory.
# The path is quoted so that directory names containing spaces work:
read -a du_result_inclusive < <(du $du_flags "$dirpath" | tail -n 1)
#
# Total the bytes for all files and directories in argument directory,
# *except* the file type(s) specified by $du_flags_excluding:
read -a du_result_excluding < <(du $du_flags $du_flags_excluding "$dirpath" | tail -n 1)
#
echo "du result inclusive:" $du_result_inclusive
echo "du result excluding specified file type/extension:" $du_result_excluding
#
# Subtract to get the total size (in bytes) of the file type(s) of interest.
echo $((du_result_inclusive - du_result_excluding))
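One way to check the subtraction is to run it on a small throwaway tree whose contents have known lengths. The sketch below is not part of the script above: code_size_bytes is a hypothetical helper name, the file names are invented, and GNU du is assumed.

```shell
#!/bin/bash
# Sketch only: code_size_bytes is a hypothetical helper; GNU du is assumed.
code_size_bytes() {
  dirpath="$1"
  pattern="$2"
  # Grand total for the whole tree; the last line of du -c output is "<bytes><TAB>total".
  inclusive=$(du --apparent-size -b -ca "$dirpath" | tail -n 1 | cut -f1)
  # Grand total again, this time leaving out files that match the pattern.
  excluding=$(du --apparent-size -b -ca --exclude="$pattern" "$dirpath" | tail -n 1 | cut -f1)
  # The difference is the size of the files that match the pattern.
  echo $((inclusive - excluding))
}

# Build a small tree with known content lengths.
demo=$(mktemp -d)
mkdir "$demo/pkg"
printf '0123456789' > "$demo/main.py"                 # 10 bytes
printf '01234567890123456789' > "$demo/pkg/util.py"   # 20 bytes
printf 'notes\n' > "$demo/README.txt"                 # 6 bytes, not a .py file

result=$(code_size_bytes "$demo" '*.py')
echo "$result"
rm -r "$demo"
```

This should print 30, the combined length of the two .py files. The sizes of the directories themselves and of README.txt appear in both totals, so they cancel out in the subtraction.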
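du is included in most Unix (e.g. Linux and OS X) installations as a command line utility. The script above relies on GNU du options such as --apparent-size and --exclude, which the BSD du shipped with OS X may not accept.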
If you’re on Windows and want to run this script, you will need an environment that provides bash and du.
© Copyright 2018 by BitBoost
BitBoost, BitBoost Systems, and 'bobs' are trademarks and/or service marks of BitBoost