utf8.py: a script that fixes ownCloud non-UTF8 filename issues

The utf8.py script

Some months ago I had to face an annoying issue that affected the ownCloud client during the folder-synchronization process. As a result, I wrote a trivial Python script that helped me rename the non-UTF8 filenames using the UTF-8 encoding. Today I had to deal with the very same issue, so I decided to add some functionality to the original script.


This script has been written in Python 2.7. This is what you will need in order to execute the script:

  • Python 2.7.
  • The convmv utility (# apt-get install convmv).
  • The Python Chardet module (# apt-get install python-chardet).
  • The script itself, utf8.py.

Using the script

./utf8.py -d PATH [-t THRESHOLD] [-l LOG] [-r]

-d PATH

The directory to analyse and, if the -r flag is given, to fix (i.e., all the files and directories inside the PATH directory will be renamed according to the UTF-8 encoding standard).
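As a rough illustration of the analysis step, here is a minimal sketch that assumes "non-UTF8" means a filename whose raw bytes fail to decode as UTF-8. The helper names is_valid_utf8 and find_non_utf8_names are hypothetical and not taken from utf8.py, and the sketch targets Python 3 rather than the Python 2.7 of the original script:

```python
import os


def is_valid_utf8(raw_name):
    """Return True if the raw filename bytes decode cleanly as UTF-8."""
    try:
        raw_name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_non_utf8_names(root):
    """Yield the byte paths of entries under root whose names are not valid UTF-8."""
    # Passing a bytes path to os.walk makes it yield raw byte names,
    # so no decoding happens before we get to inspect them.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                yield os.path.join(dirpath, name)
```

Calling list(find_non_utf8_names("/home/data")) would then give the candidates for renaming without touching anything on disk.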


-t THRESHOLD

For every charset it detects, the Chardet module reports a value called “confidence” that indicates how sure it is about the detection. The -t flag sets the minimum confidence that a detected charset must reach before the script attempts to rename the file or directory using UTF-8. This is a numerical value in the range [0, 1]. Default value: 0.8.
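The confidence check can be sketched as follows. This is only an illustration: passes_threshold is a hypothetical helper, not a function from utf8.py, and it takes a dictionary shaped like the result of chardet.detect() (with "encoding" and "confidence" keys) so that it can be tried without Chardet installed:

```python
def passes_threshold(detection, threshold=0.8):
    """Decide whether a detection result is trustworthy enough to act on.

    detection is a dict shaped like the return value of chardet.detect(),
    e.g. {"encoding": "ISO-8859-1", "confidence": 0.73}.
    """
    encoding = detection.get("encoding")
    # Treat a missing or None confidence as 0.0.
    confidence = detection.get("confidence") or 0.0
    return encoding is not None and confidence >= threshold


# With the real module (requires Chardet to be installed):
#   import chardet
#   detection = chardet.detect(b"caf\xe9")
#   passes_threshold(detection, threshold=0.8)
```

A higher threshold means fewer renames, but also fewer files renamed from a wrongly guessed charset.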

-l LOG

By default, the script creates a logfile called utf8-log.txt in the directory where it is executed. This flag lets you choose a different location and name for the logfile.


-r

By default, the execution of the utf8.py script is a dry run; i.e., the files and directories of PATH will not be renamed. Passing this flag makes the script actually rename the files and directories inside PATH.
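For reference, the command-line interface described above could be declared with argparse roughly like this. It is a sketch built only from the usage line and the documented defaults; the attribute names (path, threshold, log, rename) and the build_parser function are assumptions, and the real utf8.py may parse its options differently:

```python
import argparse


def build_parser():
    """Build a parser mirroring: ./utf8.py -d PATH [-t THRESHOLD] [-l LOG] [-r]"""
    parser = argparse.ArgumentParser(prog="utf8.py")
    parser.add_argument("-d", dest="path", metavar="PATH", required=True,
                        help="directory to analyse (and to fix if -r is given)")
    parser.add_argument("-t", dest="threshold", metavar="THRESHOLD",
                        type=float, default=0.8,
                        help="minimum Chardet confidence, between 0 and 1")
    parser.add_argument("-l", dest="log", metavar="LOG",
                        default="utf8-log.txt",
                        help="path of the logfile")
    parser.add_argument("-r", dest="rename", action="store_true",
                        help="actually rename; without it the run is a dry run")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.path, args.threshold, args.log, args.rename)
```

Making -r a store_true flag keeps the dry run as the default, so nothing is renamed unless the flag is passed explicitly.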


This command will generate a log file at /tmp/analysis.log for the directory /home/data, detecting any non-UTF8 charset with the default confidence threshold of 0.8. No file or directory renaming will take place, so the directory /home/data will remain unchanged:

./utf8.py -d /home/data -l /tmp/analysis.log

This command will rename every file and directory under /home/data whose detected charset has a confidence of at least 0.95; the rest will not be renamed:

./utf8.py -d /home/data -l /tmp/renamed.log -t 0.95 -r
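Since convmv is listed as a requirement, the renaming itself is presumably delegated to it. How utf8.py actually invokes convmv is an assumption here; the sketch below only shows how a dry run and a real rename could map onto convmv, which by default just reports what it would do and only renames when --notest is given. The convmv_command helper is hypothetical:

```python
def convmv_command(path, source_encoding, rename=False):
    """Build the convmv argument list for converting one name to UTF-8.

    Without --notest, convmv only prints the rename it would perform,
    which matches the dry-run behaviour of running without -r.
    """
    command = ["convmv", "-f", source_encoding, "-t", "utf-8"]
    if rename:
        command.append("--notest")
    command.append(path)
    return command


# To execute it (requires convmv to be installed):
#   import subprocess
#   subprocess.call(convmv_command("/home/data/some-file", "iso-8859-1", rename=True))
```

Building the command as a list rather than a single string avoids shell quoting problems with unusual filenames.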

Download the script

You can get the latest version of this script right here.