How to remove ^M characters from a file with Python
Use the following Python script to remove ^M (carriage return) characters from a file, so that each line ends with a newline character only. The original file is kept alongside it with a .bak suffix. To do this in Emacs, see my notes here.
remove_ctrl_m_chars.py:
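import os
import sys
import tempfile


def main():
    filename = sys.argv[1]
    # Create the temp file next to the target so the final rename
    # never has to cross filesystems.
    dirname = os.path.dirname(os.path.abspath(filename))
    with tempfile.NamedTemporaryFile('wb', dir=dirname, delete=False) as fh:
        # Read bytes so Python does not translate line endings itself.
        with open(filename, 'rb') as src:
            for line in src:
                # Strip only the line ending (CR and/or LF), leaving
                # any other trailing whitespace untouched.
                fh.write(line.rstrip(b'\r\n') + b'\n')
    # Keep the original as a backup, then move the cleaned copy into place.
    os.rename(filename, filename + '.bak')
    os.rename(fh.name, filename)


if __name__ == '__main__':
    main()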
Run it
$ python remove_ctrl_m_chars.py myfile.txt
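To confirm the script did its job, one option is to count the carriage return bytes left in the file. This is only a small sketch; the helper name count_carriage_returns is made up for illustration and is not part of the script above.

```python
def count_carriage_returns(path):
    """Return how many CR bytes (displayed as ^M by many editors) the file contains."""
    # Binary mode so that Python does not hide CRs by translating line endings.
    with open(path, 'rb') as fh:
        return fh.read().count(b'\r')
```

A result of 0 for myfile.txt after the run means no ^M characters remain.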
Comments
Why would you need this as a utility when you have dos2unix, whose entire purpose is for this?
Harsh: Because I have a bad case of NIH and because I wasn't familiar with dos2unix. Thanks for the tip; I'll use dos2unix next time.
Wonderful little script :)
or use vi editor's command
:%s/\r//g
This is so great! Such a time saver! Thank you for posting this.
I ran it from within a PuTTY session with ./ instead of python
In vim, when I see stray ^M characters, I:
- Save changes.
- Edit the file again, using "dos file format". With the "mixed" files I get that have a random mixture of both, this does the Right Thing: it interprets each CRLF as one newline, and also interprets any other LF as one newline.
- Set the current file format to unix.
- Save changes (and because the current format is "unix", vim writes only a LF for each newline).
In other words, I type:
:w
:e ++ff=dos
:set ff=unix
:w
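For comparison, the same handling of mixed files can be sketched in Python: each CRLF becomes one LF, and any LF that was already on its own stays as it is. The function name normalize_newlines is an assumption for this example, and it works on the file contents as bytes rather than on a file path.

```python
def normalize_newlines(data):
    """Turn every CRLF pair into a single LF; lone LFs are left unchanged."""
    # A CR that is not followed by LF is deliberately left alone here.
    return data.replace(b'\r\n', b'\n')
```

Given b'a\r\nb\nc\r\n', this returns b'a\nb\nc\n'.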
There's also a way to script vim so it "cleans" all the source files and other text files in an entire directory.
"Change end-of-line format for dos-mac-unix"
Do note that removing line endings is not the only thing dos2unix does! Notably, it also removes byte order marks.
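If you stay with a Python script and also want the byte order mark gone, a rough sketch follows. It handles only the UTF-8 BOM and is not a reproduction of what dos2unix does; the function name strip_utf8_bom is made up for this example.

```python
import codecs


def strip_utf8_bom(data):
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Apply it to the bytes of the whole file before writing them back, since a BOM can only appear at the very start.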