How to find the next available file suffix (file_a.txt file_b.txt etc)

Question

My system creates a new text file every time a certain event occurs.
The files should be named file_a.txt file_b.txt file_c.txt etc.

In a Bash shell script, how can I find out what filename should be used next?

For instance, if file_a.txt and file_b.txt exist but not file_c.txt, then the next available filename is file_c.txt.

This could be a number if it is easier.
I started designing an algorithm but there is probably an easier way?

Note: Files get removed each day, so the probability of reaching z is zero. So, after z any strategy is acceptable: aa, using integers, or even using UUIDs.

What's the pattern for the file naming, just the next letter in the alphabet? What happens when it reaches z?
– 123, Commented Jun 22, 2015 at 12:01

Community · Accepted Answer · 2017-04-13 12:36:52Z

Here's a crude way (no error checking) to do it purely in bash:

#helper function to convert a number to the corresponding character
chr() {
  [ "$1" -lt 256 ] || return 1
  printf "\\$(printf '%03o' "$1")"
}
#helper function to convert a character to the corresponding integer
ord() {
  LC_CTYPE=C printf '%d' "'$1"
}
#increment file
fn_incr(){
  #first split the argument into its constituent parts
  local fn prefix letter_and_suffix letter suffix next_letter
  fn=$1
  prefix=${fn%_*}
  letter_and_suffix=${fn#${prefix}_}
  letter=${letter_and_suffix%%.*}
  suffix=${letter_and_suffix#*.}
  #increment the letter part
  next_letter=$(chr $(($(ord "$letter") + 1)))
  #reassemble
  echo "${prefix}_${next_letter}.${suffix}"
}

Example usage:

fn_incr foo_bar_A.min.js #=> foo_bar_B.min.js

Doing it in-bash with multiple-letter indices would require longer code. You could always do it in a different executable, but then you might want to increment the filenames in batches or else the executable startup overhead might slow down your program unacceptably. It all depends on your use case.
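
As a rough illustration of the longer code a multi-letter index needs, the sketch below increments a lowercase suffix with carry, so z becomes aa and az becomes ba. The function name next_suffix is hypothetical (it is not part of the answer's code), and the sketch assumes the suffix contains only the letters a-z.

```bash
#!/usr/bin/env bash
# Hypothetical helper: increment a lowercase alphabetic suffix with carry.
# Assumes the input contains only the letters a-z.
next_suffix() {
  local s=$1 letters=abcdefghijklmnopqrstuvwxyz
  local out='' i c rest
  for (( i = ${#s} - 1; i >= 0; i-- )); do
    c=${s:i:1}
    if [[ $c != z ]]; then
      # No carry needed: bump this letter and keep everything to its left.
      rest=${letters#*"$c"}          # the letters that follow $c
      printf '%s\n' "${s:0:i}${rest:0:1}${out}"
      return 0
    fi
    out="a$out"                      # z wraps to a and carries to the left
  done
  printf '%s\n' "a$out"              # every letter was z: grow by one letter
}
```

With this in place, next_suffix az prints ba and next_suffix zz prints aaa; the result could be spliced back in the same way fn_incr reassembles the name.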

Using plain old integers might be the better choice here as you won't have to manually manage how 9++ overflows to the left.
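
A minimal sketch of the integer approach, assuming names of the form prefix, number, suffix: it simply probes 1, 2, 3, … until it finds a name that does not exist yet. The function name next_numbered_name and its two parameters are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first unused name of the form PREFIX<n>SUFFIX.
# Note: the existence check and the later file creation are not atomic,
# so two concurrent writers could still pick the same name.
next_numbered_name() {
  local prefix=$1 suffix=$2 n=1
  while [[ -e $prefix$n$suffix ]]; do
    n=$(( n + 1 ))
  done
  printf '%s\n' "$prefix$n$suffix"
}
```

For example, next_numbered_name file_ .txt prints file_3.txt when file_1.txt and file_2.txt exist. Because it returns the first gap, a name freed by a deleted file is reused.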

chr() and ord() have been shamelessly stolen from the question "Bash script to get ASCII values for alphabet".

Gilles 'SO- stop being evil' · Accepted Answer · 2015-06-23 00:08:22Z

If you don't really care, on Linux (more precisely, with GNU coreutils):

tmpfile=$(TMPDIR=. mktemp)
… # create the content
mv --backup=numbered -- "$tmpfile" file.txt

This uses the GNU backup name scheme: file.txt, file.txt.~1~, file.txt.~2~, …
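
To see the scheme in action, here is a small demonstration sketch. It assumes GNU mv (the --backup option is a GNU extension); the function name backup_demo and the sample contents are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical demo: write three versions of file.txt into directory $1,
# letting GNU mv rotate the older versions into numbered backups.
backup_demo() {
  local dir=$1 content tmpfile
  for content in first second third; do
    tmpfile=$(mktemp "$dir/tmp.XXXXXX")     # scratch file next to the target
    printf '%s\n' "$content" > "$tmpfile"
    mv --backup=numbered -- "$tmpfile" "$dir/file.txt"
  done
  ls -A "$dir"
}
```

Running backup_demo "$(mktemp -d)" should leave file.txt, file.txt.~1~ and file.txt.~2~ behind. The newest content is always in file.txt and the oldest in file.txt.~1~, so the numbering runs in the opposite direction from the file_a, file_b scheme in the question.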

Another relatively compact way, with numbers that can be placed in a more convenient place, is to take advantage of zsh's glob qualifiers to find the latest file, and calculate the next file with some parameter expansion.

latest=(file_<->.txt(Nn[-1]))   # N: expand to nothing instead of failing when no file matches
if (($#latest == 0)); then
  next=file_1.txt
else
  latest=$latest[1]
  next=${${latest%.*}%%<->}$((${${latest%.*}##*[^0-9]}+1)).${latest##*.}
fi
mv -- $tmpfile $next

With any POSIX shell, you'll have an easier time if you use a number with leading zeros. Take care that an integer literal with a leading zero is parsed as octal.

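move_to_next () {
  # keep only the last two arguments: the highest-numbered file and the file to move
  shift $(($#-2))
  case ${1%.*} in
    # the glob did not match anything, so start the sequence
    *\*) mv -- "$2" file_0001.txt;;
    *) set -- "${1%.*}" "${1##*.}" "$2"
       # prefix the number with 1 so the leading zeros are not read as octal
       set -- "${1%_*}" "$((1${1##*_}+1)).$2" "$3"
       mv -- "$3" "${1}_${2#1}";;
  esac
}
move_to_next file_[0-9]*.txt "$tmpfile"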
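
The function above sidesteps the octal pitfall by prefixing the number with a 1 before doing arithmetic. In bash specifically (this is not plain POSIX sh), another option is the 10# base prefix. The sketch below shows that alternative; next_padded is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: increment a zero-padded decimal counter, keeping its width.
# 10# forces base 10, so "0008" is not rejected as an invalid octal literal.
# The 10# prefix is a bash/ksh/zsh feature, not plain POSIX sh.
next_padded() {
  local cur=$1
  printf '%0*d\n' "${#cur}" "$(( 10#$cur + 1 ))"
}
```

Here next_padded 0008 prints 0009 and next_padded 0099 prints 0100, whereas a bare $((0008 + 1)) is an error in bash because 8 is not an octal digit.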

Stéphane Chazelas · Accepted Answer · 2015-06-22 14:53:45Z

Try:

perl -le 'print $ARGV[-1] =~ s/[\da-zA-Z]+(?=\.)/++($i=$&)/er' file*.txt

That will give you file_10.txt after file_9.txt, file_g.txt after file_f.txt, file_aa.txt after file_z.txt, but not file_ab.txt after file_aa.txt or file_11.txt after file_10.txt because the file* shell glob will sort file_z.txt after file_aa.txt and file_9.txt after file_10.txt.

That latter one you can work around with zsh by using file*.txt(n) instead of file*.txt.
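
Bash has no counterpart to the (n) qualifier. One way to get a similar effect there, sketched under the assumption that the names look like file_<digits>.txt, is to scan the matches and keep track of the largest number. The function name next_by_max is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print file_<max+1>.txt for the files in directory $1
# (default: current directory). Assumes names of the form file_<digits>.txt.
next_by_max() {
  local dir=${1:-.} f n max=0
  for f in "$dir"/file_*.txt; do
    n=${f##*/file_}                      # strip the directory and the prefix
    n=${n%.txt}                          # strip the extension
    [[ $n =~ ^[0-9]+$ ]] || continue     # skips non-numeric names and an unmatched glob
    (( 10#$n > max )) && max=$(( 10#$n ))
  done
  printf 'file_%d.txt\n' "$(( max + 1 ))"
}
```

With file_9.txt and file_10.txt present this prints file_11.txt, because the comparison is numeric rather than lexical.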

Or you can define a numeric sort order in zsh, based on those aa, abc being recognised as numbers in base 36:

b36() REPLY=$((36#${${REPLY:r}#*_}))
perl ... file_*.txt(no+b36)

(note that the order is ...7, 8, 9, a/A, b/B..., z/Z, 10, 11... so you don't want to mix file_123.txt and file_aa.txt).
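
The ordering can be checked with bash's own base#number arithmetic, which for bases up to 36 treats upper- and lowercase letters alike. This is only an illustrative sketch; b36_value is a hypothetical name, and the IDs are assumed to contain nothing but digits and letters.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the base-36 value of an ID made of digits and letters.
b36_value() {
  printf '%d\n' "$(( 36#$1 ))"
}

# Show where a few sample IDs fall in the base-36 order.
for id in 9 a z 10 aa; do
  printf '%s = %s\n' "$id" "$(b36_value "$id")"
done
# prints:
# 9 = 9
# a = 10
# z = 35
# 10 = 36
# aa = 370
```

So 10 sorts after z, which is why mixing purely numeric and alphabetic IDs in one directory gives surprising results.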

The perl one-liner looks great! It does not seem to work for the first file0.txt though? It creates file*.txt.
– Nicolas Raoul, Commented Jun 23, 2015 at 8:56
@NicolasRaoul, with a proper shell (zsh, Thompson shell, csh, tcsh, fish, bash -o failglob), that would rather give you a No match error.
– Stéphane Chazelas, Commented Jun 23, 2015 at 9:38
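
In bash with default settings an unmatched glob is passed through literally, which is why the one-liner printed file*.txt. Besides failglob, a script can test for matches itself with nullglob. The sketch below is one possible way to do that; have_matches is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if at least one file*.txt exists in
# directory $1 (default: current directory).
have_matches() {
  local dir=${1:-.} restore files
  restore=$(shopt -p nullglob || true)   # remember the caller's nullglob setting
  shopt -s nullglob                      # unmatched globs now expand to nothing
  files=( "$dir"/file*.txt )
  $restore                               # put the setting back
  (( ${#files[@]} > 0 ))
}
```

A caller could run the perl command only when have_matches succeeds and otherwise fall back to a fixed first name such as file_a.txt.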

Peter.O · Accepted Answer · 2015-06-23 11:44:56Z

This outputs the next sequential filename. The ID can be any length and it can be either numeric or alphabetic. This sample is primed to use an alpha ID, the first ID being a:

pfix='file_'
sfix='.txt'
idbase=a        # 1st alpha id when no files exist - use a decimal number for numeric id's
idpatt='[a-z]'  # alpha glob pattern - use '[0-9]' for numeric id's
shopt -s extglob
idhigh=$( ls -1 "$pfix"+($idpatt)"$sfix" 2>/dev/null |
  awk 'length>=l{ l=length; id=substr($0,'${#pfix}'+1,length-'${#pfix}-${#sfix}') }
       END{ print id }' )
[[ -z $idhigh ]] && echo "$pfix$idbase$sfix" ||
  perl -E '$x="'$idhigh'"; $x++; print "'${pfix}'"."$x"."'${sfix}'\n"'

If no matching file exists, the output is:

file_a.txt

If the highest matching file is file_zzz.txt, the output is:

file_aaaa.txt

iruvar · Accepted Answer · 2015-06-23 20:29:08Z

This problem can be solved handily in Python using various iterator building blocks available in the itertools module:

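from os.path import isfile
from string import ascii_lowercase
from itertools import dropwhile, chain, product, count
# Python 3: the built-in map is lazy, so it replaces Python 2's itertools.imap.
# Candidate suffixes are a, b, ..., z, aa, ab, ...; print the first name not on disk.
print(next(dropwhile(isfile, map('file_{}.txt'.format, map(''.join,
    chain.from_iterable(product(ascii_lowercase, repeat=x) for x in count(1)))))))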
