6

My system creates a new text file every time a certain event occurs.
The files should be named file_a.txt, file_b.txt, file_c.txt, etc.

In a Bash shell script, how can I find out which filename should be used next?

For instance, if file_a.txt and file_b.txt exist but not file_c.txt, then the next available filename is file_c.txt.

The suffix could be a number instead, if that is easier.
I started designing an algorithm, but there is probably an easier way.

Note: Files get removed each day, so the probability of reaching z is zero. So, after z any strategy is acceptable: aa, using integers, or even using UUIDs.

1
  • What's the pattern for the file naming: just the next letter in the alphabet? What happens when it reaches z? Commented Jun 22, 2015 at 12:01

5 Answers

1

Here's a crude way (no error checking) to do it purely in bash:

# helper function to convert a number to the corresponding character
chr() {
  [ "$1" -lt 256 ] || return 1
  printf "\\$(printf '%03o' "$1")"
}

# helper function to convert a character to the corresponding integer
ord() {
  LC_CTYPE=C printf '%d' "'$1"
}

# increment file
fn_incr(){
  # first split the argument into its constituent parts
  local fn prefix letter_and_suffix letter suffix next_letter
  fn=$1
  prefix=${fn%_*}
  letter_and_suffix=${fn#${prefix}_}
  letter=${letter_and_suffix%%.*}
  suffix=${letter_and_suffix#*.}

  # increment the letter part
  next_letter=$(chr $(($(ord "$letter") + 1)))

  # reassemble
  echo "${prefix}_${next_letter}.${suffix}"
}

Example usage:

fn_incr foo_bar_A.min.js #=> foo_bar_B.min.js 

Doing it in-bash with multiple-letter indices would require longer code. You could always do it in a different executable, but then you might want to increment the filenames in batches or else the executable startup overhead might slow down your program unacceptably. It all depends on your use case.

Using plain old integers might be the better choice here as you won't have to manually manage how 9++ overflows to the left.
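The integer approach could be sketched as follows. This is only an illustration and not part of the code above: the function name next_numbered and the file_N.txt naming are assumptions, and it assumes a single writer, since nothing here guards against two processes picking the same name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer above): print the next free name
# of the form <prefix><N><suffix>, where N is a plain decimal integer.
next_numbered() {
    local prefix=$1 suffix=$2 f n max=0
    for f in "$prefix"*"$suffix"; do
        [ -e "$f" ] || continue        # the glob matched nothing
        n=${f#"$prefix"}               # strip the prefix ...
        n=${n%"$suffix"}               # ... and the suffix, leaving the ID
        case $n in
            ''|*[!0-9]*) continue ;;   # ignore IDs that are not all digits
        esac
        n=$((10#$n))                   # force base 10 so "08" is not read as octal
        if (( n > max )); then max=$n; fi
    done
    echo "$prefix$((max + 1))$suffix"
}

# With file_1.txt and file_2.txt in the current directory:
# next_numbered file_ .txt   prints file_3.txt
```

The sketch takes the highest existing number plus one rather than the first gap, so a name that was deleted earlier is not reused; whether that matters depends on the use case.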


chr() and ord() have been shamelessly stolen from the question "Bash script to get ASCII values for alphabet".

1

If you don't really care, on Linux (more precisely, with GNU coreutils):

tmpfile=$(TMPDIR=. mktemp)
…  # create the content
mv --backup=numbered -- "$tmpfile" file.txt

This uses the GNU backup name scheme: file.txt, file.txt.~1~, file.txt.~2~, …

Another relatively compact way, with numbers that can be placed in a more convenient place, is to take advantage of zsh's glob qualifiers to find the latest file, and calculate the next file with some parameter expansion.

latest=(file_<->.txt(Nn[-1]))   # N: expand to nothing instead of failing when no file matches
if (($#latest == 0)); then
  next=file_1.txt
else
  latest=$latest[1]
  next=${${latest%.*}%%<->}$((${${latest%.*}##*[^0-9]}+1)).${latest##*.}
fi
mv -- $tmpfile $next

With any POSIX shell, you'll have an easier time if you use a number with leading zeros. Take care that an integer literal with a leading zero is parsed as octal.

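move_to_next () {
  shift $(($#-2))   # keep only the last matching file and the file to move
  case ${1%.*} in
    *\*) mv -- "$2" file_0001.txt;;   # the glob did not match: first file
    *) set -- "${1%.*}" "${1##*.}" "$2"
       set -- "${1%_*}" "$((1${1##*_}+1)).$2" "$3"   # prefix a 1 to avoid octal parsing
       mv -- "$3" "${1}_${2#1}";;
  esac
}
move_to_next file_[0-9]*.txt "$tmpfile"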
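The octal pitfall, and the "prefix a 1" workaround that move_to_next relies on, can be seen in isolation. This is a minimal sketch assuming a fixed-width, zero-padded decimal ID; the function name next_padded is hypothetical.

```shell
# In POSIX shell arithmetic a literal with a leading zero is octal:
# $((010)) evaluates to 8, and 0008 is not a valid number at all
# because 8 is not an octal digit.

# Hypothetical helper: increment a zero-padded decimal ID, keeping its width.
# Prefixing a 1 turns "0008" into the decimal number 10008; after adding 1,
# the leading 1 is stripped again.
next_padded() {
    set -- "$((1$1 + 1))"
    echo "${1#1}"
}

next_padded 0008   # prints 0009
next_padded 0099   # prints 0100
```

Once every digit is 9 the width is exhausted: the sum then starts with 2 instead of 1, so the result is no longer a valid ID of the same width.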
0

Try:

perl -le 'print $ARGV[-1] =~ s/[\da-zA-Z]+(?=\.)/++($i=$&)/er' file*.txt 

That will give you file_10.txt after file_9.txt, file_g.txt after file_f.txt, and file_aa.txt after file_z.txt. It will not give file_ab.txt after file_aa.txt or file_11.txt after file_10.txt, because the file* shell glob sorts file_z.txt after file_aa.txt and file_9.txt after file_10.txt, so the last argument is not the highest name.

That latter one you can work around with zsh by using file*.txt(n) instead of file*.txt.

Or you can define a numeric sort order in zsh, based on those aa, abc being recognised as numbers in base 36:

b36() REPLY=$((36#${${REPLY:r}#*_}))
perl ... file_*.txt(no+b36)

(note that the order is ...7, 8, 9, a/A, b/B..., z/Z, 10, 11... so you don't want to mix file_123.txt and file_aa.txt).
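For readers staying in Bash, the same base-36 reading of an ID is available through the base#value arithmetic notation. The sketch below only illustrates the ordering described in the note; the function name b36_value is made up, and it assumes names of the form file_<id>.txt with an ID made of digits and letters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: interpret the ID part of file_<id>.txt as a base-36 number.
# In Bash arithmetic, base#value accepts bases 2 to 64; for bases up to 36,
# lowercase and uppercase letters are interchangeable (a/A = 10 ... z/Z = 35).
b36_value() {
    local id=${1%.*}      # drop the extension: file_aa.txt -> file_aa
    id=${id##*_}          # keep what follows the last underscore: aa
    echo "$((36#$id))"
}

b36_value file_9.txt    # prints 9
b36_value file_a.txt    # prints 10
b36_value file_z.txt    # prints 35
b36_value file_10.txt   # prints 36
b36_value file_aa.txt   # prints 370
```

Under this reading file_123.txt is worth 1371, well above file_aa.txt at 370, which is why mixing the two styles gives a surprising order.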

2
  • The perl one-liner looks great! It does not seem to work for the first file0.txt though? It creates file*.txt. Commented Jun 23, 2015 at 8:56
  • @NicolasRaoul, with a proper shell (zsh, Thompson shell, csh, tcsh, fish, bash -o failglob), that would rather give you a No match error. Commented Jun 23, 2015 at 9:38
0

This outputs the next sequential filename. The ID can be any length, and it can be either numeric or alphabetic. This sample is primed to use an alpha ID, the first ID being a:

pfix='file_'
sfix='.txt'
idbase=a        # 1st alpha id when no files exist - use a decimal number for numeric id's
idpatt='[a-z]'  # alpha glob pattern - use '[0-9]' for numeric id's

shopt -s extglob
idhigh=$( ls -1 "$pfix"+($idpatt)"$sfix" 2>/dev/null |
          awk 'length>=l{ l=length; id=substr($0,'${#pfix}'+1,length-'${#pfix}-${#sfix}') }
               END{ print id }' )

[[ -z $idhigh ]] &&
  echo "$pfix$idbase$sfix" ||
  perl -E '$x="'$idhigh'"; $x++; print "'${pfix}'"."$x"."'${sfix}'\n"'

If no matching file exists, the output is:

file_a.txt 

If the highest matching file is file_zzz.txt, the output is:

file_aaaa.txt 
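The increment itself is delegated to perl's string increment in the script above. For comparison, a pure-Bash version of the alphabetic carry might look like the sketch below. It is not part of the answer: the function name next_alpha_id is hypothetical, and it assumes the ID consists only of lowercase letters a to z.

```shell
#!/usr/bin/env bash
# Hypothetical helper: increment a lowercase alphabetic ID with carry,
# e.g. a -> b, az -> ba, zzz -> aaaa.
next_alpha_id() {
    local id=$1 alphabet=abcdefghijklmnopqrstuvwxyz
    local result='' last rest
    while [ -n "$id" ]; do
        last=${id: -1}                  # rightmost letter
        id=${id%?}                      # everything before it
        if [ "$last" = z ]; then
            result="a$result"           # z wraps to a; carry into the next position
        else
            rest=${alphabet#*"$last"}   # letters that follow $last in the alphabet
            echo "$id${rest:0:1}$result"
            return
        fi
    done
    echo "a$result"                     # every letter was z: the ID grows by one letter
}

next_alpha_id a      # prints b
next_alpha_id az     # prints ba
next_alpha_id zzz    # prints aaaa
```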
0

This problem can be solved handily in Python using various iterator building blocks available in the itertools module:

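from os.path import isfile
from string import ascii_lowercase
from itertools import dropwhile, chain, product, count

# Python 3: the built-in map is lazy, so it replaces Python 2's itertools.imap.
# Candidates are file_a.txt ... file_z.txt, file_aa.txt, ...; the result is the
# first candidate that is not an existing file.
next(dropwhile(isfile,
               map('file_{}.txt'.format,
                   map(''.join,
                       chain.from_iterable(
                           product(ascii_lowercase, repeat=x) for x in count(1))))))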
