Revisions to What happens if I start a bash array with a big index?

deleted 3 characters in body

edited Aug 11, 2023 at 20:29


Initially in ksh, array indices were limited to 4095, so matrices could be at most 64x64; that limit has since been raised to 4,194,303. In ksh93, I see that doing a[4194303]=1 allocates over 32MiB of memory, I guess to hold 4194304 64-bit pointers plus some overhead. That doesn't seem to happen in bash, where array indices can go up to 9223372036854775807 (at least on GNU/Linux amd64 here) without allocating more memory than is needed to store the elements that are actually set.


added 71 characters in body


edited Aug 11, 2023 at 17:30

Stéphane Chazelas


In a[$i$j]="something", the $i and $j variables are expanded first, so with i=0 j=1 that becomes a[01]="something", and 01 as an arithmetic expression means octal number 1 in bash. With i=0 j=10, that would be a[010]="something", the same as a[8]="something". And you'd get a[110]="something" for both i=11 j=0 and i=1 j=10.
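To make the two pitfalls above concrete, here is a minimal bash sketch. The array names and the `width` variable are made up for illustration, and the last part (computing one index from both coordinates) is only one possible workaround, not something the answer itself prescribes.

```shell
#!/usr/bin/env bash
# Sketch: how concatenated digits behave as an array subscript in bash.

a=()
i=0 j=10
a[$i$j]="something"          # subscript is 010, read as octal, so index 8
octal_keys="${!a[*]}"

b=()
i=11 j=0;  b[$i$j]="first"   # subscript 110
i=1  j=10; b[$i$j]="second"  # subscript 110 again, overwrites "first"
collision_keys="${!b[*]}"

# Possible workaround (assumption): derive one index arithmetically,
# using a hypothetical row width larger than any column number.
width=100
c=()
i=11 j=0;  c[$((i * width + j))]="first"    # index 1100
i=1  j=10; c[$((i * width + j))]="second"   # index 110

printf '%s\n' "$octal_keys" "$collision_keys" "${!c[*]}"
```

With the arithmetic index, the two coordinate pairs land in different slots as long as every column number stays below `width`.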

Now to answer the question in the subject: in bash, as in ksh, plain arrays are sparse, which means you can have a[n] defined without a[0] to a[n-1] being defined, so in that sense they are not like the arrays of C or most other languages or shells.
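A small sketch of what sparse means in practice in bash (the variable names are arbitrary): only the slots that were assigned exist, and the element count reflects that rather than the highest index.

```shell
#!/usr/bin/env bash
# Sketch: a sparse indexed array in bash; only the assigned slots exist.
a=()
a[5]="five"
a[1000]="thousand"

count=${#a[@]}            # number of elements actually set
keys="${!a[*]}"           # the indices in use, in ascending order
zero_state=${a[0]+set}    # expands to "set" only if a[0] exists

printf 'count=%s keys=%s a[0]=%s\n' "$count" "$keys" "${zero_state:-unset}"
```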

Initially in ksh, array indices were limited to 4095, so matrices could be at most 64x64; that limit has since been raised to 4,194,303. In ksh93, I see that doing a[4194303]=1 allocates over 32MiB of memory, I guess to hold 4194304 64-bit pointers plus some overhead. That doesn't seem to happen in bash, where array indices can go up to 9223372036854775807 (at least on GNU/Linux amd64 here) without allocating more memory than is needed to store the elements that are actually set.
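The bash side of that observation can be checked with a short sketch like the one below. It assumes a bash build with 64-bit array indices, as on the GNU/Linux amd64 system mentioned above; the array name `big` is arbitrary.

```shell
#!/usr/bin/env bash
# Sketch: assigning at the largest index mentioned above.
# Assumes a 64-bit bash build; other builds may reject this subscript.
big=()
big[9223372036854775807]="last"

# Only one element is stored, whatever its index.
printf 'elements: %s, highest index: %s\n' "${#big[@]}" "${!big[*]}"
```

Note that this only shows the element count staying at one; measuring the actual memory use would need an external tool.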

In all other shells with array support ((t)csh, zsh, rc, es, fish...), array indices start at 1 instead of 0, and arrays are normal non-sparse arrays where you can't have a[2] set without a[1] also being set, even if only to the empty string.

As in most programming languages, associative arrays in bash are implemented as hash tables with no notion of order or rank (you'll notice typeset -p shows them in seemingly random order above).
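Since the iteration order of an associative array is not something to rely on, a script that needs a stable order has to impose one itself. A minimal sketch, assuming bash 4 or later (for declare -A and mapfile) and using a made-up array named `colour`:

```shell
#!/usr/bin/env bash
# Sketch: keys of an associative array come back in no particular order,
# so sort them explicitly when a stable order matters.
declare -A colour=([red]=1 [green]=2 [blue]=3)

# Read the keys, one per line, through sort into an indexed array.
mapfile -t sorted < <(printf '%s\n' "${!colour[@]}" | LC_ALL=C sort)

for key in "${sorted[@]}"; do
  printf '%s=%s\n' "$key" "${colour[$key]}"
done
```

This breaks on keys that contain newlines; for such keys a NUL-delimited approach would be needed.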


added 211 characters in body


edited Aug 11, 2023 at 16:50

Stéphane Chazelas


In bash, as in ksh, plain arrays are sparse, which means you can have a[n] defined without a[0] to a[n-1] being defined, so in that sense they are not like the arrays of C or most other languages or shells. Initially in ksh, array indices were limited to 4095, so matrices could be at most 64x64; that limit has since been raised to 4,194,303. In ksh93, I see that doing a[4194303]=1 allocates over 32MiB of memory, I guess to hold 4194304 64-bit pointers plus some overhead. That doesn't seem to happen in bash, where array indices can go up to 9223372036854775807 (at least on GNU/Linux amd64 here).

And like in many programming languages, associative arrays are implemented as hash tables with no notion of order or rank (you'll notice typeset -p shows them in seemingly random order above).

For more details on array design in different shells, see this answer to Test for array support by shell.
