Juergen

System is Lubuntu 20 LTS. parallel -V returns 20161222. Comparison of raw serial and parallel decompression performance on a dual-core i3-4130 with hyper-threading (4 threads):
time ls -S /var/lib/apt/lists/*lz4 | parallel --ungroup lz4 -dc > /dev/null
1.461s
time lz4 -dmc /var/lib/apt/lists/*lz4 > /dev/null
3.069s

The real use case is as follows (workaround without --line-buffer):

time lz4 -dmc /var/lib/apt/lists/*Contents* | grep -F $'/parallel\t' | sort -u
usr/bin/parallel    universe/utils/moreutils,universe/utils/parallel
usr/bin/parallel    universe/utils/parallel
usr/lib/R/library/parallel/R/parallel    universe/math/r-base-core
usr/lib/cups/backend/parallel    net/cups-filters
usr/share/doc-base/parallel    universe/utils/parallel
real 0m5.349s
user 0m3.970s
sys 0m5.839s

time ls -S /var/lib/apt/lists/*Contents* | parallel lz4 -dc '{}' \| grep -F "\$'/parallel\t'" | sort -u
(same output as above)
real 0m3.669s
user 0m5.888s
sys 0m7.676s

This parallelizes not only the decompression but also the postprocessing, and it is the better solution here, where the work is not 99 % in the first part of the pipe.
But parallelizing the complete pipe is not always possible, so the general question remains open for cases where the output of the first step is not very small and streaming is therefore wanted.


The default output mode of GNU parallel is --group:
The output of each job is written to a temporary file and passed to the output of parallel only after the job has finished.

When using this default output mode on data larger than the /tmp space, as in
parallel lz4 -dc ::: /var/lib/apt/lists/*lz4 | wc
it is slow and fails with
parallel: Error: Output is incomplete.
Cannot append to buffer file in /tmp.
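
To see in advance whether grouped output has a chance of fitting, the free space of the buffer directory can be checked first. This is only a minimal sketch: the function name tmp_free_kb is hypothetical, and it assumes that the buffer files go to $TMPDIR when set and to /tmp otherwise, and that the file system name contains no spaces.

```shell
#!/bin/sh
# Hypothetical helper: print the free space (in KiB) of the directory
# assumed to hold the buffer files ($TMPDIR if set, else /tmp).
tmp_free_kb() {
    dir=${TMPDIR:-/tmp}
    # POSIX "df -P" prints one header line and one data line;
    # with -k, field 4 of the data line is the available space in KiB.
    df -Pk "$dir" | awk 'NR == 2 { print $4 }'
}

tmp_free_kb
```

The number can be compared with the decompressed size of the largest input, for example the byte count printed by lz4 -dc FILE | wc -c, since in grouped mode at least one complete job output has to fit.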

When using the --ungroup mode, lines from different jobs are split in the middle and mixed, so the output of
parallel --ungroup lz4 -dc ::: /var/lib/apt/lists/*lz4 | wc
differs from that of the unparallelized lz4 -dmc /var/lib/apt/lists/*lz4 | wc.
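
Because the jobs finish in a different order than the serial run, a plain byte-for-byte comparison of the two outputs is not meaningful. A possible check, sketched here with stand-in data instead of real lz4 output, is to compare a checksum of the sorted lines: reordered whole lines give the same checksum, while a line split in the middle changes it. The helper name line_set_cksum is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: checksum of a stream's lines, ignoring line order.
line_set_cksum() {
    LC_ALL=C sort | cksum
}

# Stand-in data (18 bytes each), not real lz4 output:
serial=$(printf 'a/b\tpkg1\nc/d\tpkg2\n' | line_set_cksum)
# Same lines, different order (what --group or --line-buffer may produce):
reordered=$(printf 'c/d\tpkg2\na/b\tpkg1\n' | line_set_cksum)
# Same bytes, but the first line was cut and the second written in between
# (what --ungroup may produce):
split=$(printf 'a/b\tpkc/d\tpkg2\ng1\n' | line_set_cksum)

if [ "$serial" = "$reordered" ]; then echo "reordered: same lines"; fi
if [ "$serial" != "$split" ]; then echo "split: lines differ"; fi
```

On real data the two pipelines from above would each be piped into line_set_cksum. Note that sort may itself use temporary files on large inputs, so this is a verification aid rather than a fix for the /tmp problem.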

As I understand the parallel manpage, this should be solved by the --line-buffer option: all jobs have an output pipe which is read by parallel, and as soon as output becomes available from any job, it is passed line by line to the output pipe of the parallel process itself. (Edit: I mean lines in blocks, as is done for spreading large input to the parallel processes, not one syscall per line, which would be far too slow.)

But this doesn't work:
parallel --line-buffer lz4 -dc ::: /var/lib/apt/lists/*lz4 | wc -c
results in the same disk full error as with the implied --group above.

How can I use parallel --line-buffer without temporary files?
