squeeze

A static site generator that can put the toothpaste back in the tube.
git clone https://git.stjo.hn/squeeze

commit a32e91c3f283037339858d02231cae17cbfd8d73
parent 6092d42ba117d417206d50f00f85b7639d665d4d
Author: St John Karp <contact@stjo.hn>
Date:   Wed, 20 Oct 2021 07:37:49 -0400

Fix hack to wait for all background jobs to complete

I think I finally nailed this bug. The `wait` command wasn't waiting
for background jobs to complete before generating the RSS, which
meant that the most recent article often wouldn't be included in
the feed. I now suspect the problem was a scope issue. The while
loop sits at the end of a pipeline, so its body runs in a subshell,
and a `wait` command outside the loop won't see background jobs
started inside the loop.
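
The suspected scope problem can be reproduced in isolation. The sketch
below is not taken from squeeze.sh: the `sleep`/`echo` job and the file
names are hypothetical stand-ins for the real Markdown pipeline, and it
assumes a shell such as dash or bash in its default configuration, where
every part of a pipeline runs in a subshell.

```shell
#!/bin/sh
# Sketch only: the background job and file names are made up.
tmp="$(mktemp -d)"

printf 'a\nb\n' |
	while IFS= read -r name ; do
		# Started inside the loop, so owned by the pipeline's subshell.
		( sleep 1 ; echo done > "$tmp/$name" ) &
	done

# Runs in the parent shell, which has no background jobs of its own,
# so it returns without waiting for the jobs started above.
wait
before="$(ls "$tmp" | wc -l | tr -d -c '[:digit:]')"

# Give the orphaned jobs time to finish, then count again.
sleep 3
after="$(ls "$tmp" | wc -l | tr -d -c '[:digit:]')"

rm -rf "$tmp"
echo "finished right after wait: $before, finished later: $after"
```

Shells that run the last pipeline stage in the current shell (ksh, zsh,
or bash with `lastpipe`) may behave differently, which could be why
this kind of bug is hard to pin down.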

The fix is to keep track of the index, determine when we're in the
last iteration, and then run `wait`. This lets us remove the ugly
hack that checks for running `sed` or `smartypants` commands.
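
The same idea in miniature, as a sketch rather than the actual script:
the item list and the `sleep`/`echo` job are hypothetical, while the
`line_count` and `index` names follow the diff below.

```shell
#!/bin/sh
# Sketch only: the items and the background job are made up.
tmp="$(mktemp -d)"
items="$(printf 'a\nb\nc\n')"
line_count="$(echo "$items" | wc -l | tr -d -c '[:digit:]')"
index=0

echo "$items" |
	while IFS= read -r name ; do
		index="$(expr "$index" + 1)"
		( sleep 1 ; echo done > "$tmp/$name" ) &

		# On the last iteration, wait inside the loop, where the
		# background jobs are visible.
		[ "$index" -eq "$line_count" ] &&
			wait
	done

# By the time the pipeline returns, every job has written its file.
finished="$(ls "$tmp" | wc -l | tr -d -c '[:digit:]')"
rm -rf "$tmp"
echo "finished: $finished of $line_count"
```

One thing to keep in mind with this approach: when the list is empty,
`echo` still emits one blank line, so `line_count` is 1 and the loop
runs once with an empty name.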

Diffstat:
M squeeze.sh   | 37 ++++++++++++++++++-------------------
M unsqueeze.sh | 20 ++++++++++----------
2 files changed, 28 insertions(+), 29 deletions(-)

diff --git a/squeeze.sh b/squeeze.sh
@@ -61,10 +61,15 @@ rsync --archive --delete --verbose \
 	"$source_path/" "$output_path/"
 
 # Parse and create all the HTML files.
-find "$source_path" -type f -name "*.md" $find_test |
+markdown_files="$(find "$source_path" -type f -name "*.md" $find_test)"
+line_count="$(echo "$markdown_files" | wc -l | tr -d -c '[:digit:]')"
+index=0
+
+echo "$markdown_files" |
 	sed "s|$source_path/||" |
 	while IFS= read -r file ; do
 		echo "$file"
+		index="$(expr "$index" + 1)"
 
 		# Determine if this file has any metadata at the start.
 		# Metadata are in the format Key: value, so it's easy to detect.
@@ -89,26 +94,20 @@ find "$source_path" -type f -name "*.md" $find_test |
 			smartypants \
 			> "$output_path/${file%.md}.html" &
 
-		# Add the most recent process ID to the list.
-		proc_ids="$! $proc_ids"
-		# Pause while the number of created processes is greater than
-		# or equal to the max processes. We have to subtract one
-		# because the `ps` command always outputs a header that we
-		# don't want to count.
-		while [ "$(ps -p "${proc_ids%% }" | tail -n +2 | wc -l | tr -d -c '[:digit:]')" -ge "$max_processes" ] ; do
-			true
-		done
+		if [ "$index" -eq "$line_count" ] ; then
+			# Wait until all jobs have completed.
+			wait
+		else
+			# Add the most recent process ID to the list.
+			proc_ids="$! $proc_ids"
+			# Pause while the number of created processes is greater than
+			# or equal to the max processes.
+			while [ "$(ps -p "${proc_ids%% }" | tail -n +2 | wc -l | tr -d -c '[:digit:]')" -ge "$max_processes" ] ; do
+				true
+			done
+		fi
 	done
 
-# Wait until all jobs have completed.
-wait
-# The `wait` command doesn't seem to wait for all the running jobs.
-# Maybe it's stopping after all `swipl` processes complete?
-# This hack just checks to see if any sed or smartypants processes are running.
-while [ "$(ps -o comm | grep -c -e '^sed$' -e '^smartypants$')" -gt 0 ]; do
-	sleep 1
-done
-
 # Generate the RSS feed.
 mkdir -p "${feed_path%/*}"
 # Grep the date of each article.
diff --git a/unsqueeze.sh b/unsqueeze.sh
@@ -17,10 +17,15 @@ rsync --archive --delete --verbose \
 	"$output_path/" "$source_path/"
 
 # Parse and create all the Markdown files.
-find "$output_path" -type f -name "*.html" |
+html_files="$(find "$output_path" -type f -name "*.html")"
+line_count="$(echo "$html_files" | wc -l | tr -d -c '[:digit:]')"
+index=0
+
+echo "$html_files" |
 	sed "s|$output_path/||" |
 	while IFS= read -r file ; do
 		echo "$file"
+		index="$(expr "$index" + 1)"
 
 		swipl --traditional --quiet -l parse_entry.pl -g "consult('$site_path/site.pl'), parse_entry('$output_path/$file')." |
 			# Unsmarten the punctuation.
@@ -38,13 +43,8 @@ find "$output_path" -type f -name "*.html" |
 			sed 's/&ldquo;/"/g' |
 			sed 's/&quot;/"/g' \
 			> "$source_path/${file%.html}.md" &
 
-	done
-# Wait until all jobs have completed.
-wait
-# The `wait` command doesn't seem to wait for all the running jobs.
-# Maybe it's stopping after all `swipl` processes complete?
-# This hack just checks to see if any sed processes are running.
-while [ "$(ps -o comm | grep -c '^sed$')" -gt 0 ]; do
-	sleep 1
-done
+		# Wait until all jobs have completed.
+		[ "$index" -eq "$line_count" ] &&
+			wait
+	done