TIL: timeout in Bash scripts

343 points by lr0 · 93 comments · 5/26/2025, 11:34:28 AM · heitorpb.github.io

Comments (93)

majke · 2d ago
My fav little-known trick is to test how various syscalls fail using strace fault injection, like:

  $ strace -e trace=clone -e fault=clone:error=EAGAIN

random link: https://medium.com/@manav503/using-strace-to-perform-fault-i...
jonhohle · 1d ago
This is incredible and something I wish I'd known about a long time ago. I'd often stub out functions like this, knowing I couldn't test the failure branch, while trying to limit that to as small an area as possible.

Thanks!

ycombinatrix · 1d ago
This is great. Anyone know of an equivalent in Windows?
dwattttt · 1d ago
Application Verifier provides fault injection, as well as detection for a bunch of conditions (https://learn.microsoft.com/en-us/windows-hardware/drivers/d...).

It's only intended for native/unmanaged code though.

colejohnson66 · 1d ago
The recommended approach for managed apps (meaning .NET) is probably dependency injection. If you want to test a component with fallible methods, write an interface wrapper and inject a fallible mock. The JIT will do dynamic PGO and devirtualize in normal application usage.
broken_broken_ · 1d ago
DTrace can do the same and much more, I believe, with destructive actions, and it is supported on Windows.
cenamus · 1d ago
And OpenBSD? :D
0xbadcafebee · 1d ago
The ideal solution for health checks is to set both a maximum timeout duration, and a maximum number of retries. Typically you would want to fail after X retries first, and up to Y time (to account for network weirdness). But you definitely want to fail earlier, and not just wait for a long-ass time to pass before you finally fail.

That's for a standard service health check anyway. That service and health check shouldn't be started until the container it depends on has started and is healthy. In Kubernetes that's an Init Container in a Pod, in AWS ECS that's a dependsOn stanza in your Task Container Definition, and for Docker Compose it's the depends_on stanza in a Services entry.

  set -eu
  nowtime="$(date +%s)"
  maxwait=300
  maxloop=5
  c=0
  while [ $c -lt $maxloop ] ; do
      if timeout "$maxwait" curl --silent --fail-with-body 10.0.0.1:8080/health ; then
          exit 0
      else
          sleep 1
      fi
      if [ "$(date +%s)" -gt "$((nowtime+maxwait))" ] ; then
          echo "$0: Error: max wait time $maxwait exceeded"
          exit 1
      fi
      c=$((c+1))
  done
However, Curl already supports this natively so there's no need to write a script.

  curl --silent --fail-with-body --connect-timeout 5 --retry-all-errors --retry-delay 1 --retry-max-time 300 --retry 300 10.0.0.1:8080/health
0xbadcafebee · 20h ago
(edit: I forgot after "done" you need "exit 1" so it fails with an error after the max loop count, and "--retry 300" should be "--retry 5" for consistency with the script)
robinhouston · 1d ago
I’ve been playing around with trying to make a timeout using just bash builtins, motivated by the fact that my Mac doesn’t have the timeout command.

I haven’t quite been able to do it using _only_ builtins, but if you allow the sleep command (which has been standardised since the first version of POSIX, so it should be available pretty much anywhere that makes any sort of attempt to be POSIX compliant), then this seems ok:

  # TIMEOUT SYSTEM
  #
  # Defines a timeout function:
  #
  # Usage: timeout <num_seconds> <command>
  #
  # which runs <command> after <num_seconds> have elapsed, if the script
  # has not exited by then.

  _alarm() {
      local timeout=$1

      # Spawn a subshell that sleeps for $timeout seconds
      # and then sends us SIGALRM
      (
          sleep "$timeout"
          kill -ALRM $$
      ) &

      # If this shell exits before the timeout has fired,
      # clean up by killing the subshell
      subshell_pid=$!
      trap _cleanup EXIT
  }

  _cleanup() {
      if [ -n "$subshell_pid" ]
      then
          kill "$subshell_pid"
      fi
  }

  timeout() {
      local timeout=$1
      local command=$2

      trap "$command" ALRM
      _alarm "$timeout"
  }

  # MAIN PROGRAM

  times_up() {
      echo 'TIME OUT!'
      subshell_pid=
      exit 1
  }

  timeout 10 times_up

  for i in {1..20}
  do
      sleep 1
      echo $i
  done
ryao · 1d ago
Here is my take on this from 12 years ago after following the advice of a stack overflow post:

https://github.com/gentoo/genkernel/commit/a21728ae287e988a1...

With that (minus the gen_die() line unless you copy that helper function too), you can do:

  doSomething() {
      for i in {1..20}
      do
          sleep 1
          echo $i
      done
  }
  if ! call_func_timeout doSomething 10; then
      echo 'TIME OUT!'
      exit 1
  fi
Similarly to you, I only used shell builtins, plus the sleep command. The genkernel code is run by busybox ash, so the script had to be POSIX conformant. Note that both your script and my example script (which reimplements yours using my code from 12 years ago) use {1..20}, which I believe is a bashism and not POSIX conformant, but that is fine for your use case.

My innovation over the stack overflow post was to have the exit status return true when the timeout did not trigger and false when the time out did trigger, so that error handling could be done inline in the main script (even if that error handling is just printing a message and exiting). I felt that made code using this easy to read.

khc · 1d ago
I've written something 13 years ago using `read -t` https://github.com/kahing/bin/blob/master/timeout.sh
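The core of the `read -t` approach can be sketched in a few lines (this is a hypothetical illustration, not the linked script itself): read with a timeout from a pipe the command writes to.

```shell
# Minimal sketch of the `read -t` idea: wait up to $1 seconds for one
# line on stdin, succeeding only if it arrives in time.
timeout_read() {
    local line
    IFS= read -r -t "$1" line && printf '%s\n' "$line"
}

( sleep 0.2; echo done ) | timeout_read 2                 # prints "done"
( sleep 2;   echo late ) | timeout_read 0.5 || echo "timed out"
```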
timewizard · 1d ago

    <command> & sleep <timeout>; kill -SIGALRM %1
robinhouston · 1d ago
That's ok, but %1 will refer to the wrong job if there's already a background job running. You could use %% instead, but that will refer to the wrong job if <command> terminates before the timeout.
fpoling · 1d ago
This always sleeps for the full timeout rather than terminating when the command terminates.
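Both problems can be avoided by recording `$!` instead of using a job spec, and by waiting on the command rather than sleeping unconditionally. A hedged sketch (the function name and the 1-second watchdog are made up for illustration; assumes SIGALRM's default action, terminate, is acceptable):

```shell
# Record the PID instead of using %1/%%, so the right process is signalled,
# and wait on the command so we return as soon as it exits.
run_with_alarm() {
    "$@" &                                              # the command
    local cmd_pid=$!
    ( sleep 1; kill -ALRM "$cmd_pid" 2>/dev/null ) &    # 1-second watchdog
    local watchdog_pid=$!
    wait "$cmd_pid"                                     # returns on exit OR signal
    local status=$?
    kill "$watchdog_pid" 2>/dev/null                    # cancel pending watchdog
    return "$status"
}

s=0; run_with_alarm sleep 2 || s=$?
echo "slow command status: $s"      # 128 + SIGALRM: killed by the watchdog
run_with_alarm sleep 0.1 && echo "fast command finished in time"
```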
craigds · 1d ago
FYI curl actually helpfully has a `--retry-connrefused` flag to avoid doing this loop in the shell entirely
aidenn0 · 2d ago
Note that if you need to pass variables into the bash -c invocation, the best way to do it is to append them. e.g.

   bash -c 'some command "$1" "$2"' -- "$var1" "$var2"
I use "--" because I like the way it looks, but the first parameter goes in argv[0], which doesn't expand in "$@", so IMO something other than an argument should go there for clarity.

Note that bash specifically has printf %q which could alternatively be used, but I prefer to use bourne-compatible things when the bash version isn't significantly cleaner.

AdieuToLogic · 1d ago
> I use "--" because I like the way it looks ...

Double hyphens ('--') have a very specific meaning to bash and to most every Unix/Linux CLI program. From getopts(1p)[0]:

  Any of the following shall identify the end of options: the
  first "--" argument that is not an option-argument, finding
  an argument that is not an option-argument and does not
  begin with a '-', or encountering an error.
0 - https://www.man7.org/linux/man-pages/man1/getopts.1p.html
aidenn0 · 23h ago
That's what I meant by "I like the way it looks." It doesn't actually work that way with a bash -c invocation (all arguments after the command string go in argv, starting with 0), but it "fits in" with other commands that do work that way.
AdieuToLogic · 12h ago
> That's what I meant by "I like the way it looks." It doesn't actually work that way with a bash -c invocation (all arguments after the command string go in argv, starting with 0), but it "fits in" with other commands that do work that way.

This is not how "--" works in the example you provided:

  bash -c 'some command "$1" "$2"' -- "$var1" "$var2"
While the double-hyphens are assigned to `$0` in this example, they also serve to halt bash's interpretation of command line switches once encountered. For example:

  bash -c '/bin/ps "$1"' -- "-l"
Behaves as one would expect. However:

  bash -c '/bin/ps "$1"' "-l"
Will not.
aidenn0 · 1h ago
> While the double-hyphens are assigned to `$0` in this example, they also serve to halt bash's interpretation of command line switches once encountered. For example...

That is contrary to both the bash manpage and my tests. Here's the simplest test.

  bash -c 'false; echo hi' -e
Will print "hi" because -e is not interpreted as a command line switch despite no "--" being present (nothing after the "-c" is), while

  bash -e -c 'false; echo hi'
Does what one expects.

Or another way of putting it:

  bash -c '/bin/ps "$1"' foo "-l"
Will work exactly the same as

  bash -c '/bin/ps "$1"' -- "-l"
Since the "--" has zero function on this commandline except as a placeholder for argv[0].
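This placement is easy to demonstrate (the words `--`, `placeholder`, and the strings here are arbitrary): everything after the `-c` string lands in the child's argv starting at $0.

```shell
# Whatever follows the -c string becomes the child's argv, starting at $0;
# "--" is just an ordinary word in that position, not an option terminator.
bash -c 'printf "argv0=%s first=%s\n" "$0" "$1"' -- "hello world"
bash -c 'printf "argv0=%s first=%s\n" "$0" "$1"' placeholder "hello world"
# Both print first=hello world; only argv0 differs.
```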
fragmede · 1d ago
Busybox uses argv[0] to know what to run, so you can feed it "ls" as argv[0] and it'll run "ls" (or "mv"/"cp"/etc).
noufalibrahim · 2d ago
I used to use

    timeout 1800 mplayer show.mp4 ; sudo pm-suspend
As my poor man's parental control to let my kids watch a show for 30 minutes without manual supervision when they were younger. Useful command
gbraad · 1d ago
That is probably the best described use case!
noufalibrahim · 1d ago
I read somewhere that "when there's a shell, there's a way". It's tongue in cheek but somewhat true. The leverage that these simple commands and a programmable shell gives you is huge and can't be overstated
miduil · 2d ago
What I usually do when I need a retry logic is

     for i in {0..60}; do
         true -- "$i" # shellcheck suppression
         if eventually_succeeds; then break; fi
         sleep 1s
     done

Not super elegant, but relatively correct, next level is exponential back off. Generally leaves a bit of composability around.
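The exponential-backoff version is only a couple more lines. A sketch, with a stand-in for eventually_succeeds (all names here are hypothetical; a real check would be a curl call or similar):

```shell
# Backoff sketch: double the delay after each failure, capped at 32s.
# eventually_succeeds is a stand-in that succeeds on the 2nd attempt.
attempts=0
eventually_succeeds() { attempts=$(( attempts + 1 )); [ "$attempts" -ge 2 ]; }

delay=1
for _ in 1 2 3 4 5 6; do
    if eventually_succeeds; then
        echo "succeeded after $attempts attempts"
        break
    fi
    sleep "$delay"
    delay=$(( delay * 2 ))
    [ "$delay" -gt 32 ] && delay=32
done
```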
mdaniel · 2d ago
Up to you but I think the way shellcheck wants that problem solved is by using _ as in

  for _ in
https://github.com/koalaman/shellcheck/wiki/SC2034#intention...


miduil · 2d ago
Note this will still require timeout for eventually_succeeds depending on the application.

In Bash, or literally whenever you are dealing with POSIX/IO/Processes, you need to work with defensive coding practices.

Whatever you do has consequences

xx__yy · 1d ago
I like this solution better, no bash running a string command.

Could easily prefix eventually_succeeds with timeout.

epr · 1d ago
I'm generally not a huge fan of inlining the command or cluttering up my local directory with little scripts to get around the fact that it must be a subprocess you can send a signal to. I use a wrapper like this, which exports a function containing whatever complex logic I want to time out. The funky quoting in the timeout bash -c argument is a generalized version of what aidenn0 mentioned in another comment here (passing in args safely to subproc).

    #!/usr/bin/env bash

    long_fn () { # this can contain anything, like OPs until curl loop
      sleep $1
    }

    # to TIMEOUT_DURATION BASH_FN_NAME BASH_FN_ARGS...
    to () {
      local duration="$1"; shift
      local fn_name="$1"; shift
      export -f "$fn_name"
      timeout "$duration" bash -c "$fn_name"'  "$@"' _ $@
    }

    time to 1s long_fn 5 # will report it ran 1 second
abbeyj · 1d ago
You need `"$@"`, not just `$@` at the end of the command. Otherwise it will split any arguments that have spaces in them. E.g. try

    long_fn() {
      echo "$1"
      sleep "$2"
    }

    to 1s long_fn "This has spaces in it" 5
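For reference, here is the wrapper with the quoting fixed, reusing the `to`/`long_fn` names from the parent comment (`timeout` is the coreutils binary; the sleep argument is shortened so the example runs quickly):

```shell
#!/usr/bin/env bash

long_fn() {
    echo "$1"
    sleep "$2"
}

# to TIMEOUT_DURATION BASH_FN_NAME BASH_FN_ARGS...
to() {
    local duration="$1"; shift
    local fn_name="$1"; shift
    export -f "$fn_name"
    timeout "$duration" bash -c "$fn_name"' "$@"' _ "$@"   # note: "$@" quoted
}

to 5 long_fn "This has spaces in it" 0.1   # prints the string unsplit
```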
epr · 1d ago
My bad on that typo. I write "$@" so often in shell scripts that I should know better. Also would've been caught by shellcheck. Outside the hn edit window though, so my mistake is permanent :(
pveierland · 2d ago
Literally just added some command timeouts in a new kubernetes setup. This POSIX shell script implementation of await-cmd.sh / await-http.sh / await-tcp.sh is mature and quite handy in some scenarios:

https://github.com/vegardit/await.sh

frou_dh · 2d ago
Apparently timeout(1) is part of GNU Coreutils. I wasn't sure after reading whether it was part of Bash itself.
mdaniel · 2d ago
Also, watch out, because like many things the timeout command and args differ between /usr/bin/timeout and gtimeout in Brew (that's where the "g" prefix comes from). I haven't used BSD enough to know what its story is.
aidenn0 · 2d ago
Prefixing GNU coreutils with "g" is common on most non-Linux Unix systems; it prevents conflicts with the base system (gmake/gtar vs make/tar).
jonhohle · 1d ago
But also sucks because the g-prefixed versions aren’t installed on Linux systems which means scripts that rely on them are not portable.
mdaniel · 1d ago
Thankfully bash tolerates that, if the script author cares, e.g.

  gnu_sed=gsed
  if ! command -v $gnu_sed; then
    gnu_sed=$(detector_wizardry)
  fi
  $gnu_sed -Ee ...
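One concrete way to fill in the detector_wizardry placeholder for sed (an assumption here: GNU sed advertises itself with "GNU" in its --version output, while BSD sed errors out on that flag; the same pattern works for other tools):

```shell
# Pick whichever name resolves to GNU sed, failing loudly otherwise.
if sed --version 2>/dev/null | grep -q GNU; then
    gnu_sed=sed
elif command -v gsed >/dev/null 2>&1; then
    gnu_sed=gsed
else
    echo "no GNU sed found" >&2
    exit 1
fi
printf 'using %s\n' "$gnu_sed"
```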
chasil · 2d ago
> It’s a shame we can’t use timeout with until directly

The until keyword is part of the POSIX.2 shell specification, which does not include any sort of timeout functionality. It could be implemented in bash, but it would not be portable to other shells (Debian dash being the main concern).

This is the reason that it is implemented as a separate utility.

Search for "The until loop" below to see the specification.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...

PeterWhittaker · 2d ago
I tend to do something like this. Normally, I wouldn't include the extra jobs calls and extra echo calls, these are just to show what is happening.

  #!/usr/bin/env bash
  
  runUntilDoneOrTimeout () {
      local -i timeout=0
      OPTIND=1
      while getopts "t:" opt; do
          case $opt in
              t) timeout=$OPTARG;;
          esac
      done
      shift $((OPTIND - 1))
      runCommand="$*"
      $runCommand &
      runPID=$!
      echo checking jobs
      jobs # just to prove there are some
      echo job check complete
      while jobs %- >& /dev/null && ((timeout > 0)); do
          echo "waiting for $runCommand for $timeout seconds"
          sleep 1
          ((timeout--))
      done
      if (( timeout == 0 )); then
          echo "$runCommand timed out"
          kill -9 $runPID
          wait $runPID
      else
          echo "$runCommand completed"
      fi
      echo checking jobs
      jobs # just to prove there are none
      echo job check complete
  }
  
  declare -i timeopt=10
  declare -i sleepopt=100
  OPTIND=1
  while getopts "t:s:" opt; do
      case $opt in
          t) timeopt=$OPTARG;;
          s) sleepopt=$OPTARG;;
      esac
  done
  shift $((OPTIND - 1))
  runUntilDoneOrTimeout -t $timeopt sleep $sleepopt
oso2k · 1d ago
Another fun way to test connectivity in pure bash (you need a bash revision from the past 15 years) is

   timeout 5 bash -c 'cat < /dev/null > /dev/tcp/google.com/80'

Replace google.com and port 80 with your web or tcp server (ssh too!). The command will error/time out if there isn’t a server listening or you have some firewall/proxy in the way.
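Wrapped as a small reusable function (assumptions: bash compiled with /dev/tcp support and the coreutils timeout binary available; the host/port here are just examples):

```shell
# tcp_up HOST TIMEOUT_SECS PORT -> status 0 iff something accepted the connection
tcp_up() {
    timeout "${2:-5}" bash -c "exec 3<>/dev/tcp/$1/${3:-80}" 2>/dev/null
}

tcp_up 127.0.0.1 1 59999 || echo "no listener on 59999"
```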
halJordan · 1d ago
Timeout is an external program, though, not part of bash.
arjie · 1d ago
A friend recently showed me https://google.github.io/zx/api and it's actually quite enjoyable to use. Very close to a shell and LLMs know it quite well.
jiehong · 1d ago
This reminds me of bun with its shell api for JavaScript [0].

[0]: https://bun.sh/docs/runtime/shell

artursapek · 1d ago
ah reminds me of the jQuery glory days
minaguib · 2d ago
In _this_ particular case, you could just tell curl to internally timeout the request (via `-m`) instead of trying to manage the timeout on the process level
aidenn0 · 2d ago
Not really, since it's calling `curl` in a loop, and they want the loop to timeout. There's possibly a set of options to curl to make it retry for a certain amount of time but I don't know it off the top of my head.
pvtmert · 17h ago
i think curl already has extensive options to handle such cases.

there are multiple docs with nice explanations too, for example:

https://everything.curl.dev/usingcurl/downloads/retry.html

i guess the premise is applicable to other processes which do not implement timeouts themselves (eg: find)

broken_broken_ · 1d ago
That reminds me of a blog article I wrote some time ago, where “timeout” gets mentioned: https://gaultier.github.io/blog/way_too_many_ways_to_wait_fo...

It’s more useful if you are implementing this in a general programming language, not in the shell, or if you want to know how it works under the hood.

sllabres · 1d ago
From personal experience I would always recommend an output of how many retries were necessary, if one expects zero. Otherwise the retry loop can hide problems like an unreliable service or network until it's too late.
chii · 2d ago
could you instead just add a count of how many times the sleep was invoked, and then add that check to the `until` condition to quit after X sleeps?

You don't need `timeout` here, and you won't need to subshell another bash just to get the timeout to work.
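That suggestion looks something like this sketch (check_up is a hypothetical stand-in for the curl health check; here it never succeeds, to show the give-up path):

```shell
# Count sleeps instead of wrapping the loop in `timeout`.
check_up() { false; }   # stand-in probe that never succeeds
max_tries=3
tries=0
until check_up || [ "$tries" -ge "$max_tries" ]; do
    sleep 0.1
    tries=$(( tries + 1 ))
done
if [ "$tries" -ge "$max_tries" ]; then
    echo "gave up after $tries tries"
fi
```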

xelxebar · 2d ago
For the author's stated purpose, it sounds like this would work well. However, curl can take a long time to time out, depending on server state. I'm curious how people would approach guaranteeing a maximum time to response or error.
cb321 · 1d ago
It's not POSIX, but both bash & zsh have real time queries built-in (of course, $(date) is an even older school substitute):

    t1=$((EPOCHSECONDS + 60))
    while [ $EPOCHSECONDS -lt $t1 ]
    do # curl ... && break # or whatnot
    done
There is also EPOCHREALTIME which gives you a floating point to micro or nanoseconds (for bash/zsh), but only Zsh provides FP arithmetic. There are string-manipulation workarounds, of course. And, yes, with Zsh you might need a `zmodload zsh/datetime` in there.
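For example, a deadline loop on $EPOCHSECONDS (bash >= 5; check_once and the flag file are stand-ins for a real probe such as a curl call):

```shell
# Poll until check_once succeeds or the EPOCHSECONDS deadline passes.
flag="$(mktemp -u)"                      # path that appears when "ready"
check_once() { [ -e "$flag" ]; }
( sleep 0.5; touch "$flag" ) &           # simulate the service coming up
deadline=$(( EPOCHSECONDS + 5 ))
result=timeout
until check_once; do
    if [ "$EPOCHSECONDS" -ge "$deadline" ]; then break; fi
    sleep 0.2
done
check_once && result=ready
echo "$result"
rm -f "$flag"
```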

These variables seem "under known". EDIT: For example, you can get a quickie wall time measurement from a Zsh shell function like this:

    dt () {
        t0=$EPOCHREALTIME 
        "$@"
        printf "%.7f wallSec\n" "$((EPOCHREALTIME-t0))" >&2
    }
And then you can actually run in your shell

    dt echo hi, moon
without even a single fork/clone. (which you can confirm with an off to the side `strace -fv -o/dev/shm/dt.st -p WHATEVER_PID`, although I guess the culture these days is often to have even prompt printing launch a zoo of activity)
chii · 2d ago
In the case of curl, there is a timeout parameter for requests and connections etc. It's what i'd use for checking for a server being up.

But in the general case, where the command being invoked does not have such an option, then it does make a lot of sense to do a check like that via the `timeout` utility.

diggan · 2d ago
For curl:

--connect-timeout = Times out if the connection wasn't established within N seconds

--max-time = Times out if the entire request wasn't completed within N seconds

But then I don't remember if connect-timeout takes DNS lookups into account, or TLS handshakes. I seem to remember there is another sort of timeout that tends to be hard to get right when the connections are flaky/drops a lot of packets, so you end up having to wrap curl anyways if you want a hard limit on the timeout.

SoftTalker · 1d ago
I've had many, many cases of processes in Linux getting wedged up bad enough that they cannot be killed. Usually this seems to involve them waiting on I/O.
chgs · 1d ago
Kill -9 won’t help in those cases though
crabbone · 1d ago
Not really (at least, not very easily). There's no guarantee that for whatever reason curl won't hang.

To do it properly, you'd need some code before the loop to start a separate process that would check on the parent process... but, really, you don't want to go there, not in Bash anyways.

But, assuming curl won't hang, you could compare timestamps. It's better than counting iterations (in terms of emulating timeout command).

But then, you might want to get fancy and implement exponential backoff or whatever other strategy you fancy to not overload the whatever thing you are polling... again, probably not in Bash.

febusravenga · 1d ago
This is my attempt to reinvent wheel from several years ago: https://github.com/zbigg/bashfoo/blob/master/timeout.sh

This is very complex, because if you write lots of functions that call functions, you really just want to run something that inherits the whole environment of your process; that's why there is a control process and a sleep process, and a naive race to decide which finished first...

That's probably the reason I ignored the built-in timeout...

linsomniac · 1d ago
Anyone know why shell scripts can't set alarm(2)? I assume it's because the shell is already using it for its own needs.
lzy · 1d ago
Neat idea. I've definitely been burned by silent timeouts in production before. Curious how this handles more complex cases like nested async calls or third-party dependencies that don't expose good hooks. Would be cool if this could somehow integrate with logging tools directly for more visibility.
AtlasBarfed · 2d ago
Is there a language with a less standardized standard library than bash?

Is there an attempt anywhere to build a slightly modern standard library for bash scripts?

You know besides stack overflow?

t-3 · 1d ago
Busybox/coreutils/the userspace of your platform are the "standard library". The shell is basically just there for control flow and IO, everything else is just programs on your computer.
cpach · 2d ago
Would it really be worth the effort? When this level of complexity is reached, I personally think it’s better to use a more capable language, such as Python or Ruby.
ninkendo · 2d ago
Not only that, but the strength of bash is its ubiquity. (Or for that matter, posix sh, if you want even more ubiquity.) If we started adding lots of features to bash, it wouldn’t make sense to use them unless you are positive every place using your script has a new-enough version installed. Which defeats the purpose of the main use case for bash in the first place, which is (IMO) for portable scripts that will run on any Unix-like system.
AtlasBarfed · 1d ago
I may not like bash, but I sure as hell use it a lot...

And I like messy langs. My favorite language is groovy.

crabbone · 1d ago
Neither Python nor Ruby offer a simple interface to concurrency, serialization and terseness. Not even close. And aren't moving in the desired direction. Perl could have tried... but it's a separate can of worms, and still not quite there.

PowerShell is a missed opportunity. A project with a ton of resources dedicated by a company with bottomless coffers... which ended up being sub-par.

I wish there was a sensible alternative, but I haven't found one yet.

beej71 · 1d ago
POSIX and the Single Unix Specification are pretty much all you have.

I write a lot of shell scripts and they tend to be POSIX-compliant. For dependencies, you can use the `command` command to fail elegantly if they're not installed.

oweiler · 2d ago
Bash has no standard library. It has builtins, and commands. And commands are just external tools.
queuebert · 1d ago
Even '[' is an external binary in /usr/bin typically.
AStonesThrow · 1d ago
"Standard library" is sort of a C-specific term.

"builtins" are primitives that Bash can use internally without calling fork()/exec(). In fact, builtins originated in the Bourne shell to operate on the current shell process, because they would have no effect in a subprocess or subshell.

In addition to builtins and commands, Bash also defines "reserved words", which are keywords to make loops and control the flow of the script.

https://www.gnu.org/software/bash/manual/bash.html#Reserved-...

Many distros will ship a default or skeleton .bashrc which includes some useful aliases and functions. This is sort of like a "standard library", if you like having 14 different standards.

https://gist.github.com/marioBonales/1637696

'[' is an external binary in order to catch any shell or script that does not interpret it as a builtin operator. There may be a couple more. Under normal circumstances, it won't actually be invoked, as a Bash script would interpret '[' as the 'test' builtin.
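It's easy to check which one a given shell will actually run:

```shell
# bash resolves [ and test to builtins before any /usr/bin versions on PATH:
type -t [       # prints "builtin"
type -t test    # prints "builtin"
```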

eemil · 1d ago
Every time I learn a new Bash trick or quirk, it just pushes me further towards Powershell and Python for system administration.

Bash scripts are so hacky. With any other language, my thought process is "what's the syntax again? Oh right..", but with bash it's "how can I do this and avoid shooting myself in the foot?" when doing anything moderately complex, like a for loop.

alganet · 1d ago
https://github.com/shellfire-dev

https://github.com/shellspec

https://oils.pub/

There's probably more.

The shell has a ____wide____ userbase with many kinds of users. Depending on your goal, the rabbit hole can go very deep (how portable across interpreters, how dependant on other binaries, how early can it work in a bootstrap scenario, etc).

These are mine:

https://github.com/alganet/coral

https://github.com/alganet/shell-versions

https://github.com/Mosai/workshop

kazinator · 13h ago
> We were using the Bash built-in until to check if the web server was up:

That saves you a whole character of typing:

  until command ; do ... done

  -->

  while ! command; do ... done
tryauuum · 1d ago
why didn't he opt to use `timeout --signal=SIGKILL` instead of wrapping everything in extra bash to make it more killable?..
teo_zero · 1d ago
According to TFA you can only kill processes and "until", being a builtin, doesn't spawn any new process.
js2 · 1d ago
Retry is also a nice little utility that makes the retry loop easier:

https://github.com/minfrin/retry

exo762 · 1d ago
I have this in my .bashrc:

  function retry {  
      until $@; do :; done  
      alert  
  }
  export -f retry
Works reasonably well for non-scripting usecases.
anotherevan · 1d ago
TIL Bash has `until` as well as `while`!
yonatan8070 · 1d ago
I recently used timeout + tcpdump to bandaid over a race condition where sometimes a video streaming service started before the camera was ready and got stuck in a loop. So I just captured the video stream's port with tcpdump, then used timeout and tcpdump's exit code to tell if it's working or not
leitasat · 1d ago
Have you heard of exponential backoff? Tl;dr: make the sleep time dependent on the number of retries
deafpolygon · 1d ago
curl has a timeout setting

    --connect-timeout <seconds>
and retry:

    --retry <num>
so you could do

    curl --retry 5 --connect-timeout 10
TacticalCoder · 1d ago
I've got, since forever, an advanced Bash prompt. But I also don't want my Bash prompt to have any visible delay. So back in the day I came up with timeouts working in milliseconds (which, AFAIR, isn't the case for the timeout command, whose granularity is seconds at best?). It involved processes and killing etc., but it got me what I wanted: either an instant prompt with all the infos I want, or an instant prompt which may miss one or two infos. I much prefer that to "my prompt contains no information because that's quicker".

Been working flawlessly for 20 years: so flawlessly that I don't remember how it works.