
> Disk intensive daily jobs shouldn't run if the system is being crushed.

ionice(1) works really well on Linux.
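For instance, a sketch of what that looks like in practice (the crontab entry and backup path are illustrative; the command here is a stand-in for any disk-heavy job):

```shell
# crontab entry (illustrative):
#   0 3 * * * ionice -c3 nice -n 19 /usr/local/bin/backup.sh
#
# -c3 puts the job in the "idle" I/O scheduling class, so it only gets
# disk time when nothing else is contending; nice -n 19 does the same
# for CPU. Demonstrated here with a stand-in command:
ionice -c3 nice -n 19 sh -c 'echo "disk-heavy work runs at idle priority"'
```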

> Have you had the fun of cron jobs getting stuck and seeing them just endlessly build up...

It sounds like this would be fairly easy to fix. Either set a timeout for each cron job, or make a mark when each job completes and don't start a second copy of that job if the mark is not found.
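A sketch of both fixes using stock util-linux/coreutils tools (the lock path and the one-hour budget are examples; the echo stands in for the real job):

```shell
#!/bin/sh
# flock -n exits nonzero immediately if a previous run still holds the
# lock, so stuck copies can't pile up; timeout kills the job if it
# exceeds its time budget.
flock -n /tmp/nightly-job.lock \
    timeout 1h sh -c 'echo "stand-in for the real nightly job"'
```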

Regardless, I don't see how these are reasons that cron needs to die. They are decent reasons for improving cron. ;)



> They are decent reasons for improving cron. ;)

Or improve one's usage of cron.

In the end you are dealing with unix. Take a whole bunch of small dumb things and combine them to do large (seemingly) smart things.

If "you" want a single monolithic smart thing, I guess you can always inquire at MIT about how their Skynet project is coming along.

I find myself reminded of a line I often hear, and sometimes use, when a task needs clarification: "do as I think, not as I say".


Came here to make a similar point about ionice and chrt. I'm amazed by the lack of familiarity with the power of the Linux schedulers.
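For the CPU side, chrt's SCHED_IDLE class is the analogue of ionice's idle I/O class; a quick sketch (the command is a stand-in):

```shell
# chrt -i 0 runs the command under SCHED_IDLE: it gets the CPU only
# when no other task is runnable. Pairs naturally with ionice -c3.
chrt -i 0 ionice -c3 sh -c 'echo "background work at rock-bottom priority"'
```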


I had forgotten about chrt!

Man, its functionality probably should be folded into nice(1) and renice(1). I'm kinda on the fence about whether or not ionice should also be folded into nice/renice.


I think there's an argument to be made that these things don't belong in cron; they belong in logic that lives elsewhere.

Do one thing and do it well.


> Do one thing and do it well.

The trick is determining what thing you're going to do.

I think it's perfectly reasonable for a job scheduler to have the ability to terminate jobs that run for too long and to ensure that no more than N copies of a given job are running simultaneously. This sort of thing also satisfies the DRY principle. :)


> Do one thing and do it well.

People really like to parrot this statement. Nobody is going to make it do "more" than a scheduling tool. It's not going to become a volume manager or start steering your clock...

Why aren't you angry about these redundancies:

  * sort's -u flag when there's uniq?
  * grep's ability to search recursively when there's find?
  * grep's existence when there's ed?
  * cron's existence when there's at?
  * cat's existence when there's < ?
We could play this game all day.
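The first of those redundancies checks out directly:

```shell
# sort -u and sort | uniq produce the same result; both print the two
# lines "a" and "b". (uniq alone only collapses *adjacent* duplicates,
# so it still needs sort in front -- the flag saves a process, not a
# concept.)
printf 'b\na\nb\n' | sort -u
printf 'b\na\nb\n' | sort | uniq
```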


Many of those things don't actually do the same thing.

at runs a scheduled job once and then terminates, cron runs scheduled jobs repeatedly.

ed/sed are powerful languages used for 'changing' data, not searching it. They _can_ be used to search, but they're very heavy for the task.

grep also searches the content of files recursively, whereas find does not; making find drive the search means forking a separate process per file, which is also very heavy.

cat and < are different in that one is shell redirection and the other is a program used to concatenate files, which redirection does not do. Of course many people "use it wrong", but that's neither here nor there in this discussion. https://en.wikipedia.org/wiki/Cat_(Unix)#Useless_use_of_cat


/me puts on Devil's Advocate hat

> at runs a scheduled job once and then terminates, cron runs scheduled jobs repeatedly.

Make recurring scheduled jobs reschedule themselves as their last action. This makes your job scheduler do one thing and do it well. :P
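A sketch of that devil's-advocate position (the job path is hypothetical, and atd must be running for at(1) to actually accept the submission):

```shell
# Write a self-rescheduling job: its last action re-queues itself with
# at(1), turning the one-shot scheduler into a recurring one.
cat > /tmp/nightly.sh <<'EOF'
#!/bin/sh
/usr/local/bin/do-nightly-work        # the real job (hypothetical)
echo "sh $0" | at 'now + 24 hours'    # reschedule as the last action
EOF
chmod +x /tmp/nightly.sh
```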

> [ed] _can_ be used to search but it's very heavy for the task.

So? ed was around before grep. Grep is therefore redundant.

> grep also searches content of files recursively, where as find does not.

The point there was that find can recursively create a list of files for grep to search, passing them along to grep. This means that grep's -r switch is redundant.
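Concretely, the find-drives-grep equivalent of grep -r looks like this:

```shell
# grep -r rebuilt from find + xargs. -print0/-0 keep filenames with
# spaces intact, and -- guards against patterns that look like options.
rm -rf /tmp/demo && mkdir -p /tmp/demo
printf 'needle\n' > '/tmp/demo/a file.txt'
find /tmp/demo -type f -print0 | xargs -0 grep -l -- 'needle'
# prints: /tmp/demo/a file.txt
```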

/me removes Devil's Advocate hat

> cat and < are different...

If I had more time today, I would figure out if one can abuse the hell out of shell redirection to replace cat. It sounds like a fun way to waste a half-hour.
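It's doable for text files with nothing but builtins and redirection; a toy sketch:

```shell
# A toy cat(1) from shell builtins and redirection only. Works for
# text; real cat also handles binary data, multiple files, and the
# no-trailing-newline case (this version always appends one).
shcat() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
    done < "$1"
}
printf 'first\nsecond\n' > /tmp/shcat-demo.txt
shcat /tmp/shcat-demo.txt   # prints: first, then second
```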


It can be argued that Systemd does do just one thing, and well.

Unix has a process model which has historically been lax. Things built on top of it tend to be shell scripts. Cron is probably the worst example of "do one thing well", because it does that one thing very poorly. It can only start stuff. If you want to avoid dogpiling, enforce timeouts, enforce resources, prevent multiply-forking jobs from leaving orphan processes, preserve historical per-job output, pause or delay jobs, manually start jobs, balance jobs across multiple nodes, etc. — then you just have to cobble that together yourself. I certainly have.

What "cron" and "/etc/init.d/something" and "scripts that run on ifdown/ifup" and "scripts that run on DHCP changes" and "scripts that run at specific times" have in common is that they are arbitrary processes whose lifetimes are bounded — by time (cron-like behaviour), dynamic events (such as networks going up or down), human input (manual start or stop) or other factors. There's no conceptual difference between a clock and a human action, for example. If job X is set to run at midnight, and I want to run it a little sooner today, couldn't I just trigger it manually? For Cron to have this, and for Cron to fix the other deficiencies I mentioned, it would have to replicate features to the point where it becomes... Systemd. Because Cron starting processes at specific times is just a special case of starting processes at boot (init) or starting processes when networking goes up (/etc/network/if-up.d or whatever).
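For the record, here's roughly what that special case looks like once it's folded into the general mechanism: a systemd service-plus-timer pair (unit names, the job path, and the specific directives are illustrative, not a drop-in config):

```ini
# nightly-report.service
[Unit]
Description=Nightly report job

[Service]
Type=oneshot
ExecStart=/usr/local/bin/nightly-report
# kill the job if it runs longer than an hour
TimeoutStartSec=1h
# disk time only when otherwise uncontended
IOSchedulingClass=idle

# nightly-report.timer
[Timer]
OnCalendar=*-*-* 00:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

And `systemctl start nightly-report.service` covers the "run it a little sooner today" case for free.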

Systemd is a process lifecycle supervisor. It tracks the lifetimes of processes. That's about it. Everything else is pragmatic problem solving. Just like grep's simplicity ("finds stuff by text") is clouded by the dozens of arcane flags it must support for real-world use cases (styles of regexps, reading patterns from a file, printing options), so must Systemd cater to many minor aspects.

It can be argued that Systemd could be less monolithic and more pluggable, but I feel that's another story.


> It can be argued that Systemd does do just one thing... Systemd is a process lifecycle supervisor. It tracks the lifetimes of processes. That's about it.

It's also a /dev node manager, and a syslogd replacement, and a cron replacement, and a dhcpcd replacement, and a mount/umount, insmod/modprobe reimplementation, [0] and a bootloader, [1] and...

The systemd project is huge, sprawling, and continues to grow.

> Cron is probably the worst example of "do one thing well", because it does that one thing very poorly. It can only start stuff. [If you want something sophisticated you inevitably have to implement systemd.]

It turns out that you don't! Check out fcron. Its crontab is described here: [2]. If you're impatient, search for "The options can be set either for every line".

> Things built on top of [Unix's process model] tend to be shell scripts.

The phrase "shell scripts" gets tossed around like it's a slur.

What are the essential differences between a Bash program, a Python program, a TCL program, an Erlang program, and a C program that all do the very same thing? What makes "shell scripts" so undesirable?

[0] Given the number of times things like recursive bind umounts, module loading/unloading, and more core stuff have broken on systemd-enabled systems and only such systems, it's almost impossible to believe that mount/umount, insmod/modprobe & etc. aren't being reimplemented in the systemd project. ;)

[1] https://wiki.archlinux.org/index.php/Systemd-boot

[2] http://fcron.free.fr/doc/en/fcrontab.5.html


> Check out fcron ...

Well.

* Can it let me start jobs manually?

* Can I defer their scheduling programmatically without editing a file to comment it out?

* Can I add or remove new jobs programmatically and atomically?

* Can I ask it about the status of a job?

* If a job fails, is it retried, up to a limit?

* Can I give it a time budget after which the job is killed?

* Can I give it a resource budget (e.g. CPU usage)?

* Can it log the output tagged with an identifier that identifies the job uniquely across time?

* Can it notify me by some programmatic mechanism (so that I can pipe it to a webhook, say) if a job fails?

* Can I programmatically start jobs, edit jobs, etc. from a different host?

* Can I ensure that only a single instance of the job runs in a cluster, if multiple hosts have been configured with the same job?

* If the job runs so long that it overlaps with the next scheduled period, can it ensure that the next job doesn't start yet?

* If the job forks, can it ensure that the children are cleaned up?

These are some of the features that I want from a job scheduler. Yes, some of it can be cobbled together with wrapper scripts, manual cgroups, and so on. But I don't want to write all of that.

> What are the essential differences between a Bash program

Shell scripts are brittle. That's hardly debatable. You can write solid shell scripts, but people generally don't because it takes a lot of effort and arcane knowledge to do it successfully.

A lot of people don't know about classic safeguards such as "-e", "-o pipefail", exit traps and so on. Some of the safeguards you want are hard to do safely and atomically.

Unix has a nice pipe system, but shell scripts that need to drive a lot of things invariably mess it up because everything pipes to the same place. You get errors and information output intermingled and there's often no way to tell who emitted it without going step by step.

Throughout my 20+ years of using Unix variants, shell scripts and "shell-scripty interdependencies" have always been a source of problems, usually related to things like escaping, quoting, option parsing, alias expansion, variable expansion, bad toolchain detection (e.g. assuming your "ls" is GNU), ignoring exit codes, bad output parsing, etc.

"find | xargs" works just fine until the day you encounter a file name containing spaces. Then you do "find -print0 | xargs -0" and you're happy until the next edge case comes along that touches on space handling, of which the shell language family is rife. It gets worse when shell scripts try to be nice over SSH.

At least languages like Python and Erlang have first-class support for primitives, even if their dynamic typing leads to a lot of potential situations where the runtime parameters aren't what the author assumed, and things blow up because someone didn't sufficiently duck-type. I'm not a huge fan of Go, but Go is a surprisingly good fit for command-line tools, and its strictness and static typing helps avoid a whole class of errors.

I love shell scripts, and I write them nearly every day, but I would never build an OS on top of them.


Several of the items in your bulleted list make it seem like you either didn't read all of, or didn't thoroughly understand fcron's crontab manpage. I urge you to go back and re-read the documentation, or -at least- the Options section.

I'll address the items that aren't covered by the documentation that I linked to. :)

> * Can it let me start jobs manually?

> * Can I ask it about the status of a job?

Yep. [0]

> * Can I defer their scheduling programmatically without editing a file to comment it out?

I'm not sure what it means to "defer" scheduling. That sounds like changing the time a job starts. If I'm wrong about that, please do let me know. :)

Anyway. Every job scheduling system stores its non-volatile info on disk. fcron's full-featured interface to its job scheduling system is found in [1]. I don't know if that meets your requirement.

> * Can I programmatically start jobs, edit jobs, etc. from a different host?

Yep. [0] [1]

> * Can I add or remove new jobs programmatically and atomically?

AFAICT, yes. [1]

> * Can it log the output tagged with an identifier that identifies the job uniquely across time?

> * Can it notify me by some programmatic mechanism (so that I can pipe it to a webhook, say) if a job fails?

Yep. That's in the email that it sends. What your reporting infrastructure does with that email is up to you.

> * Can I ensure that only a single instance of the job runs in a cluster, if multiple hosts have been configured with the same job?

AFAIK, Systemd doesn't do this, so I don't know why you're asking for it. (Other than the fact that it would be rather nice to have.) :) But, no. AFAICT, fcron runs on a single host. I have heard about chronos, though. [2]

So, I need to preface the rest of this commentary with the following: Bash is clunky, kinda unwieldy, and very, very far from my favorite language. There are several reasons why I select Python for scripts that get much more complex than "rather simple". Please keep this fact in mind while reading the rest of my commentary. :)

> Shell scripts are brittle. ... You can write solid shell scripts, but people generally don't because it takes a lot of effort and arcane knowledge to do it successfully.

But C programs written by people who can't be arsed are also brittle and -even worse- the language itself is filled with hidden and subtle pitfalls!

> Unix has a nice pipe system, but shell scripts that need to drive a lot of things invariably mess it up because everything pipes to the same place [and this makes unwinding intermingled output difficult sometimes].

So, why not religiously do something like using logger(1) with appropriate tags and ids? I mean, you actually have to work a little to get reasonable logging in (almost?) every language. Hell, -IIRC- C doesn't ship with anything substantially more sophisticated than printf(3) and friends. ;)
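Concretely, the discipline being suggested might look like this (the job and the tag scheme are hypothetical stand-ins):

```shell
# Tag each run's output with a unique id so intermingled streams become
# a syslog filtering problem rather than a forensic one. -s also echoes
# the message to stderr, handy when no syslog daemon is listening.
JOB_ID="nightly-$(date +%Y%m%dT%H%M%S)"
run_job() { echo "stand-in for the real job's output"; }
run_job 2>&1 | logger -s -t "cron/$JOB_ID"
```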

> Then you do "find -print0 | xargs -0" and you're happy until the next edge case comes along that touches on space handling...

How does

  IFS=$'\n'
and/or the

  "${VAR}"
pattern not fix all space handling problems when using bash? (Bash is one of my weaker languages, so if you know, I'm genuinely interested in hearing about it.)

However, if your answer is something like "Some other script you use might screw things up!", then my reply is "Bash and sh programs are not the only languages in which we might find an erroneous program on which we depend.". :)

> It gets worse when shell scripts try to be nice over SSH.

How's that? I'm seriously asking, for my own edification.

> I love shell scripts, and I write them nearly every day, but I would never build an OS on top of them.

I don't think that I, or anyone else who has an informed opinion on the topic of sysvrc replacements, have ever seriously suggested that one build an entire OS on top of shell scripts. Frankly, they're insufficiently powerful (not to mention [comparatively speaking] too damn slow) to reasonably accomplish the task.

However, OpenRC -and so many other RC and init systems- demonstrate that shell scripts can provide rather powerful, relatively foolproof methods for a packager or service author to control services on their system.

> At least ... Erlang [has] first-class support for primitives, even if their dynamic typing leads to a lot of potential situations where the runtime parameters aren't what the author assumed, and things blow up because someone didn't sufficiently duck-type.

I am totally, seriously, not (!!) getting on your case here, but how much work have you done with Erlang? I'm a novice-to-middling Erlang programmer, and the situation you described seems... difficult to run into unless you're unusually careless. (Python, OTOH...)

> I'm not a huge fan of Go...

Me either! It's so opinionated. Does it still consider unused imports to be compile-stopping errors? :(

[0] http://fcron.free.fr/doc/en/fcrondyn.1.html

[1] http://fcron.free.fr/doc/en/fcrontab.1.html

[2] https://aphyr.com/posts/326-call-me-maybe-chronos


I could talk about this all day, but I don't have the time, so I'll just address some of your more important points.

Fcron has some features that I like and that I missed when I first looked at it (the documentation is very poor: "dialog dyn-amically with a running fcron daemon" didn't really invite me to think that it had a job scheduling UI), but it's not sufficiently built out that I would be able to trade my current toolset for it. For example, it seems to lack any process isolation. Email-only reporting is not acceptable. And so on.

I agree that Fcron is a better Cron. But I also think that it's a very poor Systemd. Like Systemd, it controls the lifetime of processes; it has an overlapping feature set (job files, backing store, control UI) and if one made it complete, it would look a lot like Systemd. I'd rather have one tool do process orchestration, rather than two.

> But C programs written by people who can't be arsed are also brittle

Certainly. But it's hard to write C programs that fail for the same half-assed reasons that shell scripts do. C programs suffer from other problems. I'd much rather have a C program segfault on bad input right away than have a shell script happily continue to run because it ran some text file through tr, got a different result than it expected, stored it in a variable, passed it badly-quoted to some tool 27 lines down in the script, and triggered an impossibly obscure error message unrelated to the original cause.

But the real alternative isn't C, of course. It's one of the languages you mentioned. Ruby, Python, Erlang, Go, Nim, etc. all improve on the historical sloppiness of Unix.

> How does ... not fix all space handling problems when using bash?

There's much more to it than that, I believe.

For example, preserving quoted arguments is a nightmare. Let's say you read some file that contains command-line options. It contains something like:

    -x -y --string="hello world"
You want to read that, preserving quoting, into an environment variable, which you then want to modify and pass on to another script. Turns out this is hard. Normally, quoting "$foo" works; but the moment you fiddle with it, the quoting is lost. There's a neat bash trick few people know about which exploits its array support:

    #!/bin/bash
    # Read saved command-line options, preserving their quoting.
    configopts=$(cat configfile)
    # eval splits $configopts into array elements while honouring the
    # embedded quotes, so "hello world" stays a single token.
    eval ARGS=($configopts)
    ARGS+=("--anotherflag")
    ARGS+=("$@")
    # "${ARGS[@]}" expands each element as a separate, intact word.
    ./somecommand "${ARGS[@]}"
Throughout this script, $ARGS' internal token structure is preserved, even when it's manipulated. There are other ways to accomplish this, but none of them are pleasant, and this way is quite pleasant, once you get past the weirdness. But I would bet that most developers don't even know that bash has first-class array support. (If you know a better solution,

This is just one horrible corner case where a naive implementation would blow up. As I said originally, it's possible to write safe, resilient, mostly well-behaved shell scripts. But it's hard; you have to know about every possible pitfall [1] and divergent behaviour on GNU and BSD. Shell scripts implement most of their functionality through external commands, not functions, so you cannot assume much about the interface of commands. This takes time and effort. But shell scripts are easy to write, which invites sloppiness. Few people write unit tests for shell scripts. Few people use something like autoconf to sniff the runtime environment (e.g., GNU vs BSD).

As for "entire OS built on shell scripts", every Unix distro out there is built on tons of shell scripts. Even Upstart and Systemd can hardly avoid using a bit of scripting.

As for SSH, you're in a pretty good spot if you send your script using a quoted heredoc (<<'EOF') and disable the tty. Otherwise you run into quoting issues fast, and scripts that expect interactivity will hang. There might be some other corner cases I forget. Controlling SSH from scripts has always been painful and brittle for me.
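To make the heredoc point concrete, here's the shape of it, using bash -s locally as a stand-in for the remote shell (over the wire the same pattern is ssh -T user@host bash -s):

```shell
# The quoted delimiter ('EOF') ships the script verbatim: nothing
# expands on the local side, so there are no quoting games.
bash -s <<'EOF'
set -eu
echo "runs on the far side; $HOME expands there, not here"
EOF
```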

> [Erlang] ... difficult to run into unless you're unusually careless

Sure. But it's there. Erlang's duck typing means it's susceptible to sloppy inputs just like any other duck-typed language. It's a lot better than the shell script situation, to be sure.

[1] http://mywiki.wooledge.org/BashPitfalls


> (the documentation is very poor: "dialog dyn-amically with a running fcron daemon" didn't really invite me to think that it had a job scheduling UI)

I... uh... read the documentation -rather than just the section headings- and immediately understood that it had a job scheduling UI. :) I understand that we're all busy, but it does pay to take fifteen minutes or an hour every couple of weeks to sharpen the axe.

> it's not sufficiently built out that I will be able to trade my current toolset for it.

I never was claiming that you should. I was addressing your initial claim, mentioned a little bit later in this comment.

> For example, it seems that it lacks any process isolation.

Linux does that.

> Email-only reporting is not acceptable.

Consuming email is trivial. There are something like 349024823904 libraries and like a handful of good ones to do so in every popular language. :)

> I agree that Fcron is a better Cron. But I also think that it's a very poor Systemd.

Your initial assertion was that the only way to get a good cron was to implement systemd. I demonstrated that that's not true, and you countered with "Well, that cron's not systemd!"!

sigh.

> I'd much rather have a C program segfault on bad input right away...

You and I both know that it's trivial to write a C program that does exactly the same thing that your hypothetical poorly-written bash program does. :) Bash didn't invent Heisenbugs.

> It's one of the languages you mentioned. Ruby, Python, Erlang, Go, Nim, etc. all improve on the historical sloppiness of Unix.

/s/Unix/Bash or sh/ and I agree with your statement. I also note that Perl (shudder) is conspicuously absent from your list. :)

> There's much more to it than that, I believe. For example, preserving quoted arguments is a nightmare.

Thanks for the examples! I'll go over them over the next couple of days and see what I can learn from them. :D

> But I would bet that most developers don't even know that bash has first-class array support. (If you know a better solution,

Did you accidentally a sentence or two? Also, the recurring theme that I've been spotting here has been "Programmers who don't know their language often write bad or broken programs in that language.". Is this not one of the larger themes of your statements? :)

> Controlling SSH from scripts has always been painful and brittle for me.

Sorry, this is something that I've not yet had call to do. What exactly do you mean by "controlling SSH from shell scripts"? Something like

   ssh user@host "someRemoteCommand $AND_A_SCRIPT_VAR_TOO"
embedded in your script? (Or the equivalent here-document?)

> As for "entire OS built on shell scripts", every Unix distro out there is built on tons of shell scripts.

That hardly means that the entire OS is built on shell scripts. I argue that the vast majority of the shell scripts out there are to handle process startup. While this is quite important, it's important to remember that the stuff that actually gets work done in the OS is very rarely written in Bash.

> Erlang's duck typing means it's susceptible to sloppy inputs just like any other duck-typed language.

I see what you're saying (and agree with your final sentence), but one could just as justifiably say

"C and C++'s ability to permit the programmer to override the specified type of a given variable means that such programs are susceptible to errors of sloppy logic. We should really be using Haskell to write our system management programming." ;)

Anyway. Thanks for the ongoing conversation. This has been quite informative.


Actually, you pointed me to Fcron, to which I countered that for Fcron to be a satisfactory scheduler (that is, satisfactory to me) it would have to become Systemd. I think I was pretty clear on that, actually.

No, I really did mean "historical sloppiness of Unix." It's a fine foundation, circa 1985, but it's time to make some progress. For example (just a minor example), the fact that Cron's error reporting is email is ridiculous. This means that the only way to collect reports is to either use a mail client, or to set up a special rule (in Postfix or whatever) that pipes incoming mail to a special directory or a process. Both options are insane, not to mention horribly brittle. Surely Cron violates the Unix principle (the "do one thing well" one) here; the right thing would be to not support email at all, but to mandate that every error report is piped through a per-job command. So PIPETO rather than MAILTO. If you really want email, according to the Unix principle, you just pipe to sendmail, after all. But a whole lot of the traditional Unix toolset is weirdly crufty this way.
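A sketch of what a PIPETO-style wrapper looks like today, bolted on from the outside (all names are stand-ins; the handler could just as well post to a webhook):

```shell
#!/bin/sh
# On failure, pipe the job's captured output to an arbitrary handler
# command instead of mailing it.
job()     { echo "something broke"; return 1; }     # stand-in job
handler() { cat >> /tmp/job-failures.log; }         # stand-in webhook
if ! out=$(job 2>&1); then
    printf '%s\n' "$out" | handler
fi
```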

As for shell scripting, my original point was precisely that the shell language family is very hard to get right, and so it's a poor foundation to build complicated, resilient moving parts on top of, emphasis on "moving". The situation is much worse than C; the class of possible bugs is completely different, and much more damaging. Things like process management should not be done using shell scripting. Lots of things suffer from bad scripting; autoconf, for example.

By controlling SSH: Yes, I mean interacting with SSH commands. You have to cover a bunch of pitfalls: disabling tty, disabling the control master multiplexing (has never been stable), always using heredocs, etc.


Cobble together, composable, same diff, no?



