
Wow, this compiles fast. And the Flex and Bison files are there. This seems hackable. Nice work!

My only questions are (they are always the same with any purported sed/awk "replacement"):

1. what was the problem you were trying to solve where sed and awk failed you, and

2. does this program operate line by line or does it read entire files into memory?

I had to deal with some JSON a while ago and threw together some sed like this just so I could read it:

    sed '
    s/,/&\
    /g;
    /^$/d;
    s/^[{][^}]/\
    &/g; 
    /\"/s/,/<##eol##>/g;
    s/ *//;
    ' |tr '\012' '\040' |sed '
    s/<##eol##>/\
    /g;
    s/\[/\
    &\
    /;
    s/\]/\
    &\
    /;
    s/,  /, /g;
    '
But any difficulty I have dealing with JSON I attribute to the pervasive use of JSON, not to sed.
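For comparison, Python's stdlib gets to the same readability goal as the sed pipeline above in one call (sample JSON invented for illustration; this builds the whole structure in memory, which is exactly the trade-off discussed below):

```python
import json

# Hypothetical input, standing in for the JSON the commenter had to read.
raw = '{"name": "jq", "tags": ["sed", "awk"], "stars": {"count": 1}}'

# json.loads parses the full value into memory; json.dumps re-serializes it
# with indentation and one key per line -- the readability the sed hack is after.
pretty = json.dumps(json.loads(raw), indent=2)
print(pretty)
```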


Thanks!

1. I think your example shows sed/awk's failings with JSON data :) I don't want to write a JSON parser by hand every time I want to pull a field out of an object, and parsing recursive structures with regexes is never a good plan.

2. It reads JSON items from stdin into memory, one at a time. So if the input is a single giant JSON value, it all goes into memory, but if it's a series of whitespace-separated values they'll be processed sequentially.

It's cat-friendly: if you do

    cat a | jq foo; cat b | jq foo
then it's the same as doing

    cat a b | jq foo
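The one-value-at-a-time behavior described in point 2 can be sketched with Python's stdlib (an illustration of the idea only, not jq's actual C implementation):

```python
import json

def iter_json_values(text):
    """Yield whitespace-separated JSON values from a string, one at a time."""
    dec = json.JSONDecoder()
    i, n = 0, len(text)
    while i < n:
        # raw_decode does not skip leading whitespace itself, so do it here.
        while i < n and text[i].isspace():
            i += 1
        if i == n:
            break
        # raw_decode returns the parsed value plus the index where it ended,
        # so each value is handed on before the next one is even looked at.
        value, i = dec.raw_decode(text, i)
        yield value

for v in iter_json_values('{"a": 1} {"b": 2}\n[3, 4]'):
    print(v)
```

This also shows why the tool is cat-friendly: processing two inputs separately yields the same values as processing their concatenation.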


1. But those are general statements. Opinions. What I mean is give me a specific case. A specific example, a specific block of JSON and a specific task. Once I have that, then I can ask myself "Is this something I would ever need to do or that I have to do on a regular basis?"

Sometimes I need to write one-off filters. There is just no getting around it. I have to choose a utility that gives maximum flexibility and is not too verbose; I don't like to type. Lots of people like Perl and other similar scripting languages for writing one-off filters. But Perl, _out of the box_, is not a line-by-line filter; unlike sed/awk, it needs more memory. That brings us to #2.

2. If I understand correctly, jq reads the entire JSON value into memory. This is what separates your program, and so many other sed/awk "replacements", from the programs they purport to replace. sed and awk don't read entire files into memory; they operate line by line with a reasonably small buffer. Any sed/awk "replacement" would have to match that. Because they never read an entire structure (JSON, XML, etc.) before processing it, sed and awk are ideal for memory-constrained environments. (As long as you don't overfill their buffers, which rarely happens in my experience.)
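The line-by-line model described here can be sketched as a Python generator (a toy stand-in for a sed/awk one-liner, not how sed is implemented):

```python
import io

def upcase_filter(lines):
    """A sed/awk-style filter: process one line at a time, hold only that line."""
    for line in lines:
        yield line.upper()

# A file object streams lines lazily, so memory stays bounded by the longest
# line rather than the file size -- the property the commenter is asking for.
src = io.StringIO("foo\nbar\n")
for out in upcase_filter(src):
    print(out, end="")
```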

Anyway, so far I like this program. Best JSON filter I've seen yet (because I can hack on the lexer and parser you provided).

Well done.



