
OK, let’s assume that a 10 MB JSON source was loaded into an opaque, non-null-terminated struct str_t {size_t; pchar;}. You have to parse a number at position ‘i’, and you have double parse_number(str_t). What’s the next obvious step?


You can keep the existing sscanf function and now strlen is O(1) so the bug is gone. Any questions?


I just can’t figure out the exact code.


I think your code would be pretty much the same, sscanf, strlen and all. The main differences would be the standard library's implementations of strlen and whatever function you use to read the file into a string in the first place.


  str_t json = loadfile();
  size_t offset = random();
  sscanf("%d", ?);
With an opaque str_t you can’t just write json[offset]. Should sscanf take an offset with every string (sscanf(fmt, s, off))? Should we copy a slice of json and parse it? Should str_t have a zero-copy mirroring ability (s2 = strmirror(s, off, len))? How many of these three are just snake oil that changes nothing?

It’s only pretty much the same until you try to write actual code with a new idea in mind.


You can offset your str_t by creating a new str_t that subtracts offs from the length and adds offs to the pchar. There is no need to keep track of the offset separately.
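A minimal sketch of that, assuming str_t is just a length plus a pointer. The field names `len` and `ptr` and the helper `str_offset` are made up for illustration; nothing upthread fixes them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout for the thread's str_t: a length plus a pointer,
// with no terminating NUL required.
struct str_t {
    std::size_t len;
    const char *ptr;
};

// Return a view of s starting at byte `off`. Nothing is copied: the new
// str_t points into the same buffer and carries the reduced length.
inline str_t str_offset(str_t s, std::size_t off) {
    assert(off <= s.len);  // size_t is unsigned, so check before subtracting
    return str_t{s.len - off, s.ptr + off};
}
```

A parser that takes a str_t can then be handed str_offset(json, offset) directly, and it sees only the bytes from offset to the end.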


Rust's &str type, or the non-unicode-assuming version &[u8], allow creating (sub-)slices, which probably matches your strmirror function. Except that the same syntax works for all (dynamic/static) length arrays, and even allows custom code to e.g. transparently wrap an SoA[0] representation.

[0]: https://en.wikipedia.org/wiki/AoS_and_SoA


Well, in C++, it would read:

  int target;
  sscanf(json+offset, "%d", &target);
Where str_t's operator+ would look roughly like:

  str_t str_t::operator+(size_t offset) {
    return str_t{size - offset, ptr + offset};
  }
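(It might look exactly like this if str_t's constructor threw on a negative size. That needs a signed size type, though: with size_t the subtraction wraps around instead, so offset has to be checked against size first.)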


Okay, I see what you're saying now. I haven't worked with C strings in a while. Python uses offset parameters or seek operations in various places, and C++ string streams have an inherent position too (C++ probably has a number of other ways to do it too...).


C++'s std::string_view is essentially your struct. You can check the methods it provides.
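For the parsing question upthread, something like the following should work with string_view. It is only a sketch: `parse_int_at` is a made-up helper name, and it uses std::from_chars (C++17) rather than sscanf, since from_chars takes an explicit end pointer.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Parse a decimal int starting at `offset` inside `json`.
// Returns std::nullopt if the offset is out of range or no number starts there.
inline std::optional<int> parse_int_at(std::string_view json, std::size_t offset) {
    if (offset > json.size()) return std::nullopt;
    std::string_view rest = json.substr(offset);  // O(1), no copy
    int value = 0;
    // from_chars reads only within [begin, end) and never looks for a NUL.
    auto res = std::from_chars(rest.data(), rest.data() + rest.size(), value);
    if (res.ec != std::errc{}) return std::nullopt;
    return value;
}
```

Note that from_chars does not skip leading whitespace or accept a leading '+', so the offset has to point at the first digit or at a minus sign.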


Yes, I’m aware of it. I’m just tired of these layman’s “oh, that’s another reason to ditch C strings” takes, when it has nothing to do with it. Working with offsets requires handling offsets and lengths, be it an explicit ‘off’ and ‘n’ or a string_view. All that is needed in this case in C is snscanf (note the ‘n’), so that it would know its limits a priori, like snprintf does. Sadly, that ‘n’ never made it into the standard.
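Since snscanf does not exist, here is a rough stand-in for the "%d" case only. `scan_int_n` is a hypothetical name, and the approach (copy a small bounded window, terminate it, call strtol) is just one way to give the scan a known limit. It is written C-style but compiles as C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the missing snscanf(s, n, "%d", &out):
// reads a decimal integer from at most n bytes of s, which need not be
// NUL-terminated. Returns the number of bytes consumed, or 0 on failure.
inline std::size_t scan_int_n(const char *s, std::size_t n, int *out) {
    char buf[32];  // fixed window, so the work per call is bounded
    std::size_t k = n < sizeof buf - 1 ? n : sizeof buf - 1;
    std::memcpy(buf, s, k);
    buf[k] = '\0';  // strtol now stops inside the window

    char *end = nullptr;
    long v = std::strtol(buf, &end, 10);
    if (end == buf) return 0;    // no digits found
    *out = static_cast<int>(v);  // sketch: no overflow check
    return static_cast<std::size_t>(end - buf);
}
```

The cost per call is bounded by the 32-byte window regardless of how large the surrounding buffer is. The obvious caveats: input with more than 31 bytes of leading whitespace or digits gets truncated, and values outside the range of int are not detected.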



