Using Askgit – A SQL interface to your Git repository (willschenk.com)
176 points by goranmoomin on Aug 16, 2020 | 30 comments


What would be nice is a way to plug in language-specific semantic analysers. Then it would be possible to run queries like "which commits changed a specific function" and so on.


That's in some sense already native git functionality.

(Though I recognize that the entire askgit here is essentially a layer on top of some existing functionality).

But changes to a specific function is a particularly underused feature in git.

You do `git log -L :nameOfFunction:fileItLives.ext` and git will show you changes to the function over time.

Of course that syntax is completely arcane (which contributes to its underuse), but e.g. for Python I just wrote a trivial thing to let me instead write `git pylog foo.bar.baz` and it translates to the right syntax above (that lives here: https://github.com/Julian/dotfiles/blob/76bd63f6c9a2c650c185...)
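
For illustration, the kind of translation described above could look roughly like the sketch below. It is not the script from the linked dotfiles: the function name `pylog_args` is hypothetical, and it assumes a dotted path maps to a file by swapping dots for slashes and appending `.py` relative to the repository root (no handling of `__init__.py` packages or methods on classes).

```python
def pylog_args(dotted_path: str) -> list[str]:
    """Turn 'pkg.module.func' into arguments for `git log -L :func:file`.

    Hypothetical helper: assumes every component before the last one is a
    directory or module name, and that the module is a plain .py file.
    """
    *module_parts, function_name = dotted_path.split(".")
    if not module_parts:
        raise ValueError("expected at least 'module.function'")
    # Assumption: foo.bar -> foo/bar.py, relative to the repository root.
    file_path = "/".join(module_parts) + ".py"
    return ["git", "log", "-L", f":{function_name}:{file_path}"]


if __name__ == "__main__":
    import subprocess
    import sys

    # Example: python pylog.py foo.bar.baz
    subprocess.run(pylog_args(sys.argv[1]), check=False)
```

Building the argument list separately from running it keeps the translation easy to check without a repository at hand.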


I have always wondered why GitHub doesn't add such functionality. It's a no-brainer how much it would help to be able to right-click a function and see how it has changed through history. It could use an interface where you scroll through time (e.g. the more you scroll, the older the versions you see, until you reach the point where it was created); same thing for files.


Maybe some day, after their search develops even the faintest idea of what programming language lexemes look like.


Presumably you’ve heard about git-blame, which is available on GitHub.


Yeah, I find it of little use due to it being syntax-agnostic (e.g. it can't tell me about the changes to a function because it doesn't know what a function is). If you are talking about me asking for the same thing for "files", I meant being able to see all the versions of a file on a single page (maybe using "infinite scroll" in case there are too many).


I'm the creator of AskGit - and this is something I've been thinking a bit about, unsure of the best way to implement and very open to ideas! Adding language specific analysis (this line is a comment, this is a function name, etc) is really interesting to me and I think could really kick this tool up a notch (there are still some more fundamental needs I'd like to address though, such as diffs and blames, query performance, etc).

I've been considering leveraging the syntax highlighting implementation in editors to make this possible (rather than something as heavy handed as parsing a "universal" AST, though I know there are projects to do this)...


Visual Studio has a limited version of what you want: if you start with a function it can show you the revision history.

https://docs.microsoft.com/en-us/visualstudio/ide/find-code-...

I think Rider has it as well.


Yep, JetBrains IDEs can show git history for a code selection. I've been finding it immensely useful.


I’m an absolute fanboy of all the efforts to bolt on SQL interfaces to things. I love this. Kudos to the creator.


If you like this, check out OctoSQL[0], it bolts SQL onto JSON, CSV, Excel and Parquet files.

It also lets you join them with each other (and other databases).

[0]:https://github.com/cube2222/octosql


There is also xsv [0] for CSV SQLing

[0] https://github.com/BurntSushi/xsv


For this and the above +1 to you both and thanks.


You are maybe aware of this already then but someone here might like to know that the Fossil DVCS is built on top of SQLite, by the same author (edit as pointed out by Gaelan:) as SQLite.


To be clear, the same author as SQLite, not Askgit.


Thanks for pointing that out. I thought it was obvious from the context, but I'm glad someone tells me when it is not.


I did not know that. Thank you!


Sqlite has an api for "virtual tables" that provides a fairly easy way to bolt SQL on top of things. https://www.sqlite.org/vtab.html

I haven't used it directly, but I did use a Perl module that hooks into it, and that made it very easy to cobble together. https://metacpan.org/pod/release/SALVA/SQLite-VirtualTable-0...


The VTAB extension is new. The documentation is a bit thin compared to the rest of SQLite. (To be clear: the documentation is good; just not fantastically awesome like the rest of SQLite.)

The main Achilles' heel of VTAB is wrapping your head around the indexing API: the basic idea is you need to tell the query planner which of a set of columns is best to work with. It is exceedingly unintuitive. This is compounded by the problem that the planner essentially refuses to do binary-search queries on more than one column of a VTAB. (Hell -- getting it to use any binary-search on any index is nearly impossible.) The result is that unless you're in a real bind, you should always translate your custom VTAB data into a reified table.

This is all compounded by the fact that the slightest misstep in the callback API will corrupt/confuse/lock up SQLite (the in-memory representation -- your data is still safe). I'm glad VTAB exists, and I know it's just a side-effect of the compression support, but goddamn it's frustrating for it to be almost awesome...
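
As a rough sketch of the "reified table" advice above: copy the rows into an ordinary SQLite table and give it a real multi-column index, so the planner can search on both columns. Everything named here (`commits`, `materialize`, `idx_commits_author_date`, the sample rows) is hypothetical; with a real virtual table the copy step would more likely be an `INSERT INTO ... SELECT ... FROM` the virtual table.

```python
import sqlite3


def materialize(conn: sqlite3.Connection, rows) -> None:
    """Copy (hash, author, committed_at) rows into a plain, indexed table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS commits ("
        "hash TEXT PRIMARY KEY, author TEXT, committed_at TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO commits VALUES (?, ?, ?)", rows)
    # An ordinary composite index: equality on author, range on date.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_commits_author_date "
        "ON commits (author, committed_at)"
    )
    conn.commit()


def commits_by_author_since(conn: sqlite3.Connection, author: str, since: str):
    """Return commit hashes by `author` at or after the ISO date `since`."""
    cursor = conn.execute(
        "SELECT hash FROM commits "
        "WHERE author = ? AND committed_at >= ? ORDER BY committed_at",
        (author, since),
    )
    return [row[0] for row in cursor]


# Made-up sample data standing in for rows read from a virtual table.
sample_rows = [
    ("a1", "alice", "2020-08-01T10:00:00"),
    ("b2", "bob", "2020-08-02T09:00:00"),
    ("c3", "alice", "2020-08-10T12:00:00"),
    ("d4", "alice", "2020-07-15T08:00:00"),
]
conn = sqlite3.connect(":memory:")
materialize(conn, sample_rows)
recent = commits_by_author_since(conn, "alice", "2020-08-01")
```

The trade-off is staleness: the copy has to be refreshed whenever the underlying data changes.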


Bummer. Perhaps over time these pain points will be remedied.


It'd be interesting to see some examples of queries that operate on the graph structure of the repo.

mercurial offers a language where you can build expressions to select revsets:

E.g.

> Changesets mentioning "bug" or "issue" that are not in a tagged release:

hg log -r "(keyword(bug) or keyword(issue)) and not ancestors(tag())"

https://hg.mozilla.org/mozilla-central/help/revsets
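
For comparison, here is a sketch of how that revset might be expressed in SQL with a recursive common table expression, run through Python's built-in `sqlite3`. The `commits`, `parents` and `tags` tables and the sample history are made up for illustration; this is not askgit's schema, which may not expose parent edges in this form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE commits (hash TEXT PRIMARY KEY, message TEXT);
    CREATE TABLE parents (child TEXT, parent TEXT);  -- one row per DAG edge
    CREATE TABLE tags (name TEXT, hash TEXT);

    -- c1 <- c2 <- c3 <- c4, with c5 branching off c2; v1.0 tags c3.
    INSERT INTO commits VALUES
        ('c1', 'initial import'),
        ('c2', 'fix bug in parser'),
        ('c3', 'release prep'),
        ('c4', 'fix issue 12'),
        ('c5', 'bug: handle merge');
    INSERT INTO parents VALUES
        ('c2', 'c1'), ('c3', 'c2'), ('c4', 'c3'), ('c5', 'c2');
    INSERT INTO tags VALUES ('v1.0', 'c3');
    """
)

UNRELEASED_MENTIONS = """
WITH RECURSIVE released(hash) AS (
    SELECT hash FROM tags              -- tagged commits themselves
    UNION                              -- UNION de-duplicates shared ancestors
    SELECT p.parent
    FROM parents AS p JOIN released AS r ON p.child = r.hash
)
SELECT hash FROM commits
WHERE (message LIKE '%bug%' OR message LIKE '%issue%')
  AND hash NOT IN (SELECT hash FROM released)
ORDER BY hash
"""

unreleased = [row[0] for row in conn.execute(UNRELEASED_MENTIONS)]
```

In this made-up history `c2` mentions "bug" but is an ancestor of the tagged commit, so only the commits outside the release are returned.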


What are good APIs to git, besides git itself? Something high level, ergonomic, and with write operations, not just querying?


libgit2[0] has a comprehensive API and bindings for dozens of languages.

I also like go-git[1] - a pure Go implementation with idiomatic Go API and cool features like in-memory filesystem.

[0] https://libgit2.org/

[1] https://github.com/go-git/go-git


This is really nice. I experimented converting natural language to git log commands this week (https://twitter.com/danielbigham/status/1294461750251839489), but mapping natural language to git sql might be better and more flexible in some cases.


If adding a SQL interface makes a system easier to access, that says a lot about how convoluted the original interface is...

Next logical step : a kernel level SQL interpreter integrated in systemd.


Actually I think it says a lot about how expressive SQL is.

I get why a lot of developers take swipes at SQL - it is a bit different to other languages. But you cannot beat it for what it does.

Another ‘great’ language is XSLT - so far I’ve seen nothing else that comes close for transforming data. It’s just a shame it’s so closely tied to XML which has understandably fallen out of favour


I’m not sure there’s much truth to your statement. There’s nothing inherently complex about git’s internal data structure nor does putting a SQL interface on something indicate a level of complexity.


As arankine noted there is the platform independent osquery. There is also SQL for WMI [1], which predates osquery, I believe.

[1]: https://docs.microsoft.com/en-us/windows/win32/wmisdk/queryi...


There is OSQuery (https://osquery.io/)!


Most people only need to know git commit, git push, git pull, and just a handful of commands.

It is by design that something like this is not included in the original interface. It would be terrible design.



