GHunt – An OSINT tool to extract information about a Google account (github.com/mxrch)
247 points by jennifer_lopez on Oct 3, 2020 | 39 comments


Unless something changed lately, this is a little deceiving.

If you check your own account, it finds albums and can even download some shared photos.

If you check the same account using another account's cookies... surprise, surprise: "No album" appears.

# Own account

    Name: ---public name----

    Last profile edit : 2019/12/01

    Email : ----------@gmail.com
     Google ID : ------id------

    Hangouts Bot : No

    Activated Google services :
     - Photos
     - Maps

    Google Photos : https://get.google.com/albumarchive/------id------
     => 4 albums, 50 photos

    Searching metadata...
     [+] 1 device found !

    - Apple iPhone 7 (40 pics) [2017/01/1]
     -> 1 Firmware found !
     --> 10.3.0 [2017/01/1]

    Google Maps : https://www.google.com/maps/contrib/------id------/reviews
     => 167 reviews found !

    Probable location (confidence => Okay) :
     - --------, ----------

# Other account

    Name: ---public name----

    Last profile edit : 2019/12/01

    Email : ----------@gmail.com
     Google ID : ------id------

    Hangouts Bot : No

    Activated Google services :
     - Photos
     - Maps

    Google Photos : https://get.google.com/albumarchive/------id------
     => No album

    Google Maps : https://www.google.com/maps/contrib/------id------/reviews
     => No reviews


From the readme:

> 02/10/2020 : Since few days ago, Google return a 404 when we try to access someone's Google Photos public albums, we can only access it if we have a link of one of his albums. Either this is a bug and this will be fixed, either it's a protection that we need to find how to bypass. So, currently, the photos & metadata module will always return "No albums" even if there is one.


I hate these date formats. I wish folks would use the ISO 8601 format, YYYY-MM-DD.


Agreed.

However YYYY-MM-DD is symmetric to DD-MM-YYYY ...


Dash positions say otherwise


Yeah, when I read a date and try to make sense of it, I go by the separator as below, though that is often not correct either:

. == day month year

/ == month day year

- == year month day
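That separator rule of thumb can be written down as a small Python sketch. The mapping and the function name `guess_date` are made up for illustration, and the guess is only a heuristic: a date such as 02/10/2020 is read as 10 February here even when the writer meant 2 October.

```python
from datetime import date, datetime

# Hypothetical mapping from separator to guessed field order,
# mirroring the rule of thumb above. A heuristic, not a standard.
GUESSED_FORMATS = {
    ".": "%d.%m.%Y",  # day month year
    "/": "%m/%d/%Y",  # month day year
    "-": "%Y-%m-%d",  # year month day (ISO 8601)
}


def guess_date(text: str) -> date:
    """Parse a date by guessing the field order from its separator."""
    for separator, fmt in GUESSED_FORMATS.items():
        if separator in text:
            # Raises ValueError if the text does not fit the guessed order.
            return datetime.strptime(text, fmt).date()
    raise ValueError(f"no known separator in {text!r}")


print(guess_date("2020-10-02"))  # ISO 8601, unambiguous
print(guess_date("02.10.2020"))  # read as day.month.year
print(guess_date("02/10/2020"))  # read as month/day/year, which may be wrong
```

Only the dash form carries its own field order reliably; the other two depend on guessing the writer's locale.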


I don't understand what you are trying to say. YYYY-MM-DD is the reverse of DD-MM-YYYY and the other way round.


How convenient that this repo was published just a day too late. Now it’s just advertising something that it doesn’t do, either because it never did or because Google fixed it. I think it’s the former.


Doesn't really work for Google accounts that have a reasonable privacy configuration, in which case it only tells you the email (which you specified), the account ID, the main account name and the last profile edit. Of these, the main account name might be the real name for many people, but I don't think that's required. The account name is also publicly used by Google in a whole bunch of places (e.g. when you read a Google Docs doc).

It also lists YouTube channels with similar names, which is generally wrong.


I think even if 99% of Google users have strict privacy settings, this tool is still worth it for data brokers, as they can run email lists of millions of users through it.

And every piece of data about a user that can be connected to the rest of the master data is gold, as it's all about the network effect.

edit: https://imgur.com/a/o2petUC

Here is a good example. The white box is from a publicly available data leak of a site called SexDating.se. It's a 10-year-old DB, but I still managed to extract the name of the person even though the original DB had no identifiers apart from this Gmail address.


This seems interesting, but I couldn't actually fetch Android device information for the targets I tried; THAT seemed to be private information that nobody should be able to access, along with installed apps.

Everything else seems to be just crawling/pinging for publicly available information.

Maybe the "Probable Location" is a bit private if somebody didn't publish it on Google Maps explicitly... but then, it's not accurate.

It seems more of a tool for becoming aware of what information you're publishing on Google, rather than something to actually fetch private info.


I agree with this. I'm worried about the probable location though, because I've realized that people tend to post reviews (even if it's a one-off) in a location where they live. All the probable locations it was able to guess were right.


Don't post reviews all around where you live, then.

But actually, the city where I live is public information... it's on my LinkedIn profile, etc. I don't actually care.

This tool doesn't seem to be leaking private information of any kind. It's just a way to consolidate existing public information. I think pipl (pipl.com) used to do something like this across accounts, some 10+ years ago.


The first sentence reads as a personal attack. I don’t know if you meant it that way and I don’t mind, but that might explain why you are downvoted.

I don’t mind people knowing my location. But most people aren’t as tech/security savvy as the average HN user. I am worried for them.

And to second another commenter: OSINT is generally about using public information in ways that get you too much info about someone.

The coolest examples of it are camera reflections or searching DNS records about someone (IMO).


Oh, my. Not a personal attack, for sure. Why?

It just seemed a bit obvious to me. If I post a review, with my account, I have been there, right? And if I post a lot of reviews month after month in a certain place, I probably live there, right? This has nothing to do with being tech savvy. It's not about IP geolocation or recognizing user locations from random photos.

I understand the OSINT part as well. But the tool claims to be able to extract some info that should not be public (device info, installed apps). But it cannot.
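To make the "lots of reviews in one place" reasoning concrete, here is a minimal Python sketch. It is not GHunt's actual scoring, which this thread does not describe; the function name, the thresholds and the city names are all made up for illustration.

```python
from collections import Counter
from typing import List, Optional, Tuple


def probable_location(review_cities: List[str]) -> Optional[Tuple[str, str]]:
    """Return (city, confidence) for the most frequently reviewed city."""
    if not review_cities:
        return None
    city, hits = Counter(review_cities).most_common(1)[0]
    share = hits / len(review_cities)
    # Hypothetical thresholds, chosen only for this example.
    if share >= 0.6:
        confidence = "High"
    elif share >= 0.3:
        confidence = "Okay"
    else:
        confidence = "Low"
    return city, confidence


# Made-up data: one city per review.
reviews = ["Lyon"] * 5 + ["Paris"] * 2 + ["Rome"]
print(probable_location(reviews))
```

Nothing here needs more than the city attached to each public review, which is the point: the inference is cheap once the reviews are visible.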


> Why?

> Don't post reviews all around where you live, then.

Again, I don't mind, as I know letters without tone of voice are tough to interpret (and you don't seem to be downvoted anymore). But because you're using the imperative (I hope I use that word right), it can come across accusatory. It's possible that some people then imagine a bit of a negative tone of voice in such a statement. I know I do, and then I remind myself that I'm reading letters and that the imagined tone of voice (and whatever it might imply) is purely made up by myself. But some people may not do that.

> It just seemed a bit obvious to me. If I post a review, with my account, I have been there, right? And if I post a lot of reviews month after month in a certain place

It is obvious. Except for people to whom it isn't obvious. I have a lot of family members who wouldn't think about this. They have a hard time using WhatsApp (barely able) or filling out tax forms. I don't see them immediately realizing that filling out Google reviews would leave them exposed in such a way.


> I don't see them immediately realizing that filling out Google reviews would leave them exposed in such a way.

I have that kind of family member as well. But they don't leave reviews on Google Maps, because they wouldn't know how to do that or why. And their home address is still in the White Pages, so... I don't think it's a problem for them.

The question is: is there somebody who is a) able to use Google Maps - which implies a rough understanding of how it works - and b) able to leave reviews about a place they have been, that is then unable to realize that a review says "I have been there at that time" and that multiple reviews may suggest the (approximate) place where they live?


> This tool doesn't seem to be leaking private information of any kind. It's just a way to consolidate existing public information.

That's what's on the tin: "an OSINT tool"


Slightly tangentially, I haven't read anything by Kevin Mitnick for 20 years, but his whole thing on social engineering feels related to putting together innocuous bits of information that add up to more than the sum of their parts.

I have been at several companies where their competitive advantage was something extremely simple and basic that their competitors could do, but just hadn't twigged.

For example, in the pub one night, there were a couple of friendly competitors and one was complaining that they had the same machines but couldn't get the same output we did. My mate nearly blurted out that, "well, we just use the standard recommended XXX" but his boss gave him a kick just before he did. The competitors were simply cutting costs by not using the standard recommended XXX and that was literally the only thing that borked their quality, and yet they hadn't picked up on it.

The point is, the stuff that seems without value to you may be of incredible value to others. We have stated this time and time again on HN when it comes to privacy and data protection and yet I still need to see stuff like GHunt to remind me.

It's fun and scary to see how little tidbits of information are both easily available and easily linked. Assuming it works.


If you have to provide cookies to get data, then how is it a privacy breach? Typically no one other than you has the cookies.


Are there any statistics on how many users have not restricted their Google account sufficiently, or have not restricted it beyond the defaults?


Does this run under Linux? I can't install httpx...


Use a newer Python version


Yes, but often you have to use python3 and install pip3.


I’ve had the chance to use Gatsby, Next, Nuxt and Hugo. Out of all of the above, my preference is Nuxt, but it lacks GraphQL. I’ve also used Gridsome, which I love. The problem is I can’t really recommend it because I’m pretty sure the project is becoming abandoned... their official Twitter account hasn’t posted anything since Sept. 2019 and there are very few pushes to the repository, which is a shame because it would have been my go-to SSG.

Would love to see a Svelte SSG (Sapper’s nice but no movement there either).


You probably meant to post it here: https://news.ycombinator.com/item?id=24670252


I did, thank you!


>Would love to see a Svelte SSG (Sapper’s nice but no movement there either).

Check out Elder.js


Will check it out, thank you!


Has anyone checked the Python code to confirm that cookies are used properly and not sent somewhere?
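One rough first pass at that kind of audit is to list every hostname hard-coded in the Python sources and see whether anything other than Google shows up. The sketch below is only that: the function name `hosts_in_sources` is made up, and it will miss URLs that are assembled at runtime, so it cannot replace actually reading the code.

```python
import re
from pathlib import Path
from typing import Dict, Set
from urllib.parse import urlparse

# Matches http(s) URLs up to the next whitespace, quote or bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")


def hosts_in_sources(root: str) -> Dict[str, Set[str]]:
    """Map each hostname found in *.py files under root to the files naming it."""
    found: Dict[str, Set[str]] = {}
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="replace")
        for url in URL_PATTERN.findall(text):
            try:
                host = urlparse(url).hostname
            except ValueError:
                continue  # malformed URL-like string, skip it
            if host:
                found.setdefault(host, set()).add(str(path))
    return found
```

Calling `hosts_in_sources(".")` from inside a local checkout returns a dict keyed by hostname; any unexpected entry is a place to start reading.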


How does it work?


You can have nice autocomplete in Hangouts that returns a name and picture when you give it an incomplete email. It also returns the so-called Google ID, which links all of your data. Instead of running this set of weird scripts, you can open Hangouts in a browser, click "New Conversation" and start typing emails, then look in dev tools.

Then it just opens https://www.google.com/maps/contrib/ + the person's ID, and there are your reviews on Google Maps, with locations obviously.

Then it tries to fetch your public albums from Google Photos using the same technique, but that does not work anymore. When it did work, it extracted metadata from public albums. That's pretty much it.
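The URL step can be sketched in a few lines of Python. The two URL patterns are the ones shown in the tool output at the top of the thread; the function name and the ID value are invented, and the sketch only builds strings without contacting Google.

```python
from typing import Dict
from urllib.parse import quote

# URL patterns as they appear in the tool output quoted in this thread.
MAPS_REVIEWS = "https://www.google.com/maps/contrib/{google_id}/reviews"
ALBUM_ARCHIVE = "https://get.google.com/albumarchive/{google_id}"


def public_profile_urls(google_id: str) -> Dict[str, str]:
    """Build the public pages tied to one Google ID (no requests are made)."""
    safe_id = quote(google_id, safe="")
    return {
        "maps_reviews": MAPS_REVIEWS.format(google_id=safe_id),
        "album_archive": ALBUM_ARCHIVE.format(google_id=safe_id),
    }


# Made-up ID, for illustration only.
for name, url in public_profile_urls("123456789012345678901").items():
    print(name, url)
```

Per the readme note quoted earlier, the album archive page now returns a 404 for other people's accounts, so only the reviews URL is still useful.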


Looks like Google has a mostly internal user ID ("gaia_id"). Once you've got this small identifier, you can look on each Google-related site for publicly available information relating to that user ID.

Looks like Google is aware of it, as it has removed the reference to gaia_id from YouTube page source (Aug 2020), locked down public accessibility of photo albums (Sep 2020), and prevented connected email address leaks from Webmaster Tools (Sep 2020).


Hmmm... what is this leak from Webmaster Tools?


Good question. I found a reference to it in a related blog post and took it at face value: https://sector035.nl/articles/keeping-a-grip-on-google-ids. I can't find anything more after looking further.


This won't stick around for long


The repo or the vulnerabilities?


Yes.


Excellent job. Another big warning about privacy & Google.



