
Author here -- curious to hear HN's thoughts on whether this should be fixed or is acceptable.


I don't recall needing to copy and paste a search URL to someone in a long time, but maybe other people use it in their workflow. (maybe in jest I'll use http://lmgtfy.com)

Interestingly enough, in Safari on 10.11 the URL I see in the address bar when searching for "$x" is just "$x", but when I copy it and paste "$x" into a text box I see my previous search in the URL. I've noticed Safari shortening URLs to just the base recently, but this feels a bit off.


> ...when I copy and paste "$x" into a text box...

What do you mean by this? The following things fail to repro the issue for me on Firefox 41:

* Open google.com

* Type a search in the search box, press return. (get a #q=$SEARCH anchor)

* Type another search in the search box, press return. (get a #q=$SEARCH2 anchor)


* Go to google.com

* Type in the search box, press return. (Same result as above.)

* Paste text into the search box, press return. (Same result as above.)


In Safari, I searched for "dog" then "cat". In the address bar I only see "dog" or "cat", but when I copy "cat" I get

https://www.google.com/search?client=safari&rls=en&q=dog&ie=...

If I do a new search from the bar I don't see "dog"; it's when I type "cat" into the Google-provided box on the page that I get the above result.


It's a very Safari-specific thing.

address bar reads [$x], where $x is search term, on copy you get a url.


> address bar reads [$x], where $x is search term, on copy you get a url.

Yikes.


Image searches though.


I just linked an image search in an HN comment yesterday.

https://news.ycombinator.com/item?id=10459123

Deleted all the nonessential arguments by hand out of suspicion of something like this. Nice to know I'm not being too paranoid.
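For anyone who wants to do that stripping less by hand: here is a minimal Python sketch that drops every query parameter except the search terms. The sample URL and the choice of keeping only "q" are illustrative assumptions, not anything Google documents.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

def strip_to_query(url, keep=("q",)):
    """Drop every query parameter except the ones in `keep`, plus the fragment."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in keep]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))  # "" discards any fragment

# Illustrative Safari-style search URL (parameters made up for the example):
print(strip_to_query(
    "https://www.google.com/search?client=safari&rls=en&q=dog&ie=UTF-8"))
# → https://www.google.com/search?q=dog
```

The same approach works for any site whose links carry tracking parameters, as long as you know which parameter actually identifies the resource.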


Not quite search-related, but I run across similar practices with copying and pasting URLs in general.

Some friends and I have a running multi-person Hangout that we use as a sort of simplified chat room. It's obviously got none of the control of something like IRC, but it's easier and more convenient for most of them. About half of us already have Hangouts on our phones by default (Android), the iOS folks can install it, and alternatively many of us keep Gmail open in a browser tab at work anyway.

Getting to the point, we use the chat for just general BS throughout the day and we post a lot of links. I have a habit of seeing what I can strip off an article or other link before it doesn't resolve. But some chat participants are primarily phone users and don't often sit down with a laptop or desktop. Almost all of the links they post are these elongated strings that take up 4+ lines in the chat box and contain all sorts of Facebook references and other bits and pieces of the "path" they took to get to the actual article.

I've mentioned it before but usually just get groans about how I'm paranoid/anal retentive. It's probably no big deal but on principle I don't like clicking on anything that essentially lets Facebook or some other network track the spread of links even when I'm not on Facebook. If I'm actively on that site or something similar, sure, they can go right ahead and track what I do but when I'm just sitting at work and want to click some news story a friend forwarded, I don't need all the referral links and tracking.

The problem is (as has been mentioned above) a lot of people just don't think about what's in a URL and sorta just look at it like "that mess of gibberish text that points to the article or funny picture". The problem has only gotten worse as more people are primarily phone/mobile app users.


Just my knee-jerk opinion, but absolutely not acceptable. I can't see any way in which this could be construed as "expected behavior."


I don't think Google search URLs were ever designed to be shared or linked directly, so I don't really see an issue here.

The original query is stored in case you want to track back on the auto-suggest. Would you rather they store it in cookies or non-volatile browser storage?

And while it's true that if you link someone an entire Google search URL you might leak the original query, given all the other dark magic in the URL, which can include your client type, version, location and other things (e.g. inter-service referrals if you got there from another Google service like Google Maps), I think the original query is the least of your worries.

And if you want to link a query, just use https://www.google.co.uk/search?q=how+to+use+google; it works just fine.


> I don't think Google search URLs were ever designed to be shared or linked directly...

This is incredibly silly. URLs are how you get to a page. It's how The Web works!

> The original query is stored in case you want to track back...

That's what the History API is for. When you add an anchor-tag query, store the original state in the tab's history.

> I think the original query is the least of your worries.

Infoleaks are BAD. The least Google could do is use their URL rewriting powers to remove the original query when they add a query stored in the anchor-tag.


> This is incredibly silly. URLs are how you get to a page. It's how The Web works!

That's a very general and simplistic view of things, and we both know it's not true. While URLs were originally designed only to provide an address for a specific resource, we all know they are also used for storage, user state management and many other things. Again, not ideal, but in some cases unavoidable.

Not to mention that URLs are not implicitly intended to be shared outside the scope of your application. You wouldn't link the 3-mile-long URL that you get every time you log in to your bank account, would you?

> That's what the History API is for. When you add an anchor-tag query, store the original state in the tab's history.

Technically (the best kind of not true) not true for the Google History API, and even if it were applicable here, it doesn't work in cases where you, for example, weren't signed in or were explicitly blocking Google tracking APIs and services.

Not to mention that making an API call every time you need to do something as basic as backtracking on your search is just pure insanity when it comes to resourcing.

And if you are talking about the Chrome Page History API, then why in god's name would I want a site to access it? Not to mention it's not applicable in this case either.

> Infoleaks are BAD. The least Google could do is use their URL rewriting powers to remove the original query when they add a query stored in the anchor-tag.

Life is bad. If this were in a link which was automatically shared (although all those pin URL generators put tenfold more data in the link than Google) I would agree with you; otherwise, not really. This isn't a use case for the application.


> ...we both know it's not true.

I know no such thing. When I want to share a search result or anything else -that doesn't require authentication- with a friend, I copy the URL in the address bar and paste it to them. This is how The Web works.

> ...if this was in a link which was automatically shared... I would agree with you [that this is a bad thing.]

So, it would be acceptable to you if Google put your Google Account login and password in URI-encoded plain text in the URL of your search results? Why or why not?

> Not to mention that making a API call every time you need to do something as basic as backtracking on your search...

1) Google already does exactly what I described when you make a search on *.google.com with a Javascript-enabled browser. It's what lets your back button work with their JS-based page updates. ;-)

2) A person driving his User Agent doesn't use the History API to go to a previous search result. He uses his back button or equivalent key combo.


Putting your password in plaintext is unacceptable because it exposes it to anyone who may be sniffing your traffic. Your search history is already visible to anyone sniffing your traffic, so its presence in the URI is moot.

I guess my point is that there's a continuum here. There are already other ways for this data to leak (it's already plaintext on the wire, and it's already there in your browser history), and this particular attack vector is easy to mitigate if you put just a little effort into it.


> Putting your password in plaintext is unacceptable because...

I know that. :)

dogma1138 said:

> ...given all the other dark magic in the URL, which can include your client type, version, location and other things (e.g. inter-service referrals if you got there from another Google service like Google Maps), I think the original query is the least of your worries.

There are people for whom and situations in which having a search query exposed is utterly disastrous. I asked the question I asked in order to determine if dogma1138 held the opinion that

* "URL's are not implicitly intended to be shared outside of the scope of your application" [0], and therefore any sensitive information in them is 100% okay, because -in his world- noone ever shares hard-to-read URLs anyway

or

* URLs can have potentially disastrously sensitive information in them, as long as it's not username and password

or

some other opinion.

In short, this was an information-gathering question designed to test the bounds of a hypothesis that I was forming, but did not yet have enough information to put any faith into. :)

[0] https://news.ycombinator.com/item?id=10467919


    > Your search history is already visible to anyone sniffing your traffic
http://google.com/ normally redirects to https. It's possible to access Google over HTTP, but it's not typical.


> Putting your password in plaintext is unacceptable because it exposes it to anyone who may be sniffing your traffic.

I thought query strings were encrypted in an HTTPS request.


> This is how The Web works.

This is one of many ways the web works. URLs were designed way before there was an easy way to share them outside of the scope of the website itself.

Looking at my history, 50-60% of the URLs in it aren't humanly readable and are 10 miles long, and many of them do not require authentication. It would be nice if people adhered to the idea that URLs are meant to be humanly readable and easy to share, but you know... life.

>Google already does exactly what I described when you make a search on *.google.com with a Javascript-enabled browser. It's what lets your back button work with their JS-based page updates. ;-)

No it doesn't. My back button works just fine, and there isn't a single request made when I click it; I just tried it now, both with the network panel in Chrome dev tools and with Fiddler.

And again, the history API isn't used for this, not to mention it has been deprecated and turned into the App Activities API, which has quite a different set of functions now. Moments are still supported, but it's pretty weird now. https://developers.google.com/+/history/ https://developers.google.com/+/features/app-activities

Yes, info leaks are bad, but this isn't even an info leak; if you start a completely new query it resets the original query variable.

e.g.:

You search for "query":

you'll get q set to "query" and oq set to "query".

You search for "query history": it will set q to "query history" and keep oq set to "query".

You search for "i got a rash on my butt": it will reset both the q and oq URL parameters to the new query, because it's unrelated.

So you won't leak the fact that you searched for manliness enhancing spa treatments to your friends if you link them a query of a pole dancing cat. Your honor is safe my friend.


I notice you didn't address my question:

"So, it would be acceptable to you if Google put your Google Account login and password in URI-encoded plain text in the URL of your search results? Why or why not?"

This is actually a pretty important question.

> Looking at my history, 50-60% of the URLs in it aren't humanly readable and are 10 miles long...

It doesn't matter how long the URL is. It doesn't matter if you can easily read it. The Web works through URLs. URLs (and URIs) are how you access resources.

* Data in the href attribute of the a tag? A URI.

* Data in the src attribute of the img tag? A URI.

* Data in the src attribute of the script tag? A URI.

* Data in the 200 response to an HTTP GET request? A URI.

* Data in the 30[1|2] HTTP response? A URI.

* Data in the address bar of your User Agent? A URI.

> [T]he history API... has been deprecated and turned into the App Activities API...

No. It has not.

> ...my back button works just fine and there isn't a single request made when i click it...

Right. That's what the History API does. It's a strictly client-side thing. I guess you're dreadfully confused. Here's [0] the first result for "History API". It also happens to describe exactly what I'm talking about.

> this isn't even an info leak if you start a completely new query it resets the original query variable ... you'll get ?q set to query and ?oq set to query ...

> you search for query history it will set ?q to query history and keep ?oq set to query

Cannot repro. Here's what I see:

* Use omnibox to search for "thing": q=thing&oq=thing

* Use the Google search page that loads with results for "thing" to search for "things": q=thing&oq=thing#q=things

* Use that same page to search for "dingus dongus": q=thing&oq=thing#q=dingus+dongus

Chrome 46, using the default Omnibox settings.

[0] https://developer.mozilla.org/en-US/docs/Web/API/History_API
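The shape of those URLs can be inspected mechanically. A short Python sketch (the URL is a reconstruction of the repro above, not a captured one) showing that the in-page search only updates the fragment while the stale query string travels along with the copied URL:

```python
from urllib.parse import urlsplit, parse_qs

url = "https://www.google.com/search?q=thing&oq=thing#q=dingus+dongus"
parts = urlsplit(url)

base = parse_qs(parts.query)     # parameters from the original page load
frag = parse_qs(parts.fragment)  # what the in-page search updated

print(base["q"])  # ['thing']          <- stale, goes along when the URL is shared
print(frag["q"])  # ['dingus dongus']  <- what the page is actually showing
```

Anyone you paste that URL to gets both values, which is exactly the leak being discussed.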


> in some cases unavoidable

That something is sometimes true is not an argument why it automatically is true in a particular case.

That's what google gives me by default:

https://www.google.com/search?q=bill+hicks+about+marketing&i...

Yet removing the crap works perfectly fine, so how is having that crap unavoidable?

https://www.google.com/search?q=bill+hicks+about+marketing

https://www.google.com/search?q=bill+hicks+about+marketing&s...

In contrast, out of the gate:

https://duckduckgo.com/?q=show+me+some+sanity&ia=definition

It's a pity it redirects to add "&ia=definition", but it's nearly pretty. And if this didn't blow your mind yet: if you right-click a search result to copy the URL, you get the actual URL with DDG, not an abomination of a URL that redirects you via Google to the URL you thought you copied (and which, to add insult to injury, gets masked when hovering over the link in the status bar, like a malware page might do).

Sure, that's "unavoidable" if you want to track more stuff, but wanting to track more stuff in itself is not unavoidable. It's a choice.

> Not to mention that making a API call every time you need to do something as basic as backtracking on your search is just pure insanity when it comes to resourcing.

That's exactly what Google does, though. It would be perfectly possible to let stuff be cached in the browser at least for a few minutes, but there are always requests being made. They throw away those user and server resources to track hits. It's a bit like how YouTube used to buffer and allow skipping back and forth without having to load one more byte, and now doesn't. There are ads to display and usage statistics to gather. You might argue they need that to "improve their product", but I think looking at, thinking about and using a product can get you a long way, too. If you need to see how real people use it, pay them to use it while you record them. Yeah, I know, it would ruin everything and we'd be left with Pong, right? Baby food manufacturers could also improve their products if they could experiment on thousands of babies without any restrictions, and I bet if that was the norm, any reform would be met with visions of doom about toxic baby food.

Right now, nothing companies like Facebook, Google, Apple and Microsoft could invent would be nearly as cool as a simple, efficient web with users educating other users how to use it responsibly. If the web was plumbing, we'd be looking at people making all sorts of weirdly shaped pipes out of fancy materials, and people would sell their plumbing skills by talking about their religion and family, instead of time and materials needed, and capacity and stability of the result. We're still at the level of people burying slaves at the foot of a new aqueduct, and I can't wait until this hysterical gold rush is a mere footnote in the history of an actual information age. Right now we live in the marketing age, and the mediocrity is a direct result IMO.


[deleted]


Nope, you can modify everything except the origin: https://developer.mozilla.org/en-US/docs/Web/API/History_API


Yes they can, with `pushState`.


Workaround:

Change your omnibox search engine query to https://www.google.com/#q=%s (chrome terminology)
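For what the %s substitution amounts to: a minimal Python sketch, where the template string follows the workaround above and quote_plus stands in for the URL encoding the omnibox would apply:

```python
from urllib.parse import quote_plus

def omnibox_url(template, query):
    """Substitute a search query into a browser keyword-search template."""
    return template.replace("%s", quote_plus(query))

print(omnibox_url("https://www.google.com/#q=%s", "how to use google"))
# → https://www.google.com/#q=how+to+use+google
```

The resulting URL carries only the query in the fragment, with none of the extra metadata parameters.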


This. It not only fixes the article's complaint, but removes all the other metadata stuffed into the URL and just plain looks nicer.


I'm curious what's contained in the actual links in the results, and what happens when I share it with friends:

https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&c...
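A sketch of how one might unwrap such a result link in Python. It assumes the destination is carried in a url (or q) query parameter, which is what these google.com/url redirects have commonly used; the example link is illustrative, not the truncated one above:

```python
from urllib.parse import urlsplit, parse_qs

def unwrap_redirect(link):
    """If `link` is a google.com/url redirect, pull out the real target.

    Assumes the destination sits in the `url` (or `q`) parameter;
    anything else is returned unchanged.
    """
    parts = urlsplit(link)
    if parts.netloc.endswith("google.com") and parts.path == "/url":
        qs = parse_qs(parts.query)
        for key in ("url", "q"):
            if key in qs:
                return qs[key][0]  # parse_qs has already percent-decoded it
    return link

print(unwrap_redirect(
    "https://www.google.com/url?sa=t&rct=j&url=https%3A%2F%2Fexample.com%2F"))
# → https://example.com/
```

Sharing the unwrapped target instead of the redirect link is what keeps the click from being routed through Google's tracking hop.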


"...curious to hear HN's thoughts on if this should be fixed or is acceptable."

I consider it unacceptable. Why does Google do this? Is it for the purpose of tracking users?

This feels like a common problem with URLs. More and more sites send you on to other links while appending some identifier to the URL. I frequently come across links shared by friends and colleagues where I can immediately see the referral site that was the source of the link (even though the destination of the link is different). Presumably this is yet another layer of the endless (but rarely questioned) tracking that has become the norm on the web.


In my opinion this is acceptable. It's probably even functional: the content of the first search might very well affect the results of the second search.



