Adam from Kite here. We learned a lot over the past year and have worked hard to build several new features for Kite based on detailed user feedback (much of which came from the HN community). We're excited to release two new features of special importance today:
* Line-of-Code Completions - Kite's completions engine can now predict several tokens of code at a time, powered by the most sophisticated AI code models available.
* Cloudless Processing - Kite now performs all processing locally on users' computers, instead of in the cloud. No need to upload your code to our servers. You don't even have to sign up for a Kite account.
We know privacy is a big concern for users, so that's why we decided to bring Kite off the cloud. You can learn more about this decision as well as the full Kite release on our blog post, linked here.
Our core belief is programmers spend too much time on repetitive work like copying and pasting from StackOverflow, fixing simple errors, and writing boilerplate code. That's why Kite uses AI to make writing code less repetitive and more fun.
Speaking of fun, we've set up a playground for you to try our Line-of-Code Completions out in your browser. We hope you enjoy it!
As always, Kite is free to download and use. And we no longer require user accounts now that we've moved off the cloud.
If you already use Kite (thank you for your support!), you now have these features via auto-update.
We're really looking forward to your feedback. The detailed feedback we've received in the past has been immensely helpful in getting us to this point. We'll be here all day to answer questions, too.
Additionally, we collect anonymized "heartbeats" that are used to make sure the Kite app is functioning properly and not crashing unexpectedly. These analytics are just simple pings with no metadata, and as mentioned, they're anonymized so that there's no way for us to trace which users they came from.
We also use third party libraries (Rollbar and Crashlytics) to report errors or bugs that occur during the usage of the product.
2. What we don't collect
* Contents (partial or full) of any source code file that resides on your hard drive
* Information (i.e. file paths) about your file system hierarchy
* Any indices of your code produced by the Kite Engine to power our features - these all stay local to your hard drive
How do you guys plan on monetizing this? This list of analytics seems fine at a quick glance, but I'm concerned that there's not a transparent path to profitability here.
Just like any other service, there's no guarantee that you won't start collecting snippets of source code or other metadata to sell once the VCs start applying pressure to generate income. What's your strategy?
> * Line-of-Code Completions - Kite's completions engine can now predict several tokens of code at a time, powered by the most sophisticated AI code models available.
Won't this result in StackOverflow-style copy-paste functionality? This seems great for small projects, but it doesn't look useful for anything else. I'm skeptical that Kite provides anything of value compared to all of the other tools I have.
The website has mentioned that other languages "are coming soon", though it has said this since the last Kite scandal. Are these just empty words?
But cookies that save your search queries are personal data and require opt-in? You can consider anything not to be "personal data". I still really want this to be opt-in, not opt-out.
Here’s some feedback: pretending you’re grateful for users screaming at you for breaching their trust is not a good way to regain that trust.
> If you're already a user, Kite has been auto-updated and is now working locally without sending code to the cloud. If you previously uploaded code to our servers, you can remove your data via our web portal.
> We're grateful to our users for helping us reach this point.
Why aren’t you scrubbing all surreptitiously uploaded data proactively?
Why are you pretending you are glad you got caught doing sketchy things?
You’re going to have to do a lot better than that to convince us that your company suddenly operates with integrity now simply because you got caught.
TabNine is a different approach that uses far less semantic information than Kite. It really shines when you're talking about syntactic repetition, e.g. if you check out the screenshots on their homepage. They have a page about semantic completions, but the semantics are very shallow -- basically what attributes are on the instance you're accessing.
In contrast, we've spent ~50 eng-years semantically indexing all the code on Github, building statistical type inference, and rich statistical models that use this semantic information in a very deep way. The result is that Kite can help more often, in ways that reflect a deep understanding of the semantics of the code you're writing.
Interesting question. I’m not a lawyer, but I presume it’s derivative enough not to be copyrighted. For example, one could argue Google’s giant n-gram corpus is derived from many copyrighted webpages.
> Our core belief is programmers spend too much time on repetitive work like copying and pasting from StackOverflow, fixing simple errors, and writing boilerplate code.
Do you have data backing this core belief up or is it just an intuition? I don't feel any of these things are problems I need a solution for. Maybe this depends on the type of programming someone does or the language they use but as a working programmer this is not a compelling sales pitch to me. You're offering solutions to problems I don't have. They're also not problems that I see to any great extent in junior programmers who report to me.
On an intuitive level, I would guess there are many, many developers debugging a "you forgot to cast your int to a string" error message as I'm writing this.
More concretely, the StackOverflow thread for "Parsing values from a JSON file?" has almost 3 million views. We take this as evidence that programming is repetitive.
(We should probably clarify that when we say 'repetitive' we mean on a global scale, not an individual scale.)
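For readers unfamiliar with the two examples cited above, here is a minimal Python sketch of both: the int-to-string concatenation error and parsing a value out of JSON. The function names and the sample data are made up for illustration and do not come from Kite.

```python
import json


def describe_unsafe(count):
    # Concatenating str and int directly raises TypeError in Python.
    return "items: " + count


def describe(count):
    # The usual fix: convert the int to a string explicitly.
    return "items: " + str(count)


def parse_name(raw):
    # json.loads turns a JSON document into Python dicts and lists.
    return json.loads(raw)["name"]


if __name__ == "__main__":
    try:
        describe_unsafe(3)
    except TypeError as exc:
        print("TypeError:", exc)
    print(describe(3))
    print(parse_name('{"name": "kite", "lang": "python"}'))
```

Neither fix is hard, which is arguably the point being made: the same small lookup is repeated by a very large number of people.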
> I would guess there are many, many developers debugging a "you forgot to cast your int to a string" error message as I'm writing this.
I've always avoided dynamically typed languages because of these types of issues so I guess this seems like a solved problem to me - just use a statically typed language. Clearly there are many people who are using dynamically typed languages for whatever reason though so I can see how better tooling might be useful to them.
> More concretely, the StackOverflow thread for "Parsing values from a JSON file?" has almost 3 million views.
I certainly make use of Stack Overflow, I was taking issue with the idea that significant time is spent copying and pasting code from the site though. I often refer to it to answer a programming related question but I almost never then copy and paste code. I'm looking for a pointer to a library or API, to understand some confusing or poorly documented behavior or for a workaround to a bug etc. and getting an answer to my question rarely leads to copying and pasting code in my experience.
I am using dynamic languages and have for years, but I don't see this as something that would help me in my work. I mean, autocomplete is nice sometimes, but when I know what to write and where, those little autocomplete boxes just get in the way. Maybe I'm also not the target audience though. Which is good - after reading how they started I would never willingly install their software.
There's a trend amongst dynamic languages to slowly add type hints (see Mypy for Python, Typescript, etc) so I see these issues becoming less important over time.
I've been using Python on a daily basis for close to a decade now, and I can probably count on one hand how many times a bug has been caused by casting a string to an int.
"Our core belief is programmers spend too much time on repetitive work like copying and pasting from StackOverflow, fixing simple errors, and writing boilerplate code."
Sounds like you will likely be introducing bugs into people's code. Or at the very least, regressing code quality towards the mean.
Maybe, but these days you have programming languages and libraries with minimal documentation and underspecified parameters and return values, so you have to hunt for usage examples on StackOverflow, GitHub, etc. Many frameworks and tasks also have annoying boilerplate that everybody has to type all the time.
Moreover, there are common errors you can avoid if their ranking algorithm discourages code reported/identified as bad.
Kite aside, I think both of these things will happen as developers adopt these technologies. We're seeing similar effects with autocorrect on your phone and self driving cars (i.e. there will still be accidents, and they tend to drive more slowly), but on balance these technologies are positive.
For me autocompletion is the second thing I turn off on a phone (right after vibrate on touch). There is nothing more annoying than writing a word correctly and having autocomplete "fix" it, just because it was a slang word, or a word in another language because I forgot the English word in that moment.
A lot of that is because the minimal phone keyboard doesn't give you a good UI to accept or decline a suggested autocompletion.
With a keyboard and mouse and large monitor, you can display the completion prompt and let the user select or ignore from the various options with Ctrl-Space or similar.
When "Select the first autocomplete over what I typed" is the default behavior, yeah, that sucks. I often end up using the "key combo" Space,Backspace,Space to accept the autocomplete on my phone.
But I definitely see utility in offering a correction for, e.g., 'DateTime.Format("yyyy-mm-dd")' -> 'yyyy-MM-dd' or any of the million other idioms that we have to remember or look up. What I don't understand is how Kite can offer useful local-only suggestions: is the default dictionary that comes with it preloaded with a million lines of open source, or is it literally just my code being suggested?
N-grams are the first thing you try when you want to statistically model code. We tried them in 2014 and the results were disappointing if you want to do anything beyond identifying low-level syntactic patterns.
The intuition here is that in natural language, context is defined locally. If you want to know whether a 'they' refers to a male or female, for example, you look at the nearby text. In contrast, the flow of data and control through code is highly non-local, which is a lot of why techniques like N-grams don't work.
This has been a very active area of research in academia since we started Kite in 2014. (Suggested google searches: [big code], [ml on code], etc.)
All that said I don't have data on how our approach would compare to N-grams, but I'm guessing if you look at some of the academic research, you'll find that NLP techniques were abandoned, despite the early papers' focus on them.
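To make the N-gram discussion above concrete, here is a minimal bigram sketch in Python. It is only an illustration of the general technique, not Kite's 2014 experiment or its current model; the tokenizer, function names, and toy corpus are all assumptions.

```python
import re
from collections import Counter, defaultdict


def tokenize(code):
    # Very rough tokenizer: identifiers, numbers, or single symbols.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def train_bigrams(snippets):
    # Count which token follows which across all snippets.
    model = defaultdict(Counter)
    for snippet in snippets:
        tokens = tokenize(snippet)
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev][nxt] += 1
    return model


def predict_next(model, prev, k=3):
    # Return up to k most frequent followers of the previous token.
    return [tok for tok, _ in model.get(prev, Counter()).most_common(k)]


if __name__ == "__main__":
    corpus = [
        "import numpy as np",
        "import pandas as pd",
        "import numpy as np",
    ]
    model = train_bigrams(corpus)
    print(predict_next(model, "import"))
    print(predict_next(model, "as", k=1))
```

With this toy corpus the model suggests `np` after `as` even on a line that starts with `import pandas`, because it only sees the single preceding token. That is a small instance of the locality limitation described above.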
This is excellent! Thanks for allowing me to play with it!
Did you publish any papers on how you approach intelligent code completion at Kite (the parts you can make public, or just rough areas I should look into)? I understand you have many emerging competitors. Thanks!
And that isn't even the full story. After Kite was forced to come clean about their Atom extension purchases, they were caught with yet another unrelated Sublime Text plugin that was phoning home to an IP address for an entire year. Their explanation? They forgot they were doing it!
Their only redeeming move could be to open source their project and build a business model around that. Otherwise I don't see how they will ever gain back our trust, since they have repeatedly demonstrated a willingness to deceive users.
> “I have to wonder if your goal was to upset enough people that you'd generate real attention on various news sites and get Kite a ton of free publicity before your next funding round,” @DevOpsJohn wrote
From what I remember the biggest criticism was that they uploaded your code to the cloud, but if you read this, they say that it's now done entirely locally. So honest question, what would the risks be at this point?
What should compel anyone to trust them in the first place? Seems like the ball's in their court to figure out how to regain trust.
edit:
>When we started Kite, we were excited about the possible benefits of internet-connected programming. We were also well aware of the privacy and security concerns some people would have.
So their approach is going to continue being to ignore what they did and pretend they're cool now. Alright, well you can download the maybe-malware if you want, I'll stay as far away as I can from this company myself.
I feel like the answer to your question is implicitly that there is nothing they can do. They're "cancelled". Even a personal apology (with supporting documentation) would be ignored, since you don't care about the product anyway. Is this true?
I mean, I don't necessarily disagree with you, they have to regain trust, but at some point you have to be open to it.
No, but they need to explain what went wrong to cause them to think what they did was okay, and why we should believe it won't happen again, at a minimum. If they have no integrity, and hijacking an open source project certainly is sending that message, then I can't imagine what other incentives would stop them from abusing my trust in them for their gain.
I would like something like this, but they need to address this more thoroughly. It's not just a footnote.
>Trust but verify.
What does that actually mean? If I verify, then did I trust?
I don't want to verify, but if they expect us to be doing that work for them they sure as hell need to open source it.
tl;dr Kite hired a developer for an Atom plugin and promoted themselves in this plugin. Kite acquired an autocomplete plugin and switched the engine to use their own.
IMO, the devs have the freedom to do whatever they want with the code they maintain. If the users don't like it someone can fork and maintain their own version.
>Is a $4 million venture capital-funded startup stealthily taking over popular coding tools and injecting ads and spyware into them?
Ads? Yea, I guess you can call cross product promotion an ad, but it's far from flashing banners up for sale on ad exchanges. Spyware? Hardly. A service uploading data and responding with results is a perfectly legitimate interaction.
In a corporate environment this may lead to accidental violations of contracts. Usually code isn't allowed to leave the corporate network. So one day poor Joe updates his favourite autocomplete plugin, and the next day he has violated a contract.
How come JetBrains didn't raise bazillion VC dollars, while their code completion and tools like refactoring are the best option for most popular programming languages?
JetBrains has put a lot of engineering effort into building really great tools. They're really wonderful!
They took more of the Atlassian path of no/low-VC and slower organic growth starting in the early 2000's, so they've had a lot of time to build the "snowball" effect.
They've even put JetBrains' IDE at the bottom of the "Staircase of intelligence" in their little infographic...
I'm not sure if developers familiar with those tools can take this company seriously (especially given what they've done previously.)
My primary experience is with IntelliJ, though most of their IDEs behave similarly.
I'm not sure what advantage I'd get with Kite.
Often I'd start using a class in the code and then ask IntelliJ to import it, rather than going to the top and using the autocomplete tool. It can do patterns as well, even custom-defined ones, with default values (https://www.jetbrains.com/help/idea/using-live-templates.htm...)
They are from Prague (Czech Republic). While not SV, I can assure you they could easily find VCs there (or abroad, maybe in Russia?) if they wanted to.
Not everyone's goal is to build with investors' money. Sometimes it is easier to succeed without it.
" I can assure you they could easily find VCs there "
It's not a question of 'easy or hard' it's a question of 'easier or harder'.
Being in the Czech Republic definitely makes it harder, hands down. Thankfully, they are in the EU, but it's still a completely different land that many VCs might not be remotely willing to touch. Consider that very few can even read the laws.
Czech startups would be limiting themselves to German and possibly Russian money in general, and it'd be harder to raise in the US.
JB is a great company that might be able to 'easily' raise money, but even then there could be 'red lines' for many firms that just make investing there impossible. For example, a firm may require they re-incorporate in London or Luxembourg or something, where there are far better established commercial laws.
I know many European startups raising money from American VCs.
Open Delaware C-corp and hire European programmers remotely. Either via consulting agreements or wholly-owned European subsidiary. Our company (tensorflight.com) did that.
After fundraising with both European and American VCs, indeed the VC market in the USA is much better.
I acknowledge that this will be unpopular, but ...
Those of us who are fossil grumpuses already think IDEs often allow people to write code they don’t think enough about with the idea that issues will be caught by someone else in code review. This, at least, is what I’ve observed over the last few years.
Something that writes the code for people and people basically “code” by doing a series of micro-code-reviews seems really crazy to me for any application that isn’t just fluff. Just look at what autocorrect has done to average incorrect-words-per-sentence. One of the problems with predictive text generation in general is that in isolation the output can seem very sensible even if it’s gibberish.
So as an IDE skeptic in general, I’d be very curious to try this tool out, if only to see how they deal with that.
[I spent years and years using VC++ and other tools and it was actually this feeling of not really knowing anything that drove me away from it. Etags/Cscope/Grey/actually-reading-code was what I replaced it with..]
> Just look at what autocorrect has done to average incorrect-words-per-sentence.
The problem with autocorrect is that it changes what you've written after the fact, without user input. This on the other hand is autocomplete, so if it suggests gibberish, you still need to explicitly accept it. If you do that, chances are you would've written gibberish in the first place.
As pointed out above by the other comment, autocorrect's issue is that it sometimes forces a correction on you (unless you disable that). On the other hand, there are many words I could never dream of spelling correctly without looking them up, but that I now get perfectly. Sadly, you don't notice those; you only notice when it goes wrong.
Similarly, imagine going to type a common piece of code such as `if __name__ == '__main__':` or `def __init__(self):` at the start of a class, and have it automatically suggested. Obviously you can manually create snippets for each one of these, but that's much more effort.
I can see an interface similar to Gmail's new smart completion, or fish shell, that just shows you a suggestion in the background.
Gmail smart completion is more akin to a template, like giving you an empty class or an empty main function. Its utility (at least in my experience) hasn't gone much farther.
The example you link (from Zsh, not fish) is a fancy looking history autocompletion: in bash, just press Ctrl+r.
The parent has a point: I've spent time working with absolute beginners in programming and the first thing I was teaching them was to ignore the IDE "smarter autocompletion" and suggestions. For typing faster, sure; for suggesting anything else, not so clever.
Just a correction, my example was from a zsh plugin which imitates a built-in fish feature. Also, said feature is different from ctrl+r, which all 3 shells still have.
This one suggests possible completions as you type, so it's a passive feature, compared to ctrl+r which is more active and requires explicit action to work.
And yes, I agree that the whole purpose is to write faster, not to write smarter. I'd maybe add cleaner too, because unless you auto format your code, people often don't write the best by default.
In all my years programming, there's been only one feature that I've seen that I consider The Killer Feature of Autocomplete, and I've only seen it in one editor: Emacs.
I don't want to complete with all the other code everybody else in the world has written. I want to complete with the words I wrote in a comment 3 lines ago. Or my README I've got open in another window. Or the JSON config file I was editing. That's my litmus test: can it autocomplete from all my other text? Kite can't.
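The behaviour described above (presumably something like Emacs's dabbrev-style expansion) can be sketched in a few lines of Python. This is an illustrative assumption about the idea, not how Emacs implements it; the function name and sample buffers are made up.

```python
import re


def complete_from_buffers(prefix, buffers):
    # Scan every open "buffer" (code, comments, README, config) in order
    # and collect distinct words that start with the typed prefix.
    matches = []
    for text in buffers:
        for word in re.findall(r"[A-Za-z_][A-Za-z0-9_-]*", text):
            if word.startswith(prefix) and word != prefix and word not in matches:
                matches.append(word)
    return matches


if __name__ == "__main__":
    buffers = [
        "# retry_backoff_seconds controls the wait",  # a comment in the current file
        '{"retry_limit": 5}',                          # a JSON config file
        "README: set retry-policy",                    # prose in another window
    ]
    print(complete_from_buffers("retry", buffers))
```

The design choice is that candidates come only from text the user already has open, in the order the buffers are searched, with no model of anyone else's code.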
I played with this online demo, and I actually found it pretty frustrating. It kept trying to replace what I was writing with snippets that other people, apparently, had written. I guess that could be useful when trying out a new library, but the rest of the time, it's just going to get in my way.
We'll monetize through the enterprises. We've had lots of conversations with larger companies and it's clear that engineering time is really precious to them, as is shipping faster. We're not really sure how to mechanize charging enterprises yet. When we had a freemium offering we didn't like that e.g. students didn't get all the features. GitHub had an elegant approach in charging for private repos. Maybe Kite is paid if you're working in a repo that's not open source and has lots of active contributors. Would love any ideas!
You've raised millions and you're spitballing monetization ideas on an internet forum?
Look, I'm all for second chances here. The past behavior of this company doesn't concern me as much as the pretty clear lack of strategy around how you plan to make money.
So this is what I've gathered so far (and correct me if I'm wrong on any of this):
- free product (based on the website)
- no serious mention of enterprise pricing or support (based again on the website)
- a promise in your TOS not to collect sensitive data (which almost all companies tend to modify, let's be honest)
- solicitation of monetization ideas on an internet forum by the CEO, having raised millions from VCs
This sounds like a product I'd stay away from if I cared about data collection.
Can you quantify how much money you're saving developers or companies who use this?
> Line-of-Code Completions - Kite's completions engine can now predict several tokens of code at a time
I can predict several weeks of stock prices... not very well.
Shannon had an estimation method: ask people to guess the next letter in text, to find its information content. (Assuming people have a perfect model of text - probably, today, with billions of samples, machines might be better?). Could do this with program text, to bound the benefit.
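A rough sketch of that guessing game applied to program text, in Python. It swaps the human guesser for a simple character-bigram predictor, which is an assumption made for illustration; the entropy of the guess-rank distribution is then used as an estimate of an upper bound on bits per character, in the spirit of Shannon's method. All names and the toy strings are hypothetical.

```python
import math
from collections import Counter, defaultdict


def guess_ranks(train, test):
    # Build a character-bigram "guesser" from the training text.
    follow = defaultdict(Counter)
    for a, b in zip(train, train[1:]):
        follow[a][b] += 1
    overall = Counter(train)
    ranks = []
    for prev, actual in zip(test, test[1:]):
        # Guess order: most frequent followers first, then other known characters.
        ordered = [c for c, _ in follow[prev].most_common()]
        ordered += [c for c, _ in overall.most_common() if c not in ordered]
        # Characters never seen in training get the worst possible rank.
        rank = ordered.index(actual) + 1 if actual in ordered else len(ordered) + 1
        ranks.append(rank)
    return ranks


def entropy_upper_bound(ranks):
    # Entropy (in bits) of the distribution of "number of guesses needed".
    n = len(ranks)
    counts = Counter(ranks)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    train = "for i in range(n):\n    total += i\n" * 20
    test = "for j in range(m):\n    total += j\n"
    ranks = guess_ranks(train, test)
    print(entropy_upper_bound(ranks))
```

On a sample this small the number is only indicative. A better predictor, or a human who knows the language, would need fewer guesses and give a tighter bound, which is the quantity that would cap the benefit of any completion tool.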
Line-of-code completion is a great idea and might work well with idioms, especially if it can figure out the likely parameterization (e.g. in a for(;;) loop). I reckon this approach will nowhere near realize its promise... but will serendipitously reveal unexpected adjacent benefits.
Also reminds me of that joke tool that automatically finds and pastes Stackoverflow code.
Can you also add a feature where my code is analyzed by a community/individuals for $/month if I wish to submit it? Sometimes as a dev, you get stuck or need to do refactoring & need help. Glitch has a community feature like that - but it would be amazing to build a team of experts paid to unstuck fellow devs - on top of a tool like kite; esp because you can then use that data for building a better suggestion engine.
> Kite doesn't support Linux yet, but coming very soon.
Quick suggestion - can you support a Docker-based install? VSCode now supports remote debugging through Docker, etc. I'm not sure about your architecture, but I'm wondering if this isn't something that you can do.
How would that work? Docker is based on container technology inside the Linux kernel. Docker on Mac and Windows uses virtual machines to provide a Linux kernel. If Kite doesn't support Linux, how would their technology work inside a container, which is an isolated Linux environment?
Well, Kite's server side is presumably Linux (most people's are). Pretty sure that they had to re-engineer specifically for Windows and OSX on their native APIs.
Instead, only engineer for Linux and make it available through Docker on all platforms. Since VSCode exposes hooks to interact with Docker anyway, this might make more long term sense.
Seemed lackluster to me. Adding code in a literally defined data structure failed to complete anything.
with open() as f:
ll = [f.(no help from here on out)]
Yes, because we use deeply semantic information about code there is some engineering work to build support for other languages.
So we took the approach of focusing on one demographic (Python developers) and making them really happy. We think we've reached that point, so we're excited to begin looking at how we can expand our reach. Stay tuned!
While I agree with your sentiment that the changes were intrusive, from the article they weren't actually ads in the strict sense. No Ugg Boots or Rolex Replicas. They were helpful links to knowledge articles or documentation, from what I gather.
Could be wrong though.
That's gotta be a tough position to be in, having raised a lot of money and feeling the pressure to make a suboptimal move to please the investors.