
I recently asked Opus to just “Add vector search” to my current hobby project, a topic I know very little about. It set up Manticore, pulled an embedding model, wrote a migration tool for my old keyword indices, and built the front end. I’m not exaggerating much either: the prompt was the length of a tweet.

I think it would easily have taken me 4+ hours to do that. It ran in 15 minutes while I played Kirby Air Riders, and it worked on the first try.

Afterward, I sort of had to reflect on the fact that I learned essentially nothing about building vector search. I wanted the feature more than I wanted to know how to build the feature. It kept me learning the thing I cared about rather than doing a side quest.
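
For anyone curious what that kind of setup involves, here's a rough sketch of the moving parts, assuming Manticore's KNN support over its MySQL-protocol interface and a sentence-transformers embedding model (the table, columns, and model name are my own illustrative picks, not what it actually generated):

  # Rough sketch: vector search with Manticore's KNN support.
  # Table, columns, and the embedding model are illustrative choices.
  import pymysql
  from sentence_transformers import SentenceTransformer

  model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings

  conn = pymysql.connect(host="127.0.0.1", port=9306, user="", autocommit=True)
  cur = conn.cursor()

  cur.execute(
      "CREATE TABLE IF NOT EXISTS docs ("
      "title text, body text, "
      "embedding float_vector knn_type='hnsw' knn_dims='384' hnsw_similarity='cosine')"
  )

  def vec_literal(vec):
      # Manticore expects vectors as a parenthesized list of floats.
      return "(" + ",".join(f"{x:.6f}" for x in vec) + ")"

  def index_doc(doc_id, title, body):
      emb = vec_literal(model.encode(f"{title} {body}"))
      cur.execute(
          f"REPLACE INTO docs (id, title, body, embedding) VALUES ({doc_id}, %s, %s, {emb})",
          (title, body),
      )

  def search(query, k=10):
      emb = vec_literal(model.encode(query))
      cur.execute(f"SELECT id, title, knn_dist() FROM docs WHERE knn(embedding, {k}, {emb})")
      return cur.fetchall()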


http://database.ul.com/cgi-bin/XYV/template/LISEXT/1FRAME/sh...

Sorry for the hideous URL, but Anker absolutely does have UL certified power supplies. To my knowledge they're CCC compliant, as I believe they're sold in China and, as far as I can tell, that's a requirement there. TÜV I'm not so sure about, but I was under the impression it's more of a third-party way to get CE certified (which Anker products already are).

I get being cautious about power supplies - but Anker is legitimate.


I'd argue that bzip2 is a better example of a compression algorithm which no one needs anymore.

Considering these features:

  * Compression ratio
  * Compression speed
  * Decompression speed
  * Ubiquity

And considering these methods:

  * lzop
  * gzip
  * bzip2
  * xz

You get spectrums like this:

  * Ratio:    (worse) lzop  gzip bzip2  xz  (better)
  * C.Speed:  (worse) bzip2  xz  gzip  lzop (better)
  * D.Speed:  (worse) bzip2  xz  gzip  lzop (better)
  * Ubiquity: (worse) lzop   xz  bzip2 gzip (better)

So, xz, lzop, and gzip are all the "best" at something. Bzip2 isn't the best at anything anymore.
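
If you want to sanity-check those rankings on your own data, a rough benchmark with the Python standard library (gzip, bz2, and lzma for the xz format; lzop has no stdlib binding, so it's left out) looks something like this:

  # Rough ratio/speed comparison using only the standard library.
  # Substitute your own representative test file.
  import bz2, gzip, lzma, time

  data = open("sample.bin", "rb").read()

  for name, compress, decompress in [
      ("gzip",  gzip.compress, gzip.decompress),
      ("bzip2", bz2.compress,  bz2.decompress),
      ("xz",    lzma.compress, lzma.decompress),
  ]:
      t0 = time.perf_counter()
      packed = compress(data)
      t1 = time.perf_counter()
      decompress(packed)
      t2 = time.perf_counter()
      print(f"{name:5s} ratio={len(data)/len(packed):5.2f} "
            f"compress={t1 - t0:.3f}s decompress={t2 - t1:.3f}s")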

Anyone else noticed that when editing a notebook in VS Code, you can lose unexecuted edits to a cell from accidental arrow-key input (which defocuses the current cell and erases the changes)? Too many times I've lost a paragraph's worth of Markdown text mid-typing; it's kind of maddening.


What are you using in your projects?

Is TF the dominant tool in commercial or startup DNN projects?


I think there's a middle tier of problems that don't need a distributed cluster but can still benefit from parallelism across, say, 30-40 cores, which you can easily get on a single node. Once you know how to use Spark, I haven't found there's much overhead or difficulty to running it in standalone mode.

I do agree in principle that you're better off using simpler tools like Postgres and Python if you can. But if you're in the middle band of "inconveniently sized" data, the small overhead of running Spark in standalone mode on a workstation might be less than the extra work you do to get the needed parallelism with simpler tools.
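
For what it's worth, getting that parallelism on a workstation takes only a few lines of PySpark. Strictly speaking the sketch below uses local mode rather than a single-node standalone cluster, but it covers the same "one machine, many cores" case; the file path and column names are made up:

  # Minimal sketch: PySpark using all cores on one workstation (local mode).
  # The CSV path and column names are illustrative.
  from pyspark.sql import SparkSession, functions as F

  spark = (
      SparkSession.builder
      .master("local[*]")                    # one executor thread per core
      .config("spark.driver.memory", "16g")  # room for "inconveniently sized" data
      .appName("workstation-spark")
      .getOrCreate()
  )

  df = spark.read.csv("events.csv", header=True, inferSchema=True)
  summary = df.groupBy("user_id").agg(F.count("*").alias("events"))
  summary.write.parquet("events_by_user.parquet")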


Semantic recently adopted my team's React adaptation as their official React port. It's lighter weight, eliminates jQuery, and all components are standard React components that can be extended or dropped in as-is.

https://react.semantic-ui.com/


Becoming a data scientist on your own is exceedingly difficult because, despite its practitioners' purported adherence to objective data above all else, the practice of data science is full of people who consistently appeal to authority via educational credentials. You can see it in this thread. They regularly make the mistake of thinking that because the skills necessary to be successful in the field correlate highly with advanced degrees, only people with advanced degrees should be able to participate in it. They generally make it very difficult to objectively evaluate an individual's skills because of the bias they inject into the candidate evaluation process.

It's regressive and completely out of step with the supposed meritocracy we like to think we follow in tech. It's also the path toward cartels. I get the feeling a large portion of data scientists would like to create an American Data Scientist Association, complete with credentials and bar exams.


Algorithmia founder here. nvidia-docker is helpful but does not address all the issues with running GPU computing inside of Docker. There are driver issues on the host OS, and the real challenge is running multiple GPU jobs in separate Docker containers while sharing the GPU.

I agree that building models is still definitely a big challenge, but the tooling and knowledge are getting better every day. Either way, our goal with Algorithmia is to create a channel for people to make their models available, and to create an incentive for people to put in the effort to train really solid, useful models.
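
One common, if crude, mitigation is to pin each job to its own device before the framework initializes CUDA, e.g. via CUDA_VISIBLE_DEVICES. A rough sketch (the job scripts are hypothetical, and this only partitions devices across jobs; it doesn't let two jobs safely share one GPU's memory):

  # Hypothetical launcher: give each worker process one GPU by setting
  # CUDA_VISIBLE_DEVICES before any CUDA-using framework starts up.
  import os
  import subprocess

  NUM_GPUS = 4
  jobs = ["job_a.py", "job_b.py", "job_c.py", "job_d.py"]  # illustrative scripts

  procs = []
  for i, script in enumerate(jobs):
      env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(i % NUM_GPUS))
      procs.append(subprocess.Popen(["python", script], env=env))

  for p in procs:
      p.wait()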

