Hacker News

How does that work with logs? Logs are often... Huge? How many lines of logs can you paste? Because if I first need to narrow down the log to the problematic part, I kinda already have my problem right there no?

Or do you mean I do something like grab the lines with "error" in the log, hoping there aren't too many, then ask ChatGPT what it thinks about this:

    [ 0.135036] kernel: ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20210730/psobject-220)


That log line (with the four spaces at the front for HN formatting) is 40 tokens [1]. You can easily fit several hundred log lines in GPT-4's 8K context, and with the incoming 32K context, you'll be able to fit close to a thousand log lines.

That's a lot of context, especially if you can pre-filter by relevant services, nodes, etc., or provide a multi-node trace.

[1] https://platform.openai.com/tokenizer
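The context-budget arithmetic here can be sketched as follows (a rough estimate; the 40-tokens-per-line figure is the one measured above, and real lines will vary):

```python
# Rough estimate of how many log lines fit in a context window,
# assuming ~40 tokens per line (per the tokenizer measurement above).
TOKENS_PER_LINE = 40

for context_tokens in (8_000, 32_000):
    capacity = context_tokens // TOKENS_PER_LINE
    print(f"{context_tokens:>6} tokens -> ~{capacity} log lines")
```

This ignores the tokens consumed by your instructions and the model's answer, so the practical budget is somewhat lower.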


Yeah but… i can look through 100 log lines manually faster than writing a good gpt prompt. This would get useful if I can easily paste like 1M lines of logs (a few minutes of data for us), but even if that would work, it’d be prohibitively expensive I think.

In other words, I still don’t completely grok the use case that’s being shared here.


The use case here is looking through logs for software you aren't familiar with, especially stuff that gets touched too infrequently to internalize, like, say, driver logs on a Linux workstation.

If it's faster for you to read the logs yourself, you should continue to do that. If it's bespoke personal or commercial software, chances are GPT isn't going to be trained on its meaning anyway.

Most people aren't going to be familiar with arbitrary ACPI errors. Most people would have to Google or ask GPT to even understand the acronym.


To add to that, sometimes the adjacent log lines help you find the solution. With the conventional Google approach you'd have to analyze those yourself; with GPT you don't need to, or at least it helps you navigate them.


Right! Thanks, this is the bit I missed.


Being able to pump a firehose of syslogs at a GPT and tell it to flag any problems would be great


Thousands of log lines is actually pretty tiny though. I have some verbose logging in a testing lab and JUST network traffic from a few mobile devices can easily throw out several megs per hour in logs.


200 lines @ 40 tokens per line equates to 8,000 tokens. That costs $1.60 for one query.


Someone making $100 per hour would be able to analyze those 200 log lines for about 7 minutes for that cost. That's a person you pay over $200k in salary.

I analyze logs at work all the time, often way more than 200 lines, and it takes way less than 7 minutes to get through 200 of them.

Somehow I don't think it would be more cost effective to have an intern paste those into ChatGPT over and over and blow through a ton of money doing it.


$100 / hour = $1.67 per minute

7 minutes = $11.69


I guess I "forgot to carry the 2". Thanks! (as in, where did I get the 7 min from - no I don't remember)


Without even trying to check the math, I can tell you this is wrong. The cost is way, way less. With the 8K context I haven't even been close to $1 a day, even with huge prompts. Yes, I have API access to GPT-4.


For GPT-4, input cost is 3 cents per 1,000 tokens (8K context). So the input cost for 8,000 tokens is 24 cents. Output is 6 cents per 1,000 tokens, but the answer will hopefully be a lot shorter.
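As a sanity check on that arithmetic (using the GPT-4 8K prices quoted in this thread; the 500-token answer length is an illustrative assumption):

```python
# GPT-4 8K pricing as quoted above, in USD per 1,000 tokens.
INPUT_PER_1K = 0.03
OUTPUT_PER_1K = 0.06

def query_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single query at the quoted rates."""
    return (input_tokens / 1_000) * INPUT_PER_1K + (output_tokens / 1_000) * OUTPUT_PER_1K

# 200 log lines at ~40 tokens each = 8,000 input tokens,
# plus a hypothetical 500-token answer.
print(f"${query_cost(8_000, 500):.2f}")  # $0.27
```

So a 200-line paste is on the order of a quarter, not $1.60.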


Why wouldn't you use a basic filter on the logs before even reading them (or passing them elsewhere)? Maybe even get ChatGPT to write that filter for you if you're so inclined, but it can be as simple as a grep command.

After that, even if it's still a massive file, chunking it to ChatGPT should work within its limits (although I haven't personally used it for logs so I can't recommend this).
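A minimal sketch of that filter-then-chunk approach (the error pattern and chunk size here are illustrative assumptions, not recommendations):

```python
# Pre-filter log lines, then split the survivors into prompt-sized chunks.
import re

# Illustrative pattern; tune for your logs.
ERROR_RE = re.compile(r"error|fail|panic", re.IGNORECASE)

def filter_and_chunk(lines, max_lines_per_chunk=150):
    """Keep only lines matching ERROR_RE, grouped into chunks small
    enough to paste into a single prompt."""
    hits = [line for line in lines if ERROR_RE.search(line)]
    return [hits[i:i + max_lines_per_chunk]
            for i in range(0, len(hits), max_lines_per_chunk)]

logs = [
    "[ 0.135036] kernel: ACPI Error: AE_NOT_FOUND, During name lookup/catalog",
    "[ 0.140001] kernel: usb 1-1: new high-speed USB device",
]
print(filter_and_chunk(logs))  # only the ACPI error line survives
```

Each chunk then becomes one paste (or one API call), which keeps every request comfortably inside the context window.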



