One of the persistent challenges I run into in this area is that any sort of up-front filtering/routing requires you to know in advance which logs are going to be important when an issue happens. Which is basically impossible. And nobody wants to be the person who filtered out some logs because they looked useless, only to realize later that they would have been instrumental in getting back up and running quickly.
One of the biggest problems we hear about from CISOs is that 'they don't know what they don't know' - meaning they need a way to catch all the data. This plays pretty directly into your comment: there's a pull toward keeping everything, but a penalty for having everything - slower queries, higher costs, more false positives, slower time to resolution.
What's common as a middle ground is blob storage and rehydration - where you send everything into low-cost storage like S3 while still peeling off the high-value data into the SIEM / Datadog / etc. Then if you notice something is amiss, you can rehydrate the time window you care about.
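To make that pattern concrete, here's a minimal sketch in Python with boto3, assuming an hourly S3 key layout. The bucket name, the `is_high_value` rule, and `forward_to_siem` are hypothetical placeholders, not any particular vendor's API.

```python
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
ARCHIVE_BUCKET = "log-archive-example"  # hypothetical bucket for the full-fidelity copy


def is_high_value(event: dict) -> bool:
    # Hypothetical routing rule; in practice this is whatever filter you trust.
    return event.get("level") in {"ERROR", "SECURITY"}


def forward_to_siem(event: dict) -> None:
    # Placeholder for the SIEM / Datadog ingest call.
    print("forwarding:", event)


def ingest(event: dict) -> None:
    """Tee every event into cheap storage; forward only high-value events downstream."""
    ts = datetime.now(timezone.utc)
    # Key by hour so a later rehydration can target a narrow time window.
    key = f"logs/{ts:%Y/%m/%d/%H}/{ts.timestamp()}.json"
    s3.put_object(Bucket=ARCHIVE_BUCKET, Key=key, Body=json.dumps(event))
    if is_high_value(event):
        forward_to_siem(event)


def rehydrate(hour_prefix: str):
    """Yield the full archive for one hour, e.g. rehydrate('logs/2024/05/01/13')."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=ARCHIVE_BUCKET, Prefix=hour_prefix):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=ARCHIVE_BUCKET, Key=obj["Key"])["Body"].read()
            yield json.loads(body)
```

Keying the objects by time is what keeps the rehydration step cheap - you only list and pull the window you care about instead of the whole archive.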
Kudos for being self-aware and acknowledging that solving the problem you saw doesn't always translate into solving the problem potential customers want to pay you to solve.
One of my favorite talks [0] speaks to the problem with thinking that telemetry is valuable just because it is [logs|metrics|traces].
Alerts/notifications etc. are an attempt to distill something useful from something that is abundant. From Cribl's `About Us` page [1] -- \ ˈkribəl \ - “An instrument with a meshed or perforated bottom generally used for gold panning in order to strain valuable material from discardable matter.”
[0] https://www.youtube.com/watch?v=qTf5pli3qRU
[1] https://cribl.io/about-us/
Quite a few years ago, I led a migration away from a legacy logging provider that offered little more than full-text search over unstructured text.
Logging at the time was somewhere in the ballpark of 1% of our total common infrastructure spend, and it was widely acknowledged as too expensive relative to the minimal value we got from that rudimentary feature set, but it was also nowhere near enough cost to justify doing something about it. We had other observability costs that dwarfed it.
What finally justified the overhaul was that security couldn’t really operate usefully on log data unless we pulled the data out somewhere else like Athena and processed it there. That slowed down security incident response times dramatically.
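For a sense of what that detour looks like, here's an illustrative sketch of running an ad hoc query through Athena from Python with boto3. The database, table, and column names are hypothetical, not the actual setup described above; the point is the extra export-query-poll loop sitting in the middle of incident response.

```python
import time

import boto3

athena = boto3.client("athena")

# Hypothetical table over logs exported to S3; the schema is illustrative only.
QUERY = """
SELECT source_ip, count(*) AS attempts
FROM security_logs
WHERE event_type = 'auth_failure'
GROUP BY source_ip
ORDER BY attempts DESC
LIMIT 50
"""


def run_security_query() -> list:
    execution = athena.start_query_execution(
        QueryString=QUERY,
        QueryExecutionContext={"Database": "security"},  # hypothetical database
        ResultConfiguration={"OutputLocation": "s3://athena-results-example/"},
    )
    query_id = execution["QueryExecutionId"]

    # Poll until Athena finishes; every one of these waits adds to response time.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    return athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
```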
The migration ultimately benefited the whole engineering organization, but it had to be security-led to get any traction.
Who are security teams? At what size does a company hire one? Is having a security team driven by compliance, to get certain certificates required by vendors? Did the product you built address the observability need, and if so, why wasn't it used much?
Are you pitching or complaining? Because I can't tell.
Mostly aiming to share my anecdote about being too close to a problem, so others can learn from it - and partly to pitch, to validate the security thesis.