Complex query run over 4.5 TB of CloudTrail data: jq + xargs (for parallelism) banged it out in under 4 hrs #CommandLineSkillz

Apr 15, 2021 · 10:03 PM UTC

Replying to @codeslack
Uncompressed files on local disk. -P 64 on a 72 core system.
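For anyone wanting to try the same pattern, here is a rough sketch of fanning jq out with xargs. The thread never shows the actual query, so the `ct_query` helper, the `ConsoleLogin` filter, the batch size of 50 files, and the per-worker output files are all assumptions for illustration. Only the `-P 64` default and the uncompressed JSON files on local disk come from the thread.

```shell
# Hypothetical helper, not the command from the thread.
# Usage: ct_query LOG_DIR OUT_DIR [JOBS]
ct_query() {
  local log_dir=$1 out_dir=$2 jobs=${3:-64}
  mkdir -p "$out_dir"

  # CloudTrail log files wrap their events in a top-level "Records" array.
  # This filter is a stand-in: console logins as time<TAB>source IP.
  local filter='.Records[] | select(.eventName == "ConsoleLogin") | [.eventTime, .sourceIPAddress] | @tsv'

  # -print0 / -0 keep odd file names intact, -n 50 hands each jq 50 files,
  # and -P runs up to $jobs workers at once. Each worker appends to a file
  # named after its own PID, so concurrent workers never share an output file.
  find "$log_dir" -type f -name '*.json' -print0 |
    FILTER=$filter OUT_DIR=$out_dir xargs -0 -n 50 -P "$jobs" \
      sh -c 'jq -r "$FILTER" "$@" >> "$OUT_DIR/part.$$.tsv"' _
}
```

Something like `ct_query ./cloudtrail ./out 64` followed by `cat ./out/part.*.tsv` would collect the results. Writing one file per worker is a design choice here: many jq processes sharing a single stdout can interleave partial lines.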
Replying to @codeslack
Oh yes, the disk situation is definitely a bottleneck. I could see the load and I/O wait shooting up.
Replying to @hal_pomeranz
Highly recommend checking out Athena to query CT logs from S3. Exponential time savings and relatively cheap (as compared to using other services/methods). Even better time & cost savings w/ partition mgmt. docs.aws.amazon.com/athena/l… workshop.aws-management.tool… wellarchitectedlabs.com/secu…
Well hello, Mr Fancypants! :-) For reasons that don't bear discussing, I'm dealing with logs exported as JSON on disk.
Yeah I’m tempted to do a short course (in my copious spare time)
You should write fortune cookie inserts...