Hal Pomeranz
@hal_pomeranz
15 Apr 2021
Complex query run over 4.5 TB of CloudTrail data: jq + xargs (for parallelism) banged it out in under 4 hrs
#CommandLineSkillz
Apr 15, 2021 · 10:03 PM UTC
Hal Pomeranz
@hal_pomeranz
15 Apr 2021
Replying to
@codeslack
650K or so
Hal Pomeranz
@hal_pomeranz
15 Apr 2021
Replying to
@codeslack
Uncompressed files on local disk. -P 64 on a 72 core system.
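For readers who want to try the jq + xargs approach described above, here is a minimal sketch. It is not the query from the thread: the `LOG_DIR` path, the `BATCH` size, the `run_query` helper, and the `ConsoleLogin` filter are all assumptions made for illustration, and it assumes each file holds CloudTrail's usual top-level `Records` array. Only `-P 64` comes from the thread. The `-print0`, `-0`, and `-P` options are GNU/BSD extensions rather than strict POSIX.

```shell
#!/bin/sh
# Sketch only: fan a jq filter out over many JSON files with xargs -P.

# Assumed layout: uncompressed CloudTrail exports (*.json) under LOG_DIR,
# each file shaped like {"Records": [ {...}, {...} ]}.
LOG_DIR="${LOG_DIR:-./cloudtrail}"   # hypothetical path
JOBS="${JOBS:-64}"                   # parallel jq processes (the -P 64 from the thread)
BATCH="${BATCH:-100}"                # files handed to each jq invocation (assumed value)

# Illustrative filter, not the thread's query: console logins as time<TAB>source IP.
FILTER='.Records[] | select(.eventName == "ConsoleLogin") | [.eventTime, .sourceIPAddress] | @tsv'

run_query() {
    # NUL-delimited file names survive spaces and newlines in paths.
    # xargs starts up to $JOBS jq processes, each given up to $BATCH files.
    find "$LOG_DIR" -type f -name '*.json' -print0 |
        xargs -0 -n "$BATCH" -P "$JOBS" jq -r "$FILTER"
}
```

Calling `run_query > results.tsv` collects the output in one file. Row order depends on which worker finishes first, so pipe through `sort` if order matters. With very large outputs, lines from different workers can in principle interleave, so writing one output file per batch is the safer variant.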
Hal Pomeranz
@hal_pomeranz
16 Apr 2021
Replying to
@codeslack
Oh yes, the disk situation is definitely a bottleneck. I could see the load and I/O wait shooting up.
J P
@JPoForenso
15 Apr 2021
Replying to
@hal_pomeranz
Highly recommend checking out Athena to query CT logs from S3. Exponential time savings and relatively cheap (as compared to using other services/methods). Even better time & cost savings w/ partition mgmt.
docs.aws.amazon.com/athena/l…
workshop.aws-management.tool…
wellarchitectedlabs.com/secu…
Hal Pomeranz
@hal_pomeranz
15 Apr 2021
Well hello, Mr Fancypants! :-) For reasons that don't bear discussing, I'm dealing with logs exported as JSON on disk.
Joe Ianni
@jianni20
16 Apr 2021
Replying to
@hal_pomeranz
GIF
Hal Pomeranz
@hal_pomeranz
16 Apr 2021
Yeah I’m tempted to do a short course (in my copious spare time)
Tim Vidas
@tvidas
16 Apr 2021
Replying to
@codeslack
@hal_pomeranz
You should write fortune cookie inserts...