Data Science. TileDB. Open Source. Quant Research. R. C++. Debian. Linux. Adjunct Clinical Professor, University of Illinois. Lots of coffee. And some running.
To unlock the world’s research papers for computerized analysis, Carl Malamud built and released what he calls the ‘General Index’.
107 million full-text-searchable scientific papers. 38 terabytes.
It can be queried for any five-word snippet within it.
nature.com/articles/d41586-0…
Extra thanks also to @pdalgd for helping out as my real-time editor spotting a stoopid typo in the first (now deleted) run of this tweet. #RStats users can execute
> fortunes::fortune(112)
to illustrate.
Big Thank You! to @Sondreus for a fab guest lecture on 'Data Science at @TheEconomist' yesterday to STAT 447. A focus on 'coding and modeling' plus open source repos at @github with rich data sets well processed and analysed yield great results! #Rstats
youtube.com/watch?v=1TgEl5OZ…
"[E]conomists are often fumbling in the dark, with too little information to pick the policies [...] The world is on the brink of a real-time revolution in economics, [...] Big firms [...] already use instant data to monitor ..."
economist.com/briefing/2021/…
[1/*] If we take the chance that a tool (compiler, linker, batch, whatever) remains working for a particular codebase after one year as a given probability p, then the chance that the build remains working after x years is p^(xn), where n is the number of tools used in the build.
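A rough sketch of that arithmetic in Python, assuming every tool fails independently and with the same yearly survival probability p (both assumptions are implied by the formula rather than stated in the tweet). The function name and the sample values of p, x and n are hypothetical and only for illustration.

```python
def build_survival_probability(p: float, years: float, n_tools: int) -> float:
    """Chance that a build using n_tools tools still works after `years` years.

    Assumes each tool independently keeps working for one more year with
    probability p, so the combined chance is p ** (years * n_tools).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if years < 0 or n_tools < 0:
        raise ValueError("years and n_tools must be non-negative")
    return p ** (years * n_tools)


if __name__ == "__main__":
    p = 0.99  # hypothetical per-tool, per-year survival probability
    for n_tools in (1, 5, 10):
        for years in (1, 5, 10):
            chance = build_survival_probability(p, years, n_tools)
            print(f"n={n_tools:2d} tools, x={years:2d} years: {chance:.3f}")
```

Even a high per-tool probability decays quickly, because the exponent grows with both the number of years and the number of tools.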
You need business calendar logic. This lives in finance packages (for market open/closed). RcppQuantuccia has them in a lighter-weight package (and I'll have an update really soon), else RQuantLib. Or check exchanges / national bodies. #rstats
I show that trick in STAT 447 (which got a lecture on sed and awk added this year) and have a few StackOverflow answers showing that 'pattern', here is one:
stackoverflow.com/a/23198223…
So #rstats gets better with #awk too :)
Congrats! The @Debian package has been updated and is already available; an @Ubuntu 20.04 build will be published in 'edd/misc' PPA shortly. Also the most recent #RStats package release RQuantLib 0.4.14 installs unchanged.
There are other ways to pass values to an Rstats script; I like `docopt` by @edwindjonge *a lot* for this. You could pick up values from a config file you alter, or from a parameter store like Redis, or ...
But keeping 18 variants of the same source file is not good.
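The general idea (one script that takes its values at run time instead of many edited copies) can be sketched in Python with the standard-library `argparse`. This is a stand-in for illustration, not the R `docopt` package the tweet refers to, and the option names `--ticker` and `--window` and their defaults are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Read run-time parameters from the command line (or from a list)."""
    parser = argparse.ArgumentParser(
        description="One script, many parameter sets"
    )
    # Hypothetical options; replace with whatever differs between variants.
    parser.add_argument("--ticker", default="SPY",
                        help="symbol to process")
    parser.add_argument("--window", type=int, default=20,
                        help="lookback window in days")
    return parser.parse_args(argv)


def main(argv=None):
    """Run the (placeholder) job with the parsed parameters."""
    args = parse_args(argv)
    return f"processing {args.ticker} with window {args.window}"


if __name__ == "__main__":
    print(main())
```

Saved under a hypothetical name such as `job.py`, the same file then covers each variant via something like `python job.py --ticker QQQ --window 5`.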