Tuesday, June 7, 2016

Data Mining Public Data - Curve Aliasing

Most scientists take it for granted when data is nice and organized and you can just get to work.  But what if you need to build that list of curve aliases yourself?  Here is the workflow I just went through to build my alias list for triple combo curves.
  1. Download 30 GB of LAS files from UTLands
  2. Write Python code to read each LAS file and save every curve's name, units and description to a SQL table
  3. Query the SQL table for curves that have keywords in their description
My little table of logging curves for this one data source has just over 270k rows.  Neat.
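
Step 2 of the workflow can be sketched roughly as follows. This is a minimal illustration, not the code I actually ran: it assumes LAS 2.0 files, hand-parses the ~Curve section instead of using a LAS library, and writes to an in-memory SQLite database. The function name parse_curve_section, the table name curves, its columns, and the sample file contents are all made up for the example.

```python
import sqlite3


def parse_curve_section(las_text):
    """Yield (mnemonic, unit, description) for each line of the ~Curve section.

    Assumes the LAS 2.0 line layout "MNEM.UNIT  DATA : DESCRIPTION":
    the mnemonic ends at the first dot, the unit runs from that dot to the
    first space, and the description follows the last colon.
    """
    in_curves = False
    for raw in las_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("~"):
            # Section headers start with "~"; the curve section is "~C..."
            in_curves = line[1:2].upper() == "C"
            continue
        if not in_curves or "." not in line:
            continue
        mnemonic, rest = line.split(".", 1)
        if ":" in rest:
            head, _, description = rest.rpartition(":")
        else:
            head, description = rest, ""
        # The unit must start immediately after the dot, otherwise it is empty.
        unit = head.split(None, 1)[0] if head and not head[0].isspace() else ""
        yield mnemonic.strip(), unit, description.strip()


# Hypothetical sample file contents, for illustration only.
SAMPLE_LAS = """~Version Information
 VERS.   2.0 : CWLS LOG ASCII STANDARD
~Curve Information
#MNEM.UNIT   API CODE : DESCRIPTION
 DEPT.FT              : Depth
 GR  .GAPI            : Gamma Ray
 RHOB.G/C3            : Bulk Density
~ASCII
 1000.0 55.2 2.45
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE curves "
    "(source_file TEXT, mnemonic TEXT, unit TEXT, description TEXT)"
)
rows = [("sample.las", m, u, d) for m, u, d in parse_curve_section(SAMPLE_LAS)]
conn.executemany("INSERT INTO curves VALUES (?, ?, ?, ?)", rows)
conn.commit()
```

For a real run you would loop over every downloaded file, read its text, and pass the file name in place of "sample.las". Real-world LAS headers are messier than this sample, so expect to add handling for wrapped files and odd encodings.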

Going through the curves, enough instances came with descriptions that I didn't need to fall back on the SPWLA mnemonic search.  If the list contained a curve I didn't recognize, I could quickly query all instances of that curve and look at the available descriptions.
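
The two kinds of query described here might look something like the sketch below. It assumes a SQLite table named curves with mnemonic, unit and description columns; that schema, the helper names mnemonics_matching and descriptions_for, and the handful of rows are invented for illustration and are not taken from the real 270k-row table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE curves (mnemonic TEXT, unit TEXT, description TEXT)")
# Illustrative rows only, standing in for the real table.
conn.executemany(
    "INSERT INTO curves VALUES (?, ?, ?)",
    [
        ("GR", "GAPI", "Gamma Ray"),
        ("GRD", "GAPI", "Gamma Ray from density tool"),
        ("RHOB", "G/C3", "Bulk Density"),
        ("GR", "API", "Natural gamma"),
        ("NPHI", "V/V", "Neutron Porosity"),
    ],
)


def mnemonics_matching(conn, keyword):
    """Return (mnemonic, count) for curves whose description contains keyword.

    SQLite's LIKE is case-insensitive for ASCII text by default.
    """
    sql = (
        "SELECT mnemonic, COUNT(*) AS n FROM curves "
        "WHERE description LIKE ? "
        "GROUP BY mnemonic ORDER BY n DESC, mnemonic"
    )
    return conn.execute(sql, (f"%{keyword}%",)).fetchall()


def descriptions_for(conn, mnemonic):
    """Return (description, count) for every instance of one mnemonic."""
    sql = (
        "SELECT description, COUNT(*) AS n FROM curves "
        "WHERE mnemonic = ? "
        "GROUP BY description ORDER BY n DESC, description"
    )
    return conn.execute(sql, (mnemonic,)).fetchall()


gamma_aliases = mnemonics_matching(conn, "gamma")
gr_descriptions = descriptions_for(conn, "GR")
```

The first helper gives candidate aliases for a keyword such as "gamma"; the second lists every description seen for an unfamiliar mnemonic, which is usually enough to tell what the curve is.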

I wonder what I could learn by mapping out various header information?  Stay tuned.
