ScrubJay is a framework for automatic and scalable data integration. Describe your datasets (files, formats, database tables), then describe the integrated dataset(s) you want, and let ScrubJay derive them for you in a consistent and reproducible way.
ScrubJay is used at our HPC center to help analysts evaluate operations using data collected from every component. More news about this innovative project is on the way from LLNL’s science and technology magazine. Stay tuned!