Excellent succinct descriptions. I'm curious what your thoughts are on SOX compliance or any other "distinct record" compliance requirements in these Baleen architectures? More and more, my colleagues and I are required to meet these data quality / compliance standards in ELT products.
There is an inherent difficulty in meeting these standards, as you alluded to when you wrote:
"Like late binding on procedure calls across services, ELT is much more adaptive and accepting of differences and evolution. Also, like late binding, it may not perfectly match the semantics but rather give “good enough” answers."
Hey, Wesley!
There seems to be a tension between the privacy standards and the desire to scoop up anything and everything to get more knowledge.
You are right to question whether that looseness is acceptable under compliance requirements that demand clear and crisp provenance for the data. It is a big concern.
Another related area is the provenance of training data. If Department-A and Department-B within Company-X both contribute training data for machine learning, what happens when Company-X divests itself of Department-A? Is it now legal to use the training data derived from commingled knowledge from Department-A and Department-B? I don't have any answers but I do have questions.
Hi Pat. Some minor notes:
REST stands for "Representational State Transfer", not "Representational State Transformation".
https://en.wikipedia.org/wiki/Representational_state_transfer
The term "shredding" seems to be overloaded. The term is also used to indicate "secure delete":
https://en.wikipedia.org/wiki/Shredding
Yeah... I screwed up. It's a bit too late for me to fix it in the published articles, though. Sigh...
Thanks for helping me know better!! I try to be careful but I do mess up.
Thank you for your attention and interest!