"Just Get Us the Data, and We'll Take it From There"If you read my previous article, you know I am borderline obsessed with the phrase "there has to be a better way". There are plenty of examples of Big Data tools utilizing common communication frameworks and protocols. The real effort in this process is how does an application owner pull out that data and submit it to the Big Data solution with minimal and effective efforts. Many times, I see organizations put the effort of outputting the data on developers to either write REST interfaces into the application, or even worse (from a performance perspective) write out the data to log files. Both of these efforts end up solving the problem, but could introduce security issues along side the fact you are asking a developer to implement a brand new component into the solution for one off requests. Try this exercise:
- Think of a question you would like to ask your application
- Write down all of the metrics or points of data required to come up with an answer
- Picture all of the individuals required to get at each one of those points of data
- If one of those individuals defines a metric differently, what would be the impact on the answer?
- Could anyone use this data maliciously if they gained access to it?
These are just the topics you need to cover to answer one question. The only way to keep this from becoming unbearable is to reach the same results by a different path.