Subject: Re: Data science, data analytics using XSLT streaming
From: Ihe Onwuka <ihe.onwuka@xxxxxxxxx>
Date: Tue, 5 Nov 2013 10:41:35 +0000
On Tue, Nov 5, 2013 at 10:12 AM, Costello, Roger L. <costello@xxxxxxxxx> wrote:
> Hi Folks,
>
> Apparently "data science" is the hot buzzword these days:
>
> Data Scientist: The Sexiest Job of the 21st Century (http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/)
>
> I think that, in a nutshell, data science is about analyzing large amounts of data.
>
No it's not. The data don't necessarily have to be large. Shorn of
that prerequisite, almost any form of computation entails analyzing data.
> It seems that most people believe that the Hadoop, parallel processing paradigm is the sole way of doing data science/data analytics.
>
No they don't. First up, Hadoop is not the paradigm; MapReduce is.
Hadoop is just an open source project that implements the paradigm.
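
To make the paradigm/implementation distinction concrete, here is a
minimal single-process sketch of MapReduce in plain Python. It is not
how Hadoop does it (no distribution, no fault tolerance), and the
function names (map_phase, shuffle, reduce_phase, map_reduce) are
made up for illustration.

```python
from collections import defaultdict

def map_phase(record):
    # Map: emit a (key, value) pair for every word in one input record.
    return [(word.lower(), 1) for word in record.split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: collapse the values for one key into a single result.
    return key, sum(values)

def map_reduce(records):
    # Hypothetical driver: a framework such as Hadoop would spread these
    # steps over many machines; here they simply run in sequence.
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = map_reduce(["to be or not to be", "to stream or to batch"])
```

Any system that runs those three steps is doing MapReduce; Hadoop is
one way of running them at scale.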
> However, I think that streaming is an equally valuable approach.
>
> XSLT streaming is all about processing large amounts of (XML-formatted) data.
>
But just because XSLT has only just got streaming doesn't mean streaming is new.
> So XSLT streaming should fit in the "data science" and "data analytics" categories.
>
If the source data is in XML, then XSLT streaming is useful for
extracting the data and handing it off to an environment properly
equipped with the primitives for the requisite statistical analysis.
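
A rough sketch of that extract-and-hand-off pattern follows. It uses
Python's iterparse as a stand-in for the streaming extraction step
rather than an XSLT 3.0 stylesheet, and the element names (readings,
reading, value) and the helper stream_values are assumptions made up
for the example.

```python
import io
import statistics
import xml.etree.ElementTree as ET

def stream_values(source, tag):
    """Yield the numeric text of each <tag> element without building the full tree."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield float(elem.text)
        # Drop the element's content once its end tag has been seen,
        # so memory use does not grow with the size of the document.
        elem.clear()

# Hypothetical input; a real source would be a large file or a network stream.
sample = io.BytesIO(
    b"<readings>"
    b"<reading><value>1.5</value></reading>"
    b"<reading><value>2.5</value></reading>"
    b"<reading><value>5.0</value></reading>"
    b"</readings>"
)

# Extraction is streamed; the analysis is handed to the statistics module.
values = list(stream_values(sample, "value"))
mean = statistics.mean(values)
median = statistics.median(values)
```

The streaming part only pulls the numbers out; the statistics come
from a library built for that job.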
> Broad Question: Would you provide a scenario/example of doing data science/data analytics using XSLT streaming please?
>
> I realize that the question is rather vague and broad. I am hoping we can collectively come up with ideas on how to do data analytics (data science) using XSLT streaming. Any ideas you might have would be appreciated.
>
See the previous answer.