Anyone doing data science/analysis as career/hobby?

daylen
Posts: 2538
Joined: Wed Dec 16, 2015 4:17 am
Location: Lawrence, KS

Anyone doing data science/analysis as career/hobby?

Post by daylen »

I plan on getting a job in the data science/analysis space, but I have been doing data analysis for fun as well. I know making it a job will most likely change my perspective, but I can still see myself utilizing the knowledge/skills/methods of data analysis later in life for a variety of unforeseen reasons.

Anyone here make use of all the open data sets and statistical + machine learning techniques in order to make better decisions, predict the future, or just for fun?
Last edited by daylen on Mon Jun 06, 2016 9:51 pm, edited 1 time in total.

numbersmom
Posts: 17
Joined: Tue May 22, 2012 1:51 pm

Re: Anyone doing data science/analysis as career/hobby?

Post by numbersmom »

I'm a data scientist for an automotive company in the Detroit area. It's great fun. Tons of data to play with and all kinds of problem solving. There is some corporate stuff that comes along with working for a large company but I just ignore most of it. Data science is so new in the automotive industry that the few of us that are doing it are pretty much defining our jobs.

I worked my way into this line of work after 15 years as a stay at home mom. I have lots of information on resources if you are interested.

We are down one data scientist in our group so if you live in the Detroit area, let me know!

jacob
Site Admin
Posts: 15980
Joined: Fri Jun 28, 2013 8:38 pm
Location: USA, Zone 5b, Koppen Dfa, Elev. 620ft, Walkscore 77

Re: Anyone doing data science/analysis as career/hobby?

Post by jacob »

I used to work in quant finance. I had access to the best market data in the world and I built my own tools. It was fun. Probably the only corporate problem I can think of/experienced is that many still don't understand what data science is and can do, so your results/analysis can seem like magic (good) or hocus-pocus (bad) to them.

In terms of personal benefits, I think data science is intellectually exciting. I have added a few data science ideas to my latticework/way of thinking which is also great. Compared to many other older fields, most of which rely on analysis, the field is a rich source of creative/novel ideas.

However, I'm not doing number recognition for fun in my spare time :) ... is kaggle still running well?

FBeyer
Posts: 1069
Joined: Tue Oct 27, 2015 3:25 am

Re: Anyone doing data science/analysis as career/hobby?

Post by FBeyer »

So in the vein of every discussion on stats vs Machine Learning vs Data Science: how/why are you all discriminating between the 'three disciplines'?
I'm looking to call myself a data scientist in two years, but no one really seems to agree on what a DS does, nor how one becomes one.

Stats, programming, software design, and coming up with solutions to odd problems tickle me in all the right (strictly professional) ways though so Data Science seems the way to go.

Any help and resources on DS and how to market oneself as a data scientist are much appreciated.

almostthere
Posts: 284
Joined: Tue Jul 09, 2013 1:47 am

Re: Anyone doing data science/analysis as career/hobby?

Post by almostthere »

I spent my first non-working year really getting into DS. I was eventually invited to be a community TA for the Johns Hopkins R courses. I was never as advanced as the other TAs but I could help the complete newbies. I have since changed focus to fundamental stock research. I think I was just doing DS b/c I wanted to play with market data.
I can now code well enough to create simple tools for my stock picking endeavours (like scraping Yahoo, for example, or plotting trends) and my statistics knowledge is only at the level of descriptive statistics. That said, even my neophyte knowledge has completely changed my world view. I would guess that I will dabble in these fields for the rest of my life. I just like the way I can think now that I can code and I understand basic stats.
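For readers wondering what "simple tools" at the descriptive-statistics level might look like, here is a minimal sketch using only the Python standard library. The price list is made up for illustration, and `moving_average` is a hypothetical helper, not anything from the post; real data would come from a downloaded or scraped file.

```python
import statistics

# Hypothetical closing prices; real values would come from a downloaded or scraped file.
closes = [101.0, 102.5, 101.8, 103.2, 104.0, 103.5, 105.1]

# Descriptive statistics: centre and spread of the series.
mean_close = statistics.mean(closes)
median_close = statistics.median(closes)
stdev_close = statistics.stdev(closes)  # sample standard deviation


def moving_average(values, window):
    """Return the simple moving average for every full window of `values`."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# A short moving average smooths day-to-day noise and hints at the trend.
trend = moving_average(closes, window=3)

print(f"mean={mean_close:.2f} median={median_close:.2f} stdev={stdev_close:.2f}")
print("3-day moving average:", [round(x, 2) for x in trend])
```

The moving average is only one way to look at a trend; it is used here because it needs nothing beyond basic arithmetic.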

jacob
Site Admin
Posts: 15980
Joined: Fri Jun 28, 2013 8:38 pm
Location: USA, Zone 5b, Koppen Dfa, Elev. 620ft, Walkscore 77

Re: Anyone doing data science/analysis as career/hobby?

Post by jacob »

@FBeyer - I don't know if it's well defined yet, but here's how I see it.

statistics: Old-school methods designed to make reasonable conclusions based on the limited amount of data points available in most of the 20th century. So frequentist stuff like probabilities, t-tests, chi2-tests, etc.

data mining: A descriptive/passive way to look at big data. Usually involves visualization. Anything with clusters. SVMs. PCA. Since data mining has certain derogatory implications, I would never refer to myself as a "data miner" :-P

machine learning: An active way to look at big data. Usually implies making predictions. Decision trees. Neural nets.

data science: all of the above
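To make the "statistics" bucket concrete, here is a rough sketch of a two-sample t-test statistic computed by hand with the standard library. It uses Welch's variant (unequal variances), which is my choice for illustration and not something the post specifies; the function name `welch_t` and the sample values are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a**2 / (n_a - 1) + se2_b**2 / (n_b - 1))
    return t, df


# Two small made-up samples, the kind of data these methods were designed for.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Turning the statistic into a p-value needs the t distribution, which the standard library does not provide; a statistics package would normally handle that step.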

daylen
Posts: 2538
Joined: Wed Dec 16, 2015 4:17 am
Location: Lawrence, KS

Re: Anyone doing data science/analysis as career/hobby?

Post by daylen »

numbersmom wrote:I have lots of information on resources if you are interested.
Resources are the one thing I have enough of at this point. I need to get my hands dirty. :)
numbersmom wrote:We are down one data scientist in our group so if you live in the Detroit area, let me know!
That would be awesome! But I do not graduate for a year. :(
jacob wrote:is kaggle still running well?
Yes, it is very successful and growing.
jacob wrote:statistics: Old-school methods designed to make reasonable conclusions based on the limited amount of data points available in most of the 20th century. So frequentist stuff like probabilities, t-tests, chi2-tests, etc.

data mining: A descriptive/passive way to look at big data. Usually involves visualization. Anything with clusters. SVMs. PCA. Since data mining has certain derogatory implications, I would never refer to myself as a "data miner" :-P

machine learning: An active way to look at big data. Usually implies making predictions. Decision trees. Neural nets.

data science: all of the above
Just to add to Jacob's answer.

statistics: ...inferential, sampling

data mining: ...extracting insights from big data

machine learning: ...sub field of artificial intelligence. Makes use of cheap computation.

data science: All of the above and more...domain knowledge, data wrangling, data collection, data presentation, etc

Also, statistics, machine learning, and data mining have significant overlap (regression in statistics and machine learning, clustering in data mining and machine learning, sampling in data mining and statistics, etc).

FBeyer
Posts: 1069
Joined: Tue Oct 27, 2015 3:25 am

Re: Anyone doing data science/analysis as career/hobby?

Post by FBeyer »

The derogatory term is data dredging. I've honestly never seen anyone but you use the term data mining in any other sense than loosely coupled with unsupervised methods.

It's funny how people see those distinctions between disciplines; I've taken stats courses since November and statisticians do all of the above. If there truly is a difference, then I reckon it's because industry pigeonholes the statisticians it hires and prints something different on their business cards, otherwise it can't tell them apart. :roll:

Gilberto de Piento
Posts: 1949
Joined: Tue Nov 12, 2013 10:23 pm

Re: Anyone doing data science/analysis as career/hobby?

Post by Gilberto de Piento »

What's the difference between a data analyst and a data scientist?

About $20,000 per year. :)

Tyler9000
Posts: 1758
Joined: Fri Jun 01, 2012 11:45 pm

Re: Anyone doing data science/analysis as career/hobby?

Post by Tyler9000 »

I certainly would not describe myself as a data scientist, but I guess I'd qualify as someone who enjoys number crunching as an active hobby. The website has been a fun outlet, and I've learned a lot in the process of making new tools.

daylen
Posts: 2538
Joined: Wed Dec 16, 2015 4:17 am
Location: Lawrence, KS

Re: Anyone doing data science/analysis as career/hobby?

Post by daylen »

@FBeyer

Traditionally, statistics has been weak at making use of big data. Statistics developed as a way to extract insight from small, quality data rather than large, messy data. The distinction between statisticians and data scientists is made to distinguish between two different allocations of mostly the same skills. This is important in order to make the workforce more efficient. For instance, if a hiring manager wishes to hire someone whose programming skills match their statistics knowledge, then a data scientist is a better fit than a statistician, since the majority of statisticians have only intermediate programming skills. Classifying everyone who works with data as a statistician is inefficient in the workforce.

@Gilberto de Piento

Same concept as above. In general, data scientists make more use of advanced skills such as machine learning.

BadHorse
Posts: 55
Joined: Sun Jun 05, 2016 4:17 am

Re: Anyone doing data science/analysis as career/hobby?

Post by BadHorse »

I'm also somewhere on the DS spectrum ;)

In addition to the different kinds of data analysis, I would say that data science also involves actually getting hold of the data. Finding, storing, combining, and prepping the data for integrated analysis. At least in my field.

numbersmom
Posts: 17
Joined: Tue May 22, 2012 1:51 pm

Re: Anyone doing data science/analysis as career/hobby?

Post by numbersmom »

I have been having a great time playing with all the data at my company.

But doing real data science is not anything like working with a Kaggle competition dataset. Real data is dirty, messy, hard to access and not well understood even by the people using it. Before I started working at this company, every dataset I analyzed was already in a nice, clean format in a CSV or Excel file of a manageable size. The data I use sits in a Hadoop cluster that I access using SQL queries. I found that all of my coding skills (R and Python) were up to par for the job but definitely not my SQL skills. There are online training classes that you can take but there is no way to really get good until you are dealing with an actual database. I routinely access datasets that are anywhere between 20 and 50 million rows. And I had to learn how to access the database through R and Python.
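As a rough illustration of querying a database from Python, here is a small sketch. It assumes an in-memory SQLite database as a stand-in for the real cluster (the actual setup described above is Hadoop and would use a different driver), and the table name, column names, and `iter_chunks` helper are all hypothetical. The two habits it shows carry over to large tables: let the database do the aggregation, and stream raw rows in chunks when you do need them.

```python
import sqlite3

# In-memory SQLite stands in for the real warehouse; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (vehicle_id INTEGER, speed_kph REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 50.0), (1, 70.0), (2, 90.0), (2, 110.0), (2, 100.0)],
)

# Aggregate inside the database so only summary rows come back to Python.
summary = conn.execute(
    "SELECT vehicle_id, COUNT(*), AVG(speed_kph) "
    "FROM readings GROUP BY vehicle_id ORDER BY vehicle_id"
).fetchall()


def iter_chunks(cursor, size):
    """Yield lists of at most `size` rows until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(size)
        if not rows:
            break
        yield rows


# When raw rows are needed, stream them instead of loading everything at once.
cursor = conn.execute("SELECT vehicle_id, speed_kph FROM readings")
chunk_sizes = [len(chunk) for chunk in iter_chunks(cursor, size=2)]
conn.close()

print(summary)
print(chunk_sizes)
```

With tens of millions of rows the chunk size would be far larger than 2; the tiny value here just makes the chunking visible.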

In the process of accessing and analyzing the data, my team and I have identified problems with the way the data has been used for years. This is probably the biggest eye opener that I have experienced. Every time we identify a problem, changes have to be made to all of the source datasets, which in turn makes changes in our models.

Jacob said "Probably the only corporate problem I can think of/experienced is that many still don't understand what data science is and can do, so your results/analysis can seem like magic (good) or hocus-pocus (bad) to them." This is definitely true in our case. It's hard to get funding for projects because the upper level managers approving the funding don't understand and can't believe there is a good ROI.

The other major difference between the real world and a Kaggle competition is that once you construct a good model, you must be able to visualize the results for your client. In a Kaggle competition, it's all about the score, but a business customer doesn't understand/care about that. They want to be able to see a picture that explains how everything works. Visualization is tricky and not always as intuitive as the customer wants. It takes a lot of patience to educate the customer to see the value of the model.

daylen
Posts: 2538
Joined: Wed Dec 16, 2015 4:17 am
Location: Lawrence, KS

Re: Anyone doing data science/analysis as career/hobby?

Post by daylen »

The thing about Kaggle is that the data sets are boring to me. I would much rather ask interesting questions first, then find data sets online that could potentially shed light on the problem. One caveat with this approach is that you are on your own for the most part, unless of course others have attempted to answer the question before (in which case, don't reinvent the wheel). This method most likely is not the most efficient way to learn since there isn't much of a feedback loop, but it is the most enjoyable method in my opinion.

FBeyer
Posts: 1069
Joined: Tue Oct 27, 2015 3:25 am

Re: Anyone doing data science/analysis as career/hobby?

Post by FBeyer »

BadHorse wrote:I'm also somewhere on the DS spectrum ;)

In addition to the different kinds of data analysis, I would say that data science also involves actually getting hold of the data. Finding, storing, combining, and prepping the data for integrated analysis. At least in my field.
Any and all advice you have for joining the statistics/analysis circuit in Copenhagen: SPILL IT! :mrgreen:

BadHorse
Posts: 55
Joined: Sun Jun 05, 2016 4:17 am

Re: Anyone doing data science/analysis as career/hobby?

Post by BadHorse »

@numbersmom

That’s exactly my experience too.. big data is big time messy.
I’ve been away from serious SQL for a while but will need to build a new database soon. Hope it all comes back to me (of course I’m also tempted to just go ahead and build a noSQL one, simply because I don’t know anything about it).

@FBeyer

Heh, based on your posts, you are already doing more stats than I am! My job uses rather specialized tools (I won't give personal job info here, but feel free to PM if you are interested in the details).

daylen
Posts: 2538
Joined: Wed Dec 16, 2015 4:17 am
Location: Lawrence, KS

Re: Anyone doing data science/analysis as career/hobby?

Post by daylen »

@BadHorse

NoSQL is a whole different paradigm from regular SQL. Look up relational vs non-relational databases. It should be fairly clear which to use for your specific use case.
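A rough sketch of the difference, under some loud assumptions: SQLite stands in for a relational database, and a plain Python dict serialized to JSON stands in for what a document-oriented store would hold. No real NoSQL engine is involved, and the table and field names are hypothetical.

```python
import json
import sqlite3

# Relational: data is split across tables with a fixed schema and recombined with a JOIN.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (user_id INTEGER REFERENCES users(id), item TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ann')")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "book"), (1, "lamp")])
joined = conn.execute(
    "SELECT u.name, o.item FROM users u "
    "JOIN orders o ON o.user_id = u.id ORDER BY o.item"
).fetchall()
conn.close()

# Document-style: one self-contained record with nested data and no enforced schema.
document = {"id": 1, "name": "ann", "orders": [{"item": "book"}, {"item": "lamp"}]}
serialized = json.dumps(document)  # roughly what would be written to a document store
items = [order["item"] for order in json.loads(serialized)["orders"]]

print(joined)
print(items)
```

The relational version makes cross-record queries easy and keeps each fact in one place; the document version keeps everything about one entity together, which suits data whose shape varies from record to record.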

BadHorse
Posts: 55
Joined: Sun Jun 05, 2016 4:17 am

Re: Anyone doing data science/analysis as career/hobby?

Post by BadHorse »

@daylen

That much I do know :)
I've just never had the chance to use noSQL in practice, so it's tempting to find an excuse.. even if SQL really is better suited. Some day I'll get to it.

daylen
Posts: 2538
Joined: Wed Dec 16, 2015 4:17 am
Location: Lawrence, KS

Re: Anyone doing data science/analysis as career/hobby?

Post by daylen »

@BadHorse

Yea, I have been meaning to set up SQL and NoSQL servers and mess around with them for the education. Haven't gotten around to it though. :?
