Hello, there! I’m a little behind in the blogging department, so bear with me as I give a quick update on what’s been going down during these first two weeks of the bootcamp. Hopefully from now on I’ll have at least one blog post per week.
- Learned about what data scientists do on a daily basis and about the iterative design process, which is the crux of how data scientists solve problems:
  1. Figure out what problem you’re trying to solve (ask a lot of questions to get specific)
  2. Take the problem and brainstorm, as a group, any and all possible solutions
  3. Rank the top options and prototype the first three or so through very basic sketches
  4. Take your top few ideas back to your client and show them what you’re thinking of doing for their problem
  5. Take the client feedback and repeat steps 2-5 until the problem is solved and the client is satisfied
- We finished the first day learning about supervised machine learning while playing a game called Spot the Hipster. In groups of three we had to build a model (with no computer input) that would classify pictures of men as “hipsters” or “non-hipsters”. My group did pretty well and came in second in the class, correctly classifying 13 of the 15 new pictures. Go team.
- Went over Git and GitHub and learned all about version control. Think of GitHub as a Dropbox/Google Drive type of site for coders and Git as the tool you use to upload and retrieve your code from the site. The combo allows multiple users to collaborate on projects without anyone’s code getting overwritten. Cool stuff.
- IPython notebooks are pretty great. They let you work on code in the browser and run it in nice little chunks instead of having to run your entire Python file via the terminal.
- Finally, we completed Project Benson!
- Reviewed best practices for Python coding #pythonic
- Learned about web scraping using BeautifulSoup and Selenium
- Started brainstorming ideas for our next project, Luther
- Introduced to some of the top Python statistical analysis packages, including Pandas, NumPy, scikit-learn, and StatsModels
- Reviewed Bayesian probability and linear regression
- The majority of our week was spent individually working on Project Luther. For this project we have to scrape Box Office Mojo (and whatever other sites we’d like), come up with a movie-related question, and answer it using movie data and linear regression. My idea is to predict how many Oscars a movie wins in total, given that it is a Best Picture nominee that year. I’ll update my findings in the portfolio section after I’m done next week!
- It's incredible that two weeks have already passed. Every day goes by very quickly, as there is always something to learn and do.
- Our instructors weren't kidding when they said that Googling is a real skill. You can't find help if you don't know what to ask for!
- WeWork is awesome because they give us free food.
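
For the curious, here is a rough sketch of the kind of linear regression I have in mind for Project Luther. It is only an illustration: the single feature (total nominations per movie), the function names `fit_line` and `predict`, and every number in it are placeholders I made up, not scraped data or real results. It fits a straight line with ordinary least squares in plain Python, so it runs without any of the packages mentioned above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature.

    Returns (slope, intercept) of the line that minimizes squared error.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted y for a new x on the fitted line."""
    return slope * x + intercept


# Placeholder numbers for illustration only, NOT real Oscar data.
# Hypothetical feature: total nominations per Best Picture nominee.
nominations = [2, 4, 6, 8, 10]
# Hypothetical target: total Oscars won by the same movies.
wins = [1, 2, 3, 4, 5]

slope, intercept = fit_line(nominations, wins)
predicted_wins = predict(slope, intercept, 12)
```

In the real project the inputs would come from the scraped movie data, and a library such as scikit-learn or StatsModels would likely do the fitting with more than one feature.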