I have had the opportunity to explore a variety of datasets while completing my Data Analyst nanodegree with Udacity, along with other great courses on edX, Coursera, Udemy and ICE. I will try to post them all here.

Projects

  1. WeRateDogs Twitter
  2. This document presents the main findings of an exploratory analysis of data from the Twitter account WeRateDogs. The datasets contained the stated stage of each dog, a prediction of its breed, and the full details of the tweets in which the dogs were featured. The data came from official and trusted sources: CSV files, URLs fetched with the requests library, and the Twitter API. The main question was how a dog’s stage relates to the favorite and retweet counts of its tweets.

    Conclusions and the whole analysis can be found here.
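
As a rough illustration of the main question above, the sketch below groups tweets by dog stage and averages the favorite and retweet counts per stage. The field names (`stage`, `favorite_count`, `retweet_count`), the stage labels and the sample rows are assumptions made up for this example; they are not taken from the actual analysis, which may have used different tools.

```python
from collections import defaultdict
from statistics import mean


def mean_counts_by_stage(tweets):
    """Return {stage: (mean favorites, mean retweets)} for a list of tweet dicts."""
    grouped = defaultdict(list)
    for tweet in tweets:
        grouped[tweet["stage"]].append(tweet)
    return {
        stage: (
            mean(t["favorite_count"] for t in rows),
            mean(t["retweet_count"] for t in rows),
        )
        for stage, rows in grouped.items()
    }


# Made-up rows for illustration only; not values from the real dataset.
sample = [
    {"stage": "doggo", "favorite_count": 100, "retweet_count": 40},
    {"stage": "doggo", "favorite_count": 300, "retweet_count": 60},
    {"stage": "pupper", "favorite_count": 50, "retweet_count": 10},
]

print(mean_counts_by_stage(sample))
```

Comparing the per-stage averages side by side is one simple way to see whether some stages tend to attract more engagement than others.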

  3. Explore Weather Trends
  4. The introductory project required manipulating a relational database of world temperatures. At first the data was plotted as is, and the resulting chart looked volatile.

    A second chart was plotted using a 20-year averaging window and compared with global temperatures.

    The notebook used can be found here, and here I have listed some good data sources and competitions.
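
The 20-year averaging window mentioned above can be sketched as a trailing moving average. This is a minimal plain-Python version written for illustration; the function name `moving_average` is hypothetical, and the original notebook may have computed the average differently (for example in SQL or a spreadsheet).

```python
def moving_average(values, window=20):
    """Trailing moving average: one output per full window of `window` values."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running_sum -= values[i - window]
        if i >= window - 1:
            averages.append(running_sum / window)
    return averages


# Small made-up series with a window of 2, just to show the smoothing.
print(moving_average([1, 2, 3, 4, 5], window=2))
```

A longer window smooths out year-to-year swings, which is why the averaged chart looks less volatile than the raw one. The first `window - 1` years produce no output because they do not yet fill a full window.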

  5. Factors influencing movie revenue and ratings
  6. The dataset chosen was the TMDb movie data. The original collection holds detailed information on approximately 10,867 movies in 21 columns, covering their cast, producers, budget, runtime, and so on. The properties in the dataset are quite diverse: some variables are names (of actors and producers), others are ordinal data (ratings), sentences (tagline) or even paragraphs (overview). The data covers several decades, and revenues from earlier years are adjusted so that they are comparable to current ones. The questions that guided this introductory investigation were:

    • How do the runtime and budget influence the ratings a movie receives?
    • How do the runtime and budget influence the box office performance of a movie?
    These two will be inspected so as to answer a third question:
    • Are the box office performance and the ratings related?
    During the exploratory tasks, several charts were made in order to answer these questions.

    Conclusions and the whole analysis can be found here.
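
One common way to put a number on questions like the ones above is a Pearson correlation coefficient between two columns, for example budget and rating. The sketch below is only an illustration under that assumption: the original analysis relied on charts, the helper name `pearson` is hypothetical, and the sample values are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only; not taken from the TMDb data.
budgets = [10, 20, 30, 40]
ratings = [5.0, 6.0, 6.5, 8.0]
print(round(pearson(budgets, ratings), 3))
```

A value near 1 or -1 suggests a strong linear relationship and a value near 0 suggests little linear relationship. Correlation alone does not show that one variable influences the other, so it complements the charts rather than replacing them.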