Project

Half of the points (30 points) are allocated to the course project. The aim of this project is to polish the skills that you have obtained during the course and to demonstrate your ability to apply deep learning techniques in a business context. You will go through the main steps of a streamlined analysis project: (1) finding an interesting question, (2) obtaining the data, (3) building the deep learning model.

When building the project you should bear in mind the following principles:

  • feasibility: you should pick an interesting but at the same time feasible question to answer. For instance, recognizing fake emotions would be overkill, while distinguishing between a chihuahua and a muffin or classifying the genre of a song would do the trick.
  • quality over quantity: the project should be concise. It is better to address one concrete, simple, and clear question rather than having a huge list of interesting but unanswered questions.
  • precision: you should describe your model precisely (e.g., which activation functions are used, what accuracy it achieves on the test set, etc.).
  • richness: you should experiment with different models, i.e., it is not sufficient to provide one model and make a bold statement that this is the best one. Fine-tuning is required.
  • applicability: the project should be somehow applicable to industry or academia.
  • sky is the limit: you can come up with any idea you like. You are welcome to participate in a Kaggle competition and provide your kernel as the project. You can get plenty of inspiration from the various blog posts on deep learning.

Project proposal

Deadline of the project proposal: 2021-05-19 2021-05-23 at 11:59 p.m.

Closer to the end of the semester, you will already know what you can do with deep learning models. With this machinery, you can figure out a nice and interesting research question (not necessarily an academic one). You are asked to provide a detailed plan on how to proceed with your project, i.e., how you plan to acquire the data, which models you plan to test, what result you expect, etc. There will be no presentations dedicated to the project proposal, but you need to fill in this Google Form. Please complete this form once you are sure about your idea.

Each section should take no more than 4 short sentences. Please try to be as concise as possible. The fields of the form are as follows:

  • Introduction: describe the context of your question/problem. Related questions are below:

    • Why is it interesting? What is the motivation for doing this project?
    • How would your solution or answer improve the current state?
  • Research question/conjecture/problem/objective: describe precisely the formulation of the question. Related questions are below:

    • What exactly is the outcome of your model?
    • What kind of problem is it? Classification? Regression?
  • Previous analysis: describe how this issue has been tackled previously. Related questions are below:

    • Is there a traditional approach?
    • What models (e.g., CNN) were used for this question?
  • Data: describe how you are planning to get the data. Related questions are below:

    • Is there an available dataset (e.g., on Kaggle or KDnuggets) you can use? Does it require any scraping? Is it freely available?
    • What is the size of the data?
  • Methodology: describe which models you will experiment with. Related questions are below:

    • Which deep learning architecture are you planning to use? CNN with transfer learning and fine-tuning?
    • Does it require remote/cloud computing?

Note: The project proposal should be submitted only once. There will be no project proposal update. Once you have submitted your project proposal, you must carry it through. Think of it as a soft contract.

Examples of the project proposal from the previous year can be found via this link.

Project report

Deadline of the project report: 2021-06-02 2021-06-09 at 11:59 p.m.

As explained on the Submission page, you will be given access to the GitHub repo. In this repo, in the folder \reports, you will find a file project_report.rmd that should describe your project in detail (do not forget to knit this file into .html). The report is supposed to have sections similar to those of your project proposal. However, you will have to elaborate on each of them.

If your data does not exceed the size limits (1 GB per repo, 100 MB per file), you can upload it to the repo (\data folder). Otherwise, you can use Google Drive. Further, you have to save your final model in the folder \results.
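The size limits above can be checked before uploading. The following is a minimal sketch in Python, not part of the course materials: the function name check_sizes is hypothetical, the limits are assumed to be binary units (1 MB = 1024 × 1024 bytes), and the .git folder is assumed not to count towards the total.

```python
import os

# Limits taken from the text; binary units are an assumption.
MAX_FILE_BYTES = 100 * 1024 * 1024    # 100 MB per file
MAX_REPO_BYTES = 1024 * 1024 * 1024   # 1 GB per repo


def check_sizes(root, max_file_bytes=MAX_FILE_BYTES):
    """Return (total_bytes, oversized_files) for all files under root.

    check_sizes is a hypothetical helper. It skips the .git folder
    (an assumption) and ignores symbolic links.
    """
    total = 0
    oversized = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            size = os.path.getsize(path)
            total += size
            if size > max_file_bytes:
                oversized.append(path)
    return total, oversized
```

Calling check_sizes(".") from the root of the repo returns the total size and the list of files above the per-file limit; the total can then be compared with MAX_REPO_BYTES to decide whether the data should go to Google Drive instead.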

When uploading your project, you should always think about portability (will it run on any machine?) and reproducibility (can the analysis be recreated?).
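One common ingredient of reproducibility is fixing the random seed, so that a train/test split comes out the same on every run. The sketch below illustrates the idea with Python's standard library only; the function split_indices and its parameters are hypothetical and not prescribed by the course. A deep learning framework has its own random state that would need to be seeded separately.

```python
import random


def split_indices(n_items, test_fraction, seed):
    """Shuffle the indices 0..n_items-1 with a fixed seed and split them.

    Returns (train_indices, test_indices). split_indices is a
    hypothetical helper used for illustration only.
    """
    rng = random.Random(seed)  # local generator, leaves global state untouched
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_test = int(n_items * test_fraction)
    return indices[n_test:], indices[:n_test]
```

Running the function twice with the same seed yields the same split, so anyone who reruns the analysis evaluates the model on the same test items.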

Project presentation

Presentation day: 2021-06-03 2021-06-10

You have to present your project on June 3, 2021. The schedule will be announced at the end of May. The presentation should take five to ten minutes and be accompanied by slides. Ideally, each of the two team members should present for 50% of the time (equal split). The presentation will be followed by a short round (around 2 minutes) of questions.

The focus of this activity is on the content of your presentation. Even though this course is not about creating a nice presentation, it should be noted that the quality of the presentation (and the level of preparation) will influence your final grade. You should be able to communicate your results to others.

Inspiration