<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.8.5">Jekyll</generator><link href="https://www.christianwerner.net/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.christianwerner.net/" rel="alternate" type="text/html" /><updated>2019-05-14T15:12:04+02:00</updated><id>https://www.christianwerner.net/feed.xml</id><title type="html">Christian’s Blog</title><subtitle>&amp;description &quot;Data, Science and Stuff...&quot;
</subtitle><author><name>Christian Werner</name></author><entry><title type="html">Data analysis for drug/ alcohol incidents</title><link href="https://www.christianwerner.net/other/Data-analysis-for-drug-narcotic-indicents/" rel="alternate" type="text/html" title="Data analysis for drug/ alcohol incidents" /><published>2019-01-12T00:00:00+01:00</published><updated>2019-01-12T00:00:00+01:00</updated><id>https://www.christianwerner.net/other/Data-analysis-for-drug-narcotic-indicents</id><content type="html" xml:base="https://www.christianwerner.net/other/Data-analysis-for-drug-narcotic-indicents/">&lt;p&gt;As part of the Coursera course &lt;a href=&quot;https://www.coursera.org/learn/data-results&quot;&gt;Communicating Data Science Results&lt;/a&gt; I want to present my assignment in this blog post.
The aim of the assignment is to analyze and visualize crime incident data for the cities Seattle and San Francisco.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Instead of the small subset of crime data provided (summer 2014), I used a full year’s worth of data from the original data sources (year: 2017):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://data.seattle.gov/Public-Safety/Seattle-Police-Department-Police-Report-Incident/7ais-f98f&quot;&gt;&lt;em&gt;Full Seattle incident dataset&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://data.sfgov.org/Public-Safety/SFPD-Incidents-from-1-January-2003/tmnf-yvry&quot;&gt;&lt;em&gt;Full San Francisco incident dataset&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The actual Jupyter notebooks are located on my &lt;a href=&quot;https://www.github.com/cwerner/crimeanalysis&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;report&quot;&gt;Report&lt;/h1&gt;
&lt;p&gt;The following analysis reports on alcohol and narcotic related offenses as reported by the San Francisco and Seattle police departments for the year 2017. The most common offenses reported for San Francisco are “Larceny/Theft”, “Other offences”, “Assault”, whereas the Top 3 for Seattle are “Burglary”, “Car Prowl”, “Other Property”. However, it has to be noted that the two data schemas are not fully compatible and thus a class mismatch is likely (also, the summary classes are not disentangled). For the preprocessing carried out prior to this analysis and the individual code that produced these plots please see the Jupyter notebook &lt;a href=&quot;https://www.github.com/cwerner/crimeanalysis&quot;&gt;here&lt;/a&gt;.&lt;br /&gt;
Regarding alcohol and narcotics, these offenses (“Narcotics”, “Driving under the Influence”, “Liquor Violation”) rank at positions 10, 20, and 23 (San Francisco) and 11, 22, and 32 (Seattle), respectively (see Fig. 1).&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/crimes.png&quot; alt=&quot;Crimes&quot; /&gt;
  
    &lt;figcaption&gt;
      Fig.1: Total number of offenses for San Francisco and Seattle (year: 2017, classes partly matched)

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;h2 id=&quot;seasonality-of-crimes-offenses&quot;&gt;Seasonality of crimes/ offenses&lt;/h2&gt;

&lt;p&gt;Next, the temporal occurrence of these offenses is investigated. As shown in Fig. 2 the number of recorded incidents varies over the weekdays (note that the number of narcotics incidents is scaled by a factor of 10 for visual reasons). Alcohol related offenses are clearly most common at the weekends in both cities. Narcotics reports, in contrast, are lower on weekends in both cities.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/weekday.png&quot; alt=&quot;Offenses per weekday&quot; /&gt;
  
    &lt;figcaption&gt;
      Fig.2: Drug/ alcohol related offenses per weekday.

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;This effect can also be clearly seen in a work-day/ non-work-day plot (Fig. 3). Here, the offenses were grouped into work and non-work periods (the non-work weekend ranges from Friday after 8pm until Monday 4am; US national holidays are also considered non-work-days). The number of incidents in non-work-day episodes is clearly higher for alcohol related offenses but about 20-25% lower for narcotic related offenses.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/workdays.png&quot; alt=&quot;Offenses per work-/ non-work-day&quot; /&gt;
  
    &lt;figcaption&gt;
      Fig.3: Offenses split into work- and non-work-days

    &lt;/figcaption&gt;&lt;/figure&gt;
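
&lt;p&gt;The work-day/ non-work-day split behind Fig. 3 can be sketched as a small shell function. This is a minimal sketch and not the code from the notebook: the function name &lt;em&gt;period_of&lt;/em&gt; is hypothetical, the 8pm boundary is assumed to be inclusive, and US national holidays are left out for brevity.&lt;/p&gt;

```shell
# Classify a timestamp as "work" or "non-work".
# $1: ISO weekday (1 = Monday ... 7 = Sunday, as printed by `date +%u`)
# $2: hour of day (0-23)
# Simplification: public holidays are not handled here.
period_of() {
    dow=$1
    hour=$2
    case "$dow" in
        6|7)
            # Saturday and Sunday are always non-work
            echo "non-work" ;;
        5)
            # Friday counts as non-work from 8pm onwards
            if [ "$hour" -ge 20 ]; then echo "non-work"; else echo "work"; fi ;;
        1)
            # Monday counts as non-work until 4am
            if [ "$hour" -lt 4 ]; then echo "non-work"; else echo "work"; fi ;;
        *)
            echo "work" ;;
    esac
}

period_of 5 21   # Friday 9pm
period_of 1 3    # Monday 3am
period_of 3 12   # Wednesday noon
```

&lt;p&gt;The three calls print non-work, non-work and work. A holiday lookup would be an extra check in front of the weekday test.&lt;/p&gt;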

&lt;p&gt;In addition to the weekly cycle, a diurnal variability of reports can be detected. In Fig. 4 the normalized number of offenses is plotted in 24 hourly bins. It can be observed that “Driving under the Influence” is substantially higher from 8pm to 5am in both cities. “Liquor Violation” also increases over the course of the day (San Francisco), but the small number of incidents and the inconclusive trend for Seattle point to a schema mismatch for this class in this city. “Narcotics” incidents are dominant at daytime in both cities, with a small secondary peak from 8pm to 11pm in Seattle.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/hourly.png&quot; alt=&quot;Hourly number of incidents&quot; /&gt;
  
    &lt;figcaption&gt;
      Fig.4: Offenses split into hourly intervals

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;A classification in day/ night statistics (night: 9pm - 6am) further illustrates the strong difference in occurrence frequency for day and night hours (Fig. 5).&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/daynight.png&quot; alt=&quot;Offenses per day-/ night-time&quot; /&gt;
  
    &lt;figcaption&gt;
      Fig.5: Offenses split into day and night hours

    &lt;/figcaption&gt;&lt;/figure&gt;
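
&lt;p&gt;The day/ night split used for Fig. 5 comes down to a single hour check. A minimal sketch, assuming the night window runs from 9pm up to (but not including) 6am; the function name &lt;em&gt;day_or_night&lt;/em&gt; is hypothetical and not taken from the notebook.&lt;/p&gt;

```shell
# $1: hour of day (0-23)
# Night covers 21:00-05:59, following the 9pm - 6am window stated above.
day_or_night() {
    if [ "$1" -ge 21 ] || [ "$1" -lt 6 ]; then
        echo "night"
    else
        echo "day"
    fi
}

day_or_night 23   # night
day_or_night 6    # day
```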

&lt;h2 id=&quot;location-of-crimes-offenses&quot;&gt;Location of crimes/ offenses&lt;/h2&gt;

&lt;p&gt;In addition to the temporal variability, the offenses also vary by their location. To illustrate this, incidents were mapped by their reported geographic coordinates.&lt;/p&gt;

&lt;p&gt;In Fig. 6 &amp;amp; 7 day and night incidents for the three offenses are mapped (light colors: day, darker colors: night). It is clear that “Narcotics” dominate in the CBD area of San Francisco, whereas the other offenses are more widely scattered.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/sanfrancisco_map.png&quot; alt=&quot;Location of offenses in SFO&quot; /&gt;
  
    &lt;figcaption&gt;
      Fig.6: Location of offenses in San Francisco

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;In Seattle, “Driving under the Influence” and “Narcotics” dominate in the CBD and the few reported incidents of “Liquor Violation” also occur in this area, although the small sample size makes a good interpretation difficult.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/seattle_map.png&quot; alt=&quot;Location of offenses in Seattle&quot; /&gt;
  
    &lt;figcaption&gt;
      Fig.7: Location of offenses in Seattle

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;

&lt;p&gt;It was shown that the time and location of drug and alcohol related incidents vary strongly for both San Francisco and Seattle. However, they often follow similar patterns. A more in-depth analysis is hampered by the different data structures and a more thorough feature mapping is required for more advanced analytics.&lt;/p&gt;</content><author><name>Christian Werner</name></author><category term="Data Science" /><category term="Visualization" /><summary type="html">A toy data analysis for drug/ alcohol related offenses recorded in San Francisco and Seattle in the year 2017</summary></entry><entry><title type="html">Nick that formula like a Pro</title><link href="https://www.christianwerner.net/tools/Nick-that-formula-like-a-pro/" rel="alternate" type="text/html" title="Nick that formula like a Pro" /><published>2019-01-09T00:00:00+01:00</published><updated>2019-01-09T00:00:00+01:00</updated><id>https://www.christianwerner.net/tools/Nick-that-formula-like-a-pro</id><content type="html" xml:base="https://www.christianwerner.net/tools/Nick-that-formula-like-a-pro/">&lt;p&gt;Ok, this is going to be a short one. I recently needed some nice formulas from a scientific paper for a &lt;a href=&quot;https://www.christianwerner.net/tech/Deep-Learning-for-Remote-Sensing-Applications/&quot;&gt;presentation&lt;/a&gt; and usually I try to just copy them from the source PDF since &lt;a href=&quot;https://www.apple.com/lae/keynote/&quot;&gt;Keynote.app&lt;/a&gt; is pretty good at incorporating high-quality PDF snippets into a slide.&lt;/p&gt;

&lt;p&gt;However, once you want to edit the formula (to match symbols or variable names) you have to tediously compose it by hand. For Keynote people like myself there has also been the option since mid 2018 to just enter LaTeX or MathML code and have it render beautifully on the slides. For Word or Powerpoint folks adding formulas usually means using the clunky Microsoft Equation Editor. It seems Office 365 now supports entering formulas using LaTeX syntax, too, but I’m still on Office 2016 here.&lt;/p&gt;

&lt;p&gt;Anyways, depending on your LaTeX and math skills bigger or more fancy equations still require you to put quite some effort into composing the actual code that gives you the nice formula.&lt;/p&gt;

&lt;p&gt;But fear not - with &lt;a href=&quot;https://www.mathpix.com&quot;&gt;MathPix&lt;/a&gt; this workflow is amazingly simple (not endorsed by the developers, I just discovered their tool recently and love it). Their page probably says it best: “Take a screenshot of math and paste the LaTex into your editor, all with a single keyboard shortcut.” Nice! Oh, and it’s free!&lt;/p&gt;

&lt;p&gt;The app uses OCR to trace the equation and convert it into LaTeX formula syntax, which you can paste into any editor or app that can deal with it. And that’s it really.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/mathpix.png&quot; alt=&quot;MathPix&quot; /&gt;
  
    &lt;figcaption&gt;
      MathPix: Take a screenshot of math and paste the LaTeX into your editor, all with a single keyboard shortcut

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;h2 id=&quot;steps&quot;&gt;Steps&lt;/h2&gt;
&lt;ol&gt;
  &lt;li&gt;Grab the app from here: &lt;a href=&quot;https://mathpix.com/dmg/snip.dmg&quot;&gt;Mac&lt;/a&gt;, &lt;a href=&quot;https://mathpix.com/win_app/mathpix_snipping_tool_setup.exe&quot;&gt;Win&lt;/a&gt;, &lt;a href=&quot;https://snapcraft.io/mathpix-snipping-tool&quot;&gt;Ubuntu&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Open your source PDF&lt;/li&gt;
  &lt;li&gt;Hit Ctrl + ⌘ + M (Mac) or Ctrl + Alt + M (Win/ Linux)&lt;/li&gt;
  &lt;li&gt;Select the formula you want to extract by drawing a bounding box over it&lt;/li&gt;
  &lt;li&gt;Pick one of the suggested formats&lt;/li&gt;
  &lt;li&gt;Paste it wherever you need it&lt;/li&gt;
  &lt;li&gt;Done&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;some-examples&quot;&gt;Some examples&lt;/h2&gt;
&lt;p&gt;Here are some tests with various equations. The numbers indicate the source (1), the MathPix detection (2) and the final rendered equation in Keynote (3). The red marker indicates the equation used in the example.&lt;/p&gt;

&lt;h3 id=&quot;simple-stuff-a-sigmoid-function&quot;&gt;Simple stuff: a sigmoid function&lt;/h3&gt;
&lt;p&gt;No problem at all (not shown).&lt;/p&gt;

&lt;h3 id=&quot;more-advanced-regularized-loss-function-of-a-nn&quot;&gt;More advanced: regularized loss function of a NN&lt;/h3&gt;
&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/mathpix_clearly.png&quot; alt=&quot;Example Equation 2&quot; /&gt;
  
    &lt;figcaption&gt;
      An example from a scientific paper: a L1 loss function regularization (source: He &amp;amp; Yokoya (2018), ISPRS Int. J. Geo-Inf. 7, 389; &lt;a href=&quot;https://www.mdpi.com/2220-9964/7/10/389&quot;&gt;doi:10.3390/ijgi7100389&lt;/a&gt;)

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;h3 id=&quot;advanced-but-low-quality&quot;&gt;Advanced but low quality&lt;/h3&gt;
&lt;p&gt;A quick test of how a low-res equation scan works out (go to the source for the actual resolution).&lt;/p&gt;
&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/mathpix_just.png&quot; alt=&quot;Example Equation 3&quot; /&gt;
  
    &lt;figcaption&gt;
      Some random equation from the web from a low quality scan (&lt;a href=&quot;https://math.stackexchange.com/questions/960831/covariance-of-states-of-a-finite-markov-chain&quot;&gt;source&lt;/a&gt;)

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;h3 id=&quot;matrix-stuff-detection-ok-but-keynote-bails&quot;&gt;Matrix stuff: detection ok, but Keynote bails&lt;/h3&gt;
&lt;p&gt;MathPix manages to detect this formula but Keynote gives an error when rendering it.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/mathpix_notquite.png&quot; alt=&quot;Example Equation 4&quot; /&gt;
  
    &lt;figcaption&gt;
      A rotation matrix from wikipedia (&lt;a href=&quot;https://en.wikipedia.org/wiki/Rotation_matrix&quot;&gt;source&lt;/a&gt;)

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;h3 id=&quot;too-hard-big-matrix&quot;&gt;Too hard: big Matrix&lt;/h3&gt;
&lt;p&gt;Ok, this definitely seems to be too hard for the little app - MathPix bails.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/mathpix_fail.png&quot; alt=&quot;Example Equation 5&quot; /&gt;
  
    &lt;figcaption&gt;
      Description of neural net weight (&lt;a href=&quot;https://medium.com/@erikhallstrm/backpropagation-from-the-beginning-77356edf427d&quot;&gt;source&lt;/a&gt;)

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;A really neat tool that stays in my menubar for sure!&lt;/p&gt;</content><author><name>Christian Werner</name></author><category term="Latex" /><category term="Science" /><summary type="html">If you ever struggled to replicate this tedious formula from a paper this nice tool might be for you</summary></entry><entry><title type="html">Deep Learning for Remote Sensing Applications</title><link href="https://www.christianwerner.net/tech/Deep-Learning-for-Remote-Sensing-Applications/" rel="alternate" type="text/html" title="Deep Learning for Remote Sensing Applications" /><published>2019-01-04T00:00:00+01:00</published><updated>2019-01-04T00:00:00+01:00</updated><id>https://www.christianwerner.net/tech/Deep-Learning-for-Remote-Sensing-Applications</id><content type="html" xml:base="https://www.christianwerner.net/tech/Deep-Learning-for-Remote-Sensing-Applications/">&lt;p&gt;Happy New Year everybody.&lt;/p&gt;

&lt;p&gt;I started this year off with a paper presentation for the &lt;a href=&quot;https://twimlai.com/meetup/&quot;&gt;TWiML&amp;amp;AI EMEA Meetup&lt;/a&gt; titled “Deep Learning for Remote Sensing Applications”. The talk gives a brief intro to optical and radar-based remote sensing and covers a deep learning application for generating optical images based on radar and previous time-step images by He and Yokoya (2018). It’s a nice application illustrating the use of CNN and cGAN and data fusion based on auxiliary radar images to generate optical images that might otherwise be compromised by clouds or missing data.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/DL-for-RS_Meetup.png&quot; alt=&quot;Deep Learning for Remote Sensing&quot; /&gt;
  
    &lt;figcaption&gt;
      Deep Learning for Remote Sensing Applications

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;For your convenience the open-access paper discussed can be found &lt;a href=&quot;https://www.mdpi.com/2220-9964/7/10/389&quot;&gt;here&lt;/a&gt;, the slides can be downloaded &lt;a download=&quot;&quot; href=&quot;/assets/docs/Werner_DL-for-RS_TWIMLAI_Meetup_EMEA_030119.pdf&quot;&gt;here&lt;/a&gt;, and the recording of the meetup is provided &lt;a href=&quot;https://twimlai.com/meetups/deep-learning-for-remote-sensing-applications/&quot;&gt;here&lt;/a&gt; (talk starts at about 16:12 min in).&lt;/p&gt;</content><author><name>Christian Werner</name></author><category term="Remote Sensing" /><category term="Deep Learning" /><category term="Presentation" /><summary type="html">Introduction to Deep Learning for Remote Sensing Applications based on the paper by He &amp; Yokoya (2018), Int J Geo-Inf.</summary></entry><entry><title type="html">Deployment for cheapskates</title><link href="https://www.christianwerner.net/tech/Deployment-for-cheapskates/" rel="alternate" type="text/html" title="Deployment for cheapskates" /><published>2018-11-08T00:00:00+01:00</published><updated>2018-11-08T00:00:00+01:00</updated><id>https://www.christianwerner.net/tech/Deployment-for-cheapskates</id><content type="html" xml:base="https://www.christianwerner.net/tech/Deployment-for-cheapskates/">&lt;p&gt;Ok, so &lt;a href=&quot;https://www.heroku.com&quot;&gt;Heroku&lt;/a&gt; is a nice but kind of expensive PaaS solution for hobby projects ($7/month per dyno). They have a free tier but apps there are deployed on-demand and thus have a nasty start-up delay. In addition, they will autosleep after 30min. As an alternative, let’s use &lt;a href=&quot;http://dokku.viewdocs.io/dokku/&quot;&gt;Dokku&lt;/a&gt;, the “Docker powered mini-Heroku”. They also have another nice slogan: “Own Your PaaS. Infrastructure at a fraction of the cost”. Sounds good to me. In essence, Dokku gives you your own Heroku. After installation you can push Heroku-compatible applications via git. 
They autobuild using Heroku buildpacks and then run in isolated containers.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/dokku.png&quot; alt=&quot;Meet dokku&quot; /&gt;
  
    &lt;figcaption&gt;
      Meet dokku: The smallest PaaS implementation you’ve ever seen

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;To host Dokku, there are multiple options and people seem to like &lt;a href=&quot;https://www.digitalocean.com&quot;&gt;Digital Ocean&lt;/a&gt;, who provide a Dokku droplet so you can start deploying pretty much right away. But we want to go even cheaper (cheapskate, you know)! As an alternative, I will present a solution that’ll cost you about 3 bucks a month and can potentially host multiple apps (depending on your RAM and disk requirements).&lt;/p&gt;

&lt;h2 id=&quot;start-a-virtual-private-cloud-instance&quot;&gt;Start a Virtual Private Cloud instance&lt;/h2&gt;
&lt;p&gt;We need a virtual server/ Virtual Private Cloud (VPC) instance to install Dokku. I chose a German hosting company called &lt;a href=&quot;https://www.hetzner.de&quot;&gt;Hetzner&lt;/a&gt;. I use the &lt;a href=&quot;https://www.hetzner.de/cloud&quot;&gt;base-level CX11 vCPU instance&lt;/a&gt; that features 1 vCPU, 2GB of RAM, 20GB of NVMe SSD storage and 20TB of traffic. Then I chose the Linux system for the instance (I opted for &lt;a href=&quot;http://releases.ubuntu.com/18.04/&quot;&gt;Ubuntu 18.04 LTS&lt;/a&gt;). Next I created a passwordless ssh key and added the public key to the instance.&lt;/p&gt;

&lt;h2 id=&quot;install-dokku&quot;&gt;Install Dokku&lt;/h2&gt;
&lt;p&gt;Now we ssh into the instance and first set the hostname (replace the IP and hostname with your info).&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;100.100.100.100 dokku.mydomain.net dokku&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; /etc/hosts
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;We also add the dokku repository to the system package control (incl. GPG keys).&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;wget &lt;span class=&quot;nt&quot;&gt;-nv&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-O&lt;/span&gt; - https://packagecloud.io/gpg.key | apt-key add -
&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;deb https://packagecloud.io/dokku/dokku/ubuntu/ bionic main&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; /etc/apt/sources.list.d/dokku.list
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then we proceed to update the packages and install docker dependencies.&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get update
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get install apt-transport-https ca-certificates &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
	curl software-properties-common
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now we add another repository (this time for docker):&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
    &quot;deb [arch=amd64] https://download.docker.com/linux/ubuntu \
    $(lsb_release -cs) stable&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Finally we install docker-ce and dokku to the system.&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# When asked, select YES to enable web setup&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get update
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get upgrade

&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get install docker-ce
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get install dokku
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You should now also have a new user and group &lt;em&gt;dokku&lt;/em&gt; in the system:&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;id dokku
&lt;span class=&quot;c&quot;&gt;# uid=1000(dokku) gid=1000(dokku) groups=1000(dokku),4(adm),999(docker)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If all went well the service should be running by default after the installation (service name: &lt;em&gt;dokku-installer.service&lt;/em&gt;). To check type this into your terminal:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;systemctl status dokku-installer.service
systemctl is-enabled dokku-installer.service
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Finish the installation by adding core dependencies:&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;dokku plugin:install-dependencies &lt;span class=&quot;nt&quot;&gt;--core&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On the machine use the &lt;em&gt;dokku&lt;/em&gt; command to start and stop apps, see logs and configure things. See the &lt;a href=&quot;http://dokku.viewdocs.io/dokku~v0.12.13/getting-started/installation&quot;&gt;help pages&lt;/a&gt; for details.&lt;/p&gt;

&lt;h2 id=&quot;deploy-an-app&quot;&gt;Deploy an app&lt;/h2&gt;
&lt;p&gt;Dokku relies on git for deployment. First, make sure you have a repository set up on your local machine (i.e. &lt;em&gt;git init, …&lt;/em&gt;). Then you need to add a deployment remote where the app will be pushed to (this is the dokku server you just set up). In the command below, &lt;em&gt;my-app&lt;/em&gt; will also be used by dokku to create your app subdomain when deployed. Thus, the command will host your app at &lt;em&gt;my-app.mydomain.net&lt;/em&gt;. With git push you simply deploy, or trigger a rebuild if the app already exists. &lt;strong&gt;Done.&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git remote add dokku dokku@mydomain.net:my-app
git push dokku master
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; For deploying multiple apps you also want a domain hoster that allows you to set &lt;a href=&quot;https://en.wikipedia.org/wiki/Wildcard_DNS_record&quot;&gt;wildcard DNS records&lt;/a&gt;. I started using &lt;a href=&quot;https://porkbun.com&quot;&gt;Porkbun&lt;/a&gt; for hosting my apps at &lt;a href=&quot;http://cwerner.ai&quot;&gt;http://cwerner.ai&lt;/a&gt; (note that this address redirects; to see a live app go &lt;a href=&quot;http://guitars.cwerner.ai&quot;&gt;here&lt;/a&gt;). They are cheap, offer good features (wildcard DNS entries included), and people seem to like them.&lt;/p&gt;

&lt;p&gt;A wildcard DNS record looks like this (replace the ip-address with your IP from your VPC instance):&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;A   *.mydomain.net  100.100.100.100  
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;pitfalls&quot;&gt;Pitfalls&lt;/h2&gt;
&lt;p&gt;There are some things that you should watch out for.&lt;/p&gt;

&lt;p&gt;If you are ever stuck with this error message&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; ! [remote rejected] master -&amp;gt; master (pre-receive hook declined)
error: failed to push some refs to 'dokku@mydomain.net:my-app'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;there are a couple of things you should check:&lt;/p&gt;

&lt;h3 id=&quot;insufficient-memory&quot;&gt;Insufficient memory&lt;/h3&gt;
&lt;p&gt;Check that your server’s disk and memory are sufficient. I discovered that the Ubuntu 18.04 image has only a small swapfile allocated so I manually increased that to 4GB just to be sure.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# increase swapfile to 4G&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;swapoff &lt;span class=&quot;nt&quot;&gt;-a&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;dd &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;/dev/zero &lt;span class=&quot;nv&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;/swapfile &lt;span class=&quot;nv&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;1M &lt;span class=&quot;nv&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;4096
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;chmod 600 /swapfile
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;mkswap /swapfile
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;swapon /swapfile
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
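
&lt;p&gt;As a sanity check on the dd arguments: dd writes bs times count bytes, so the block size and the block count have to multiply to the intended swap size. A minimal sketch, assuming a block size of 1M (1 MiB); the helper name &lt;em&gt;swap_block_count&lt;/em&gt; is hypothetical.&lt;/p&gt;

```shell
# dd writes bs * count bytes. With bs=1M (1 MiB), a swapfile of N GiB
# needs N * 1024 blocks.
# $1: desired swap size in GiB
swap_block_count() {
    echo $(( $1 * 1024 ))
}

# Print (do not run) the dd command for a 4 GiB swapfile
echo "sudo dd if=/dev/zero of=/swapfile bs=1M count=$(swap_block_count 4)"

# After recreating the swapfile, these commands report the active swap size:
#   swapon --show
#   free -h
```

&lt;p&gt;A small block size keeps the memory dd needs for its buffer low, which matters on a 2GB instance.&lt;/p&gt;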

&lt;h3 id=&quot;insufficient-disk-space&quot;&gt;Insufficient disk space&lt;/h3&gt;

&lt;p&gt;Docker can really fill up your hard drive. So if you experiment a lot it might also be a good idea to clean up old stuff like so:&lt;/p&gt;

&lt;p&gt;Delete volumes:&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# https://github.com/chadoe/docker-cleanup-volumes&lt;/span&gt;
docker volume rm &lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;docker volume &lt;span class=&quot;nb&quot;&gt;ls&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-qf&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dangling&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt;
docker volume &lt;span class=&quot;nb&quot;&gt;ls&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-qf&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dangling&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;true&lt;/span&gt; | xargs &lt;span class=&quot;nt&quot;&gt;-r&lt;/span&gt; docker volume rm
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Delete images:&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# http://stackoverflow.com/questions/32723111/how-to-remove-old-and-unused-docker-images&lt;/span&gt;
docker images
docker rmi &lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;docker images &lt;span class=&quot;nt&quot;&gt;--filter&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;dangling=true&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-q&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--no-trunc&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt;

docker images | &lt;span class=&quot;nb&quot;&gt;grep&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;none&quot;&lt;/span&gt;
docker rmi &lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;docker images | &lt;span class=&quot;nb&quot;&gt;grep&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;none&quot;&lt;/span&gt; | awk &lt;span class=&quot;s1&quot;&gt;'/ / { print $3 }'&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Delete containers:&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# http://stackoverflow.com/questions/32723111/how-to-remove-old-and-unused-docker-images&lt;/span&gt;
docker ps
docker ps &lt;span class=&quot;nt&quot;&gt;-a&lt;/span&gt;
docker rm &lt;span class=&quot;k&quot;&gt;$(&lt;/span&gt;docker ps &lt;span class=&quot;nt&quot;&gt;-qa&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--no-trunc&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--filter&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;status=exited&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If this stuff does not help you might need to upgrade your machine or attach a &lt;a href=&quot;https://www.hetzner.de/cloud&quot;&gt;volume&lt;/a&gt; to your instance to offload stuff.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I think this is a neat way to bring apps to life for cheap. Plus, you also learn a bit about PaaS and devops along the way. I will illustrate how to actually build a Python-based web app in another post soon.&lt;/p&gt;</content><author><name>Christian Werner</name></author><category term="Devops" /><category term="Cloud" /><summary type="html">A short rundown of how to use Dokku to deploy web apps and machine learning models on your own system. Get your very own PaaS system on the cheap by using open source software and a low-cost virtual private cloud instance.</summary></entry><entry><title type="html">Guitar classification revisited</title><link href="https://www.christianwerner.net/tech/Guitar-classification-revisited/" rel="alternate" type="text/html" title="Guitar classification revisited" /><published>2018-10-29T00:00:00+01:00</published><updated>2018-10-29T00:00:00+01:00</updated><id>https://www.christianwerner.net/tech/Guitar-classification-revisited</id><content type="html" xml:base="https://www.christianwerner.net/tech/Guitar-classification-revisited/">&lt;p&gt;In a &lt;a href=&quot;https://www.christianwerner.net/tech/Is-this-a-Les-Paul-or-is-it-a-Strat&quot;&gt;previous post&lt;/a&gt; I created a guitar model classifier that was capable of discriminating between two iconic guitars (the &lt;a href=&quot;https://en.wikipedia.org/wiki/Gibson_Les_Paul&quot;&gt;Gibson Les Paul&lt;/a&gt; and the &lt;a href=&quot;https://en.wikipedia.org/wiki/Fender_Stratocaster&quot;&gt;Fender Stratocaster&lt;/a&gt;). Using a ResNet-34 architecture and the fastai v0.7 deep learning python package I created a model that could predict the right class with 95.4% accuracy.&lt;br /&gt;
As a next step I wanted to expand this to a multi-class classification. Since then, a &lt;a href=&quot;https://www.fast.ai/2018/10/02/fastai-ai/&quot;&gt;new version of fastai&lt;/a&gt; has launched that builds on the pre-release of &lt;a href=&quot;https://pytorch.org/blog/the-road-to-1_0/&quot;&gt;PyTorch 1.0&lt;/a&gt; and features a quite different architecture and many substantial improvements. The following results are based on version 1.0.26 (but please check for updates, the library is currently changing almost daily).&lt;/p&gt;

&lt;h2 id=&quot;fastai-v1-quick-intro&quot;&gt;Fast.ai v1 quick intro&lt;/h2&gt;

&lt;p&gt;One of the greatest improvements in version 1.0 is a proper &lt;a href=&quot;https://docs.fast.ai&quot;&gt;documentation page&lt;/a&gt;. Due to the rapid development it also changes almost daily, but most foundational APIs seem to be in place now. Notably, the documentation is built from &lt;a href=&quot;https://jupyterlab.readthedocs.io/en/stable/#&quot;&gt;Jupyter notebooks&lt;/a&gt;, so all code can be run from the &lt;em&gt;docs_src&lt;/em&gt; directory of the &lt;a href=&quot;https://github.com/fastai/fastai/tree/master/docs_src&quot;&gt;GitHub repository&lt;/a&gt;. fastai v1 consists of four applications: &lt;a href=&quot;https://docs.fast.ai/vision.html&quot;&gt;vision&lt;/a&gt; (image classification, segmentation, etc.), &lt;a href=&quot;https://docs.fast.ai/text.html&quot;&gt;text&lt;/a&gt; (natural language processing), &lt;a href=&quot;https://docs.fast.ai/tabular.html&quot;&gt;tabular&lt;/a&gt; (structured data), and &lt;a href=&quot;https://docs.fast.ai/collab.html&quot;&gt;collab&lt;/a&gt; (collaborative filtering) [see Figure 1]. Apart from collab, all applications are structured into transformations (data pre-processing, data augmentation, tokenization, etc.), data (dataset classes for the specific use case), and models (actual model architectures). Data and models are combined into a learner.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/fastai1_structure.png&quot; alt=&quot;General structure of fastai v1 (source: docs.fast.ai).&quot; /&gt;
  
    &lt;figcaption&gt;
      General structure of fastai v1 (source: docs.fast.ai).

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;One big change occurred at version 1.0.22 (I think, things move fast) when the &lt;a href=&quot;https://docs.fast.ai/data_block.html&quot;&gt;data block API&lt;/a&gt; was introduced. I will use this API in this post instead of the older dedicated vision tools as it’s a more generic option and also translates nicely to the other fastai applications (tabular, text, and collab). In general, you chain a few steps (where the items come from, how to split them into training and validation sets, how to label them, and which transformations to apply) that together create a &lt;a href=&quot;https://docs.fast.ai/basic_data.html#DataBunch&quot;&gt;DataBunch&lt;/a&gt;, the container that holds training, validation and test data as well as information about data augmentation.&lt;/p&gt;

&lt;h2 id=&quot;building-a-dataset-with-fastclass&quot;&gt;Building a dataset with fastclass&lt;/h2&gt;

&lt;p&gt;Before we start we need a dataset of images. You can use one of the &lt;a href=&quot;https://docs.fast.ai/datasets.html&quot;&gt;provided datasets&lt;/a&gt;, get your data from &lt;a href=&quot;https://www.kaggle.com/datasets&quot;&gt;kaggle&lt;/a&gt;, &lt;a href=&quot;https://toolbox.google.com/datasetsearch&quot;&gt;google&lt;/a&gt;, or &lt;a href=&quot;https://www.christianwerner.net/tech/Build-your-image-dataset-faster/&quot;&gt;build your own&lt;/a&gt;. I recently wrote the small toolkit &lt;strong&gt;fastclass&lt;/strong&gt; to make the process of downloading and cleaning a custom dataset easier (see &lt;a href=&quot;https://www.christianwerner.net/tech/Build-your-image-dataset-faster&quot;&gt;blog post&lt;/a&gt; and &lt;a href=&quot;https://github.com/cwerner/fastclass&quot;&gt;GitHub repo&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; I provide the full jupyter notebook &lt;a href=&quot;https://github.com/cwerner/guitars-app/blob/master/nbs/Guitar_Classifier.ipynb&quot;&gt;here&lt;/a&gt;. The dataset can be downloaded from within the notebook off a dropbox &lt;a href=&quot;https://www.dropbox.com/s/2a9oboj6dcoykt0/guitars.tgz?dl=1&quot;&gt;link&lt;/a&gt;. The guitar dataset consists of approx. 8500 images from 11 different guitar classes (five Fender models and six Gibsons).&lt;/p&gt;

&lt;h2 id=&quot;starting-resnet-34&quot;&gt;Starting: ResNet-34&lt;/h2&gt;

&lt;p&gt;Using the new data block API we build a DataBunch from the images in our dataset. Since we reuse this bit of code later I divided it into two parts (training and validation data split; image transformation):&lt;/p&gt;

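&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pathlib&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;fastai.vision&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# src holds the train/validation split and is reused later, so every&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# DataBunch sees the same validation images (avoids data leakage).&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# classes: list of guitar class names, assumed to be defined beforehand&lt;/span&gt;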
&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ImageItemList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_folder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pathlib&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cwd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'data/guitars'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
           &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;random_split_by_pct&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
           &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;label_from_folder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;classes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;classes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# Return a DataBunch with specified image and batch&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# size&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;get_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;224&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;get new databunch with requested resolution&quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;transform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_transforms&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;do_flip&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
               &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;databunch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;normalize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;imagenet_stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# example: get a databunch with images of size 299x299 and&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#          a batch size of 32&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;299&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
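
&lt;p&gt;To see why reusing &lt;em&gt;src&lt;/em&gt; keeps the validation set fixed, here is a minimal plain-Python sketch of a random percentage split. It is not the fastai implementation; the function name &lt;em&gt;split_by_pct&lt;/em&gt;, the 20% default, the seed and the file names are assumptions made for illustration.&lt;/p&gt;

```python
import random

# Hypothetical helper (not part of fastai): hold out a random share of
# the items for validation. A fixed seed makes the split repeatable.
def split_by_pct(items, valid_pct=0.2, seed=42):
    rng = random.Random(seed)
    order = list(range(len(items)))
    rng.shuffle(order)
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in order[:cut]]
    train = [items[i] for i in order[cut:]]
    return train, valid

# Made-up file names, just to exercise the function
files = ['img_{:02d}.jpg'.format(i) for i in range(10)]
train, valid = split_by_pct(files)
print(len(train), len(valid))
```

&lt;p&gt;Because the split is made once and stored, changing the image size or batch size later does not move images between the training and validation sets.&lt;/p&gt;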

&lt;p&gt;To have a quick look we can display a sample of the data with:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;show_batch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rows&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;figsize&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;figure class=&quot;align-center width-75&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/sample_of_guitars_from_batch.png&quot; alt=&quot;A batch of images from the dataset.&quot; /&gt;
  
    &lt;figcaption&gt;
      A batch of images from the dataset.

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;To get started we do not train a model from scratch, as a model pretrained on a large image dataset is usually preferable to learning from random weight initializations. Using &lt;strong&gt;transfer learning&lt;/strong&gt; we can leverage the substantial compute effort that went into an existing model (&lt;a href=&quot;https://arxiv.org/pdf/1512.03385.pdf&quot;&gt;ResNet-34&lt;/a&gt; in this case, trained on &amp;gt; 1 million &lt;a href=&quot;http://www.image-net.org&quot;&gt;ImageNet&lt;/a&gt; images). We will reuse this knowledge and replace the &lt;em&gt;head&lt;/em&gt; of the model with a new set of fully-connected layers dedicated to our classification task.&lt;br /&gt;
We start with the ResNet-34 base model and a DataBunch containing images of size 224x224 (bs=64). To track our progress we specify the &lt;em&gt;error_rate&lt;/em&gt; as a metric. First, we run the learning rate finder to determine a learning rate that improves our model quickly.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;learn = create_cnn(data, models.resnet34, path='.', metrics=error_rate)
learn.lr_find(); 
learn.recorder.plot()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;figure class=&quot;align-center width-half&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/lrfinder1.png&quot; alt=&quot;Using the learning rate finder to determine the optimum learning rate.&quot; /&gt;
  
    &lt;figcaption&gt;
      Using the learning rate finder to determine the optimum learning rate.

    &lt;/figcaption&gt;&lt;/figure&gt;
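
&lt;p&gt;Conceptually, a learning rate finder trains for a small number of iterations while increasing the learning rate exponentially and recording the loss. The sketch below only reproduces that sweep and a simple rule of thumb for reading the result. The function names, the sweep bounds and the loss values are made up for illustration; they are not taken from fastai or from the plot above.&lt;/p&gt;

```python
# Learning rates spaced evenly on a log scale between start_lr and end_lr
def lr_sweep(start_lr, end_lr, num_it):
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]

# Rule of thumb: take the learning rate at the lowest recorded loss
# and step one order of magnitude down from it
def pick_lr(lrs, losses):
    best = min(range(len(losses)), key=losses.__getitem__)
    return lrs[best] / 10

lrs = lr_sweep(1e-4, 1.0, 5)             # five rates, a factor of 10 apart
toy_losses = [2.0, 1.5, 0.9, 0.6, 3.0]   # made-up loss per learning rate
print(pick_lr(lrs, toy_losses))
```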

&lt;p&gt;The plot shows how the loss changes with the learning rate. We want a learning rate that is as large as possible, so that training takes big steps, while the loss is still decreasing. As a rule of thumb we thus find the lowest point on the curve before the loss shoots up again and go one order of magnitude to the left (0.01 in this case). We then train the model for five epochs using the &lt;em&gt;fit_one_cycle()&lt;/em&gt; method. The &lt;a href=&quot;https://sgugger.github.io/the-1cycle-policy.html&quot;&gt;one cycle policy&lt;/a&gt; is a technique for setting the hyperparameters (learning rate, momentum and weight decay) so that complex models train quickly and efficiently (it’s the standard approach in fastai). First, we want the biggest possible learning rate (determined by &lt;em&gt;lr_find()&lt;/em&gt;) to explore the weight space efficiently. Second, the learning rate changes in a cycle from a low value (10 times lower than the &lt;em&gt;lr_find()&lt;/em&gt; result) up to the maximum and then back down again. &lt;a href=&quot;https://sgugger.github.io/the-1cycle-policy.html&quot;&gt;It was observed&lt;/a&gt; that the high learning rates in the middle of a cycle also act as a regularization method that helps prevent overfitting. In addition, the momentum of the stochastic gradient descent (SGD) is altered in an anti-cyclical pattern.&lt;/p&gt;
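
&lt;p&gt;The shape of such a schedule can be sketched in a few lines of plain Python. This is a simplified, piecewise-linear version and not the fastai implementation; the function name, the momentum bounds (0.95 and 0.85) and the factor of 10 between the lowest and highest learning rate are assumptions for illustration.&lt;/p&gt;

```python
# Hypothetical helper: learning rate and momentum at a given training step.
# The learning rate rises linearly to lr_max at mid-cycle and falls back,
# while the momentum moves in the opposite direction.
def one_cycle(step, total_steps, lr_max, div=10.0, mom_hi=0.95, mom_lo=0.85):
    # position in the cycle: 0 at both ends, 1 in the middle
    pos = 1.0 - abs(2.0 * step / (total_steps - 1) - 1.0)
    lr_min = lr_max / div
    lr = lr_min + pos * (lr_max - lr_min)
    mom = mom_hi - pos * (mom_hi - mom_lo)
    return lr, mom

# Print a few points along a cycle of 101 steps
for step in (0, 25, 50, 75, 100):
    lr, mom = one_cycle(step, total_steps=101, lr_max=0.01)
    print(step, round(lr, 4), round(mom, 3))
```

&lt;p&gt;The learning rate peaks in the middle of the cycle, exactly where the momentum is lowest.&lt;/p&gt;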

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;lr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.01&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fit_one_cycle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;slice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After only 4:34 min on a &lt;a href=&quot;https://www.nvidia.com/en-gb/data-center/tesla-k80/&quot;&gt;K80 GPU&lt;/a&gt; we already have a model capable of predicting the right guitar model from a set of eleven classes with 95.1% accuracy!&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Total &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt;: 04:34
epoch  train_loss  valid_loss  error_rate
1      0.928203    0.450700    0.162353    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;00:57&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
2      0.536406    0.311858    0.098824    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;00:54&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
3      0.347128    0.200679    0.065882    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;00:54&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
4      0.225095    0.175412    0.053529    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;00:54&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
5      0.162299    0.157303    0.049412    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;00:54&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We save the model and proceed to improve it by also fine-tuning the lower layers of the architecture (until now we only trained the new &lt;em&gt;head&lt;/em&gt; of the model). First, we unfreeze the model (now all weights will be trained) and run the learning rate finder again to determine a suitable learning rate.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'guitars-v1-11cl-res34-224px-01'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unfreeze&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lr_find&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recorder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We set the learning rate to 1e-05 for the lowest layers of the model and 0.005 for the head (this is called a &lt;strong&gt;discriminative learning rate&lt;/strong&gt;) as we do not want to destroy the learned features in the lowest layers. Those detect simple features (edges, gradients, simple patterns) that should be pretty universal for all kinds of images. We got them from the pretrained model for free, and they are based on the model learning from over a million images. After five more epochs (another 6:20 min of training) we end up with a model that predicts the right class with 97.3% accuracy.&lt;/p&gt;
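
&lt;p&gt;One way to picture discriminative learning rates is to spread the values between the lowest and the highest rate evenly on a log scale, one per group of layers. The helper below is a sketch under that assumption; the function name and the choice of three layer groups are illustrative and not read from the fastai source.&lt;/p&gt;

```python
# Hypothetical helper: one learning rate per layer group, from the
# earliest layers (lr_low) to the head (lr_high), spaced geometrically.
# Needs at least two groups.
def spread_lrs(lr_low, lr_high, n_groups=3):
    ratio = (lr_high / lr_low) ** (1.0 / (n_groups - 1))
    return [lr_low * ratio ** i for i in range(n_groups)]

for group, lr in enumerate(spread_lrs(1e-5, 0.005)):
    print('layer group', group, 'lr = {:.2e}'.format(lr))
```

&lt;p&gt;The earliest layers barely move, while the head keeps a learning rate several hundred times larger.&lt;/p&gt;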

&lt;p&gt;When we inspect the confusion matrix of the model, we can see where the model gets it wrong.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# plot a confusion matrix&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;interp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ClassificationInterpretation&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_learner&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;interp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot_confusion_matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;figsize&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# show largest classification errors&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;display&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;interp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;most_confused&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;min_val&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;figure class=&quot;align-center width-75&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/confusionmatrix1.png&quot; alt=&quot;Confusion matrix for our customized ResNet-34 model.&quot; /&gt;
  
    &lt;figcaption&gt;
      Confusion matrix for our customized ResNet-34 model.

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;It seems the model has a hard time differentiating between a Fender Jaguar and a Jazzmaster (who wouldn’t, they are super similar). Ditto for the Gibson ES and Les Paul (here, special models exist that borrow features from the other guitar range, e.g. f-holes, pickup configurations, …).&lt;/p&gt;
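
&lt;p&gt;The idea behind such a ranking of errors is simple: count every (actual, predicted) pair where the two labels differ and sort by frequency. Here is a minimal sketch with made-up labels (these are not the results of the model above, and the function name is hypothetical):&lt;/p&gt;

```python
from collections import Counter

# Count misclassified (actual, predicted) pairs, most frequent first
def count_confusions(actual, predicted):
    pairs = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    return [(a, p, n) for (a, p), n in pairs.most_common()]

# Toy labels for illustration only
actual = ['jaguar', 'jaguar', 'jazzmaster', 'lespaul', 'es', 'jaguar']
predicted = ['jazzmaster', 'jazzmaster', 'jazzmaster', 'lespaul', 'lespaul', 'jaguar']
print(count_confusions(actual, predicted))
```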

&lt;h2 id=&quot;level-up-resnet-50&quot;&gt;Level up: ResNet-50&lt;/h2&gt;

&lt;p&gt;While this result is already quite impressive, we so far only used a relatively simple model architecture. We now progress to &lt;a href=&quot;https://arxiv.org/pdf/1512.03385.pdf&quot;&gt;ResNet-50&lt;/a&gt;, which features substantially more layers and thus more weights that can potentially learn more features of our data. To not exceed our GPU memory we now have to reduce the batch size from 64 to 32.&lt;/p&gt;

&lt;p&gt;First, we build a new DataBunch with the same train/validation split but the smaller bs=32. We then create a new model based on the ResNet-50 architecture and run our learning rate finder again (the optimum learning rate seems to be 0.01). We then immediately train the model for five epochs.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;224&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;create_cnn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;models&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resnet50&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'.'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;metrics&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;error_rate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freeze&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lr_find&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt; 
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recorder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;lr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.01&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fit_one_cycle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;slice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then, we again train the entire model architecture with discriminative learning rates:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'guitars-v1-11cl-res50-224px-01'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unfreeze&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lr_find&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recorder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fit_one_cycle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;slice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1e-6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;save&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'guitars-v1-11cl-res50-224px-02'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After these 2x5 epochs we now have an accuracy of 98%.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Total &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt;: 10:28
epoch  train_loss  valid_loss  error_rate
1      0.591651    0.319690    0.105294    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;02:11&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
2      0.461109    0.398586    0.115294    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;02:04&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
3      0.292784    0.192599    0.067647    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;02:04&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
4      0.178708    0.128503    0.041176    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;02:04&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
5      0.098652    0.102441    0.033529    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;02:03&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;

Total &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt;: 13:47
epoch  train_loss  valid_loss  error_rate
1      0.122527    0.123221    0.032353    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;02:46&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
2      0.131352    0.129197    0.040588    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;02:45&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
3      0.084470    0.085018    0.028235    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;02:45&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
4      0.055003    0.071305    0.022353    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;02:45&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
5      0.035648    0.065091    0.020000    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;02:45&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;progressive-resizing&quot;&gt;Progressive resizing&lt;/h2&gt;

&lt;p&gt;In order to improve the model even more, we now use a technique called &lt;strong&gt;progressive resizing&lt;/strong&gt;. We feed the model larger versions of our images (448x448px instead of the previous 224x224) and again reduce our batch size (bs=16).&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# load the previous model version from storage&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'guitars-v1-11cl-res50-224px-02'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# feed the new data (448x448px)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;448&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freeze&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The learning rate finder suggests a maximum learning rate of 0.001, so we train the head of the model for five epochs.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;lr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.001&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fit_one_cycle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;slice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;save&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'guitars-v1-11cl-res50-448px-01'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With the bigger architecture and substantially larger images we now have to wait for 38 minutes.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Total &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt;: 37:46
epoch  train_loss  valid_loss  error_rate
1      0.183134    0.086426    0.028235    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;07:42&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
2      0.099537    0.067973    0.020588    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;07:30&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
3      0.091131    0.060259    0.015294    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;07:31&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
4      0.062417    0.050117    0.013529    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;07:30&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
5      0.049533    0.048065    0.013529    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;07:31&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;However, as you can see, the accuracy of the model improved drastically! Compared to the previous model, we now have an accuracy of 98.6% (a relative error rate improvement of 30%!). As before, we then also train the full model.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'guitars-v1-11cl-res50-448px-01'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unfreeze&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fit_one_cycle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;slice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1e-06&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This takes even longer (49:52 min):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Total &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt;: 49:52
epoch  train_loss  valid_loss  error_rate
1      0.067349    0.044458    0.015294    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;10:02&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
2      0.055378    0.056939    0.015294    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;09:57&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
3      0.050544    0.045030    0.011765    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;09:57&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
4      0.034476    0.040948    0.012353    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;09:57&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
5      0.032105    0.041326    0.011765    &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;09:57&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We improve the accuracy again: the final model now has an accuracy of &lt;strong&gt;98.8%&lt;/strong&gt;. If we check the confusion matrix we see that almost all validation files are predicted correctly.&lt;/p&gt;

&lt;figure class=&quot;align-center width-75&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/confusionmatrix2.png&quot; alt=&quot;Confusion matrix of the final model.&quot; /&gt;
  
    &lt;figcaption&gt;
      Confusion matrix of the final model.

    &lt;/figcaption&gt;&lt;/figure&gt;
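
&lt;p&gt;The accuracy figures quoted above follow directly from the error_rate column of the two training logs. The following is a minimal Python sketch of that arithmetic; the two helper functions are illustrative names and not part of fastai.&lt;/p&gt;

```python
# error_rate values copied from the last epoch of the two training logs above
head_only_error = 0.013529   # head only, body frozen
full_model_error = 0.011765  # full model, unfrozen


def accuracy_percent(error_rate):
    """Convert a validation error rate into an accuracy in percent."""
    return (1.0 - error_rate) * 100.0


def relative_error_reduction(old_error, new_error):
    """Fraction by which the error rate shrank going from old to new."""
    return (old_error - new_error) / old_error


print(round(accuracy_percent(head_only_error), 1))   # 98.6
print(round(accuracy_percent(full_model_error), 1))  # 98.8
print(round(relative_error_reduction(head_only_error, full_model_error) * 100, 1))
```

&lt;p&gt;Going from the head-only run to the fully unfrozen run cuts the validation error by roughly 13% in relative terms.&lt;/p&gt;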

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;As shown, it takes relatively little effort to build a custom image classifier capable of very high accuracy. Using a deep learning library like fastai, a pre-trained model architecture, a reasonably sized dataset and some tricks can get you a long way!&lt;/p&gt;

&lt;h2 id=&quot;whats-next&quot;&gt;What’s next&lt;/h2&gt;

&lt;p&gt;In the next blog posts I will look at Class Activation Maps to see which regions of an image actually ‘trigger’ the classification. Furthermore, I want to write a small post about how to deploy the model with a &lt;a href=&quot;http://guitars.cwerner.ai&quot;&gt;flask web app&lt;/a&gt;. So stay tuned.&lt;/p&gt;

&lt;p&gt;The notebook can be found &lt;a href=&quot;https://github.com/cwerner/guitars-app/blob/master/nbs/Guitar_Classifier.ipynb&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;</content><author><name>Christian Werner</name></author><category term="CNN" /><category term="fastai" /><category term="Python" /><category term="Guitar" /><category term="Deep Learning" /><summary type="html">A step-by-step description of how to use the new fastai v1 deep learning toolbox to build a state-of-the-art image classifier for your classification goal with less than 2 hours of model training.</summary></entry><entry><title type="html">Build your image dataset faster</title><link href="https://www.christianwerner.net/tech/Build-your-image-dataset-faster/" rel="alternate" type="text/html" title="Build your image dataset faster" /><published>2018-10-25T00:00:00+02:00</published><updated>2018-10-25T00:00:00+02:00</updated><id>https://www.christianwerner.net/tech/Build-your-image-dataset-faster</id><content type="html" xml:base="https://www.christianwerner.net/tech/Build-your-image-dataset-faster/">&lt;p&gt;If there is one thing cumbersome in doing &lt;a href=&quot;https://en.wikipedia.org/wiki/Deep_learning&quot;&gt;deep learning&lt;/a&gt; - apart from fiddling around with hyper parameters - it is to actually &lt;em&gt;get&lt;/em&gt; the data to train on in the first place. You can download some excellent training datasets from &lt;a href=&quot;https://www.kaggle.com/datasets&quot;&gt;Kaggle&lt;/a&gt;, but if you want to solve your own tasks you’ll have to build your very own image dataset.&lt;/p&gt;

&lt;figure class=&quot;align-right width-half&quot;&gt;
  &lt;img src=&quot;https://imgs.xkcd.com/comics/is_it_worth_the_time.png&quot; alt=&quot;xkcd: Is It Worth the Time?&quot; /&gt;
  
    &lt;figcaption&gt;
      xkcd: Is It Worth the Time?

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;Luckily &lt;a href=&quot;http://cs231n.github.io/transfer-learning/&quot;&gt;transfer learning&lt;/a&gt; drastically reduces the required number of images for most classification problems, but you still have to come up with 100s to 1000s of images and (depending on the accuracy you’re after and the number of classes you require) this can be challenging.&lt;/p&gt;

&lt;p&gt;Recently I struggled with this problem myself and after consulting the xkcd time vs. effort chart I created the python package &lt;a href=&quot;https://github.com/cwerner/fastclass&quot;&gt;fastclass&lt;/a&gt; to make the process less painful.&lt;/p&gt;

&lt;h2 id=&quot;fastclass&quot;&gt;FastClass&lt;/h2&gt;

&lt;p&gt;You can get the script by simply installing from my GitHub like so:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;pip install git+https://github.com/cwerner/fastclass.git#egg&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;fastclass
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This will install two scripts in your $PATH: &lt;strong&gt;fastclass download (fcd)&lt;/strong&gt; to pull images from various sites on the web, and &lt;strong&gt;fastclass clean (fcc)&lt;/strong&gt; to visually inspect the often messy results of such internet crawling.&lt;/p&gt;

&lt;h3 id=&quot;step-1-fastclass-download&quot;&gt;Step 1: FastClass download&lt;/h3&gt;

&lt;p&gt;To download image categories from the net you first need to create a query csv file. The package comes with an example that should be located in the install location (your site-packages/fastclass folder).&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;head &lt;span class=&quot;nt&quot;&gt;-n&lt;/span&gt; 3 example/guitars.csv
searchterm,exclude
guitar gibson les paul,guitar
guitar gibson SG,guitar
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the example, 25 different search terms are listed (column searchterm). In addition you specify exclusion terms. These are keywords you need for a successful search but don’t want to use as class labels (search and exclusion terms are separated with whitespace).&lt;/p&gt;
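
&lt;p&gt;To make the relation between search terms, exclusion terms and class labels concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the actual fastclass code: the helper names class_label and read_queries are hypothetical, and joining the remaining words with underscores (keeping their original case) is assumed from the folder name guitars/gibson_les_paul used later in this post.&lt;/p&gt;

```python
import csv
import io

# First rows of example/guitars.csv as shown above
QUERY_CSV = """searchterm,exclude
guitar gibson les paul,guitar
guitar gibson SG,guitar
"""


def class_label(searchterm, exclude):
    """Drop the exclusion words from the search term and join the rest with underscores."""
    excluded = {word.lower() for word in exclude.split()}
    kept = [word for word in searchterm.split() if word.lower() not in excluded]
    return "_".join(kept)


def read_queries(text):
    """Return (searchterm, label) pairs for every row of the query CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row["searchterm"], class_label(row["searchterm"], row["exclude"] or ""))
        for row in reader
    ]


for term, label in read_queries(QUERY_CSV):
    print(term, "=", label)
```

&lt;p&gt;The full phrase (including the word guitar) is what gets sent to the search engines, while only the shortened label would end up as the class name.&lt;/p&gt;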

&lt;p&gt;You start the download from the command line:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; fcd &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; ALL &lt;span class=&quot;nt&quot;&gt;-k&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; guitars example/guitars.csv 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;figure class=&quot;align-right width-half&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/screenshot_fcd.png&quot; alt=&quot;FastClass download&quot; /&gt;
  
    &lt;figcaption&gt;
      FastClass download

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;This will use all three search crawlers (Google, Bing, and Baidu), resize any image it downloads to the default size (299x299px) but also keep the originals, and store the files in the folder ‘guitars’. For details just use the help flag (‘-h’).&lt;/p&gt;

&lt;p&gt;When the script is finished you will find subfolders for each row of your query csv file in the specified dataset folder. Furthermore, a log file containing the source URL for each image is written. The source URL is also embedded in the resized images as an EXIF tag (UserComment). Duplicated images are detected and removed automatically.&lt;/p&gt;
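
&lt;p&gt;How fastclass detects duplicates is not covered in this post. As a rough idea of what such a step can look like, the following Python sketch flags files whose bytes are identical by comparing SHA-256 digests. The function names are hypothetical, and this simple approach would miss copies that were resized or re-compressed.&lt;/p&gt;

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def find_duplicates(folder):
    """Return paths whose content matches an earlier file in the folder (sorted by name)."""
    seen = {}
    duplicates = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        digest = file_digest(path)
        if digest in seen:
            # same bytes as a file we already kept
            duplicates.append(path)
        else:
            seen[digest] = path
    return duplicates
```

&lt;p&gt;Calling find_duplicates on one of the category subfolders returns the list of redundant files, which can then be reviewed or deleted.&lt;/p&gt;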

&lt;h3 id=&quot;step-2-fastclass-clean&quot;&gt;Step 2: FastClass clean&lt;/h3&gt;

&lt;p&gt;Once the images are located on your drive you can inspect them quickly with the tool &lt;strong&gt;fcc&lt;/strong&gt;. Call it by pointing to the category subfolder you want to inspect:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; fcc guitars/gibson_les_paul 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;figure class=&quot;align-right width-half&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/screenshot_fcc.png&quot; alt=&quot;FastClass clean&quot; /&gt;
  
    &lt;figcaption&gt;
      FastClass clean

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;This will quickly launch a GUI with the first image. Use the arrow keys to navigate. Rate the file or choose a class by pressing the keys [1] to [9]. With [d] you can mark it for deletion and with [x] you terminate the script. Afterwards you will find a copy of the files that were not marked for deletion and a report file with your ratings.&lt;/p&gt;

&lt;p&gt;In future updates I want to improve the interface and possibly store the image information in a database to reduce clutter. I hope it is useful to you. In case of any problems please create an issue at &lt;a href=&quot;https://github.com/cwerner/fastclass/issues&quot;&gt;https://github.com/cwerner/fastclass/issues&lt;/a&gt; or send me a pull request.&lt;/p&gt;</content><author><name>Christian Werner</name></author><category term="CNN" /><category term="Python" /><category term="Deep Learning" /><summary type="html">So you want to build your own image dataset to train a fancy deep learning model? This small tool helps to make generating custom image datasets for building image classifiers less painful.</summary></entry><entry><title type="html">AxeNet - Guitar Classifier App</title><link href="https://www.christianwerner.net/projects/guitar-app/" rel="alternate" type="text/html" title="AxeNet - Guitar Classifier App" /><published>2018-10-01T00:00:00+02:00</published><updated>2018-10-01T00:00:00+02:00</updated><id>https://www.christianwerner.net/projects/guitar-app</id><content type="html" xml:base="https://www.christianwerner.net/projects/guitar-app/">&lt;p&gt;A web application that allows you to classify a guitar image. Using transfer learning, fast.ai and the image downloader fastclass.&lt;/p&gt;

&lt;figure class=&quot;width-half&quot;&gt;
  &lt;img src=&quot;/images/guitar-app.jpeg&quot; alt=&quot;AxeNet Guitar Classifier&quot; /&gt;
  
    &lt;figcaption&gt;
      AxeNet Guitar Classifier App

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;Try it out at &lt;a href=&quot;http://guitars.cwerner.ai&quot;&gt;http://guitars.cwerner.ai&lt;/a&gt;&lt;/p&gt;</content><author><name>Christian Werner</name></author><category term="Python" /><category term="Apps" /><category term="CNN" /><category term="Deep Learning" /><summary type="html">A web application that allows you to classify a guitar image. Using transfer learning, fast.ai and the image downloader fastclass.</summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.christianwerner.net/%7B%22header%22=%3E%22guitar-app.jpeg%22,%20%22teaser%22=%3E%22guitar-app.jpeg%22%7D" /></entry><entry><title type="html">Is this a Les Paul or is this a Strat?</title><link href="https://www.christianwerner.net/tech/Is-this-a-Les-Paul-or-is-it-a-Strat/" rel="alternate" type="text/html" title="Is this a Les Paul or is this a Strat?" /><published>2018-06-16T12:20:00+02:00</published><updated>2018-06-16T12:20:00+02:00</updated><id>https://www.christianwerner.net/tech/Is-this-a-Les-Paul-or-is-it-a-Strat</id><content type="html" xml:base="https://www.christianwerner.net/tech/Is-this-a-Les-Paul-or-is-it-a-Strat/">&lt;p&gt;I recently stumbled upon the most excellent podcast &lt;em&gt;“This week in Machine Learning and Artificial Intelligence”&lt;/em&gt; (&lt;a href=&quot;https://twimlai.com&quot;&gt;TWiML&amp;amp;AI&lt;/a&gt;). &lt;a href=&quot;https://twitter.com/samcharrington&quot;&gt;Sam Charrington&lt;/a&gt; is doing a wonderful job in presenting people and trending topics of all things AI and ML. Go check it out. It’s quality stuff!&lt;/p&gt;

&lt;h2 id=&quot;intro&quot;&gt;Intro&lt;/h2&gt;

&lt;h3 id=&quot;getting-to-know-fastai&quot;&gt;Getting to know fast.ai&lt;/h3&gt;

&lt;p&gt;Anyways. A recent guest on his show was Rachel Thomas &lt;a href=&quot;https://soundcloud.com/twiml/twiml-talk-138-practical-deep-learning-with-rachel-thomas&quot;&gt;(Episode #138)&lt;/a&gt; - a university professor at the University of San Francisco and co-founder of &lt;a href=&quot;http://www.fast.ai&quot;&gt;fast.ai&lt;/a&gt;. To cite their company’s mission statement: “Fast.ai is dedicated to making the power of deep learning accessible to all. Deep learning is dramatically improving medicine, education, agriculture, transport and many other fields, with the greatest potential impact in the developing world. For its full potential to be met, the technology needs to be much easier to use, more reliable, and more intuitive than it is today.” (see also a &lt;a href=&quot;http://www.fast.ai/2016/10/07/fastai-launch/&quot;&gt;blog post&lt;/a&gt; of them explaining why they do what they do).&lt;/p&gt;

&lt;p&gt;So, in essence they teach state-of-the-art deep learning (DL) for the common (wo)man by providing a &lt;a href=&quot;http://course.fast.ai&quot;&gt;free MOOC&lt;/a&gt; on their site. What’s quite unique about it is that they decided to use a top-down approach. They basically provide almost no introduction to the basics of the field but have the students train their first deep convolutional neural network with just three lines of code and go from there… Later, they peel back layer by layer and expose more and more details about the underlying fundamentals that make the machinery work. The idea is that this supposedly keeps students engaged and helps to facilitate different learning paces and styles. To make all this happen they designed a high-level wrapper that sits on top of the deep learning framework &lt;a href=&quot;https://pytorch.org&quot;&gt;PyTorch&lt;/a&gt; - apparently in a similar way as &lt;a href=&quot;https://keras.io&quot;&gt;Keras&lt;/a&gt; provides a more gentle interface to &lt;a href=&quot;https://github.com/tensorflow/tensorflow&quot;&gt;TensorFlow&lt;/a&gt;. As far as I understood, it was originally designed as an aid for their courses but matured into a rather stable general-purpose DL library that might also be used for production…&lt;/p&gt;

&lt;p&gt;Given that I currently teach a university course (&lt;em&gt;Remote Sensing of Global Ecology (using R)&lt;/em&gt;) that is structured on the conventional bottom-up approach, I was intrigued by this style of teaching.&lt;/p&gt;

&lt;h3 id=&quot;joining-the-group&quot;&gt;Joining the group&lt;/h3&gt;

&lt;p&gt;Sam initiated a &lt;a href=&quot;https://twimlai.com/twiml-x-fast-ai/&quot;&gt;study group&lt;/a&gt; shortly after the fast.ai interview. I thought I’d also check this out and so I joined to keep me motivated and here we are.&lt;/p&gt;

&lt;p&gt;Now, in the first lesson students build the (unavoidable?) cat classifier (here it’s a cat vs. dog classifier). The whole model requires only three lines of Python code! Obviously this only works since a lot of hyper-parameters are hidden by the basic interface, a lot of choices are made by the fast.ai package by default, and the data is provided. Furthermore, the example uses &lt;a href=&quot;http://cs231n.github.io/transfer-learning/&quot;&gt;transfer learning&lt;/a&gt; and thus really piggybacks on a large existing model that was trained on the massive &lt;a href=&quot;http://www.image-net.org&quot;&gt;ImageNet dataset&lt;/a&gt;. Nevertheless it really is quite amazing that you can get things off the ground with this little code.&lt;/p&gt;

&lt;h2 id=&quot;the-project&quot;&gt;The project&lt;/h2&gt;

&lt;p&gt;As an exercise students are asked to come up with their own (binary) classification problems and so I thought I’d build a guitar classifier net. To start things off I decided to go for arguably two of the most iconic electric guitars: the &lt;a href=&quot;https://en.wikipedia.org/wiki/Gibson_Les_Paul&quot;&gt;Gibson (R.I.P) Les Paul&lt;/a&gt; and the &lt;a href=&quot;https://en.wikipedia.org/wiki/Fender_Stratocaster&quot;&gt;Fender Stratocaster&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now, a couple of things first. While the two instruments feature very characteristic body shapes, headstocks and geometries, guitars tend to come in all kinds of designs and configurations. So I’d imagine that this task is at least as challenging as differentiating between a fluffy cat and a (less so?) dog - if not much more.&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/guitars_are_cooler_than_cats.jpg&quot; alt=&quot;Guitars trump Cats&quot; /&gt;
  
    &lt;figcaption&gt;
      Well, obviously!

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;h3 id=&quot;connecting-to-notebook-server&quot;&gt;Connecting to notebook server&lt;/h3&gt;

&lt;p&gt;Since I work on Macs (and none features a decent Nvidia GPU) I ssh into a GPU-equipped server that runs the &lt;a href=&quot;http://jupyter.org&quot;&gt;jupyter notebook&lt;/a&gt; with GPU acceleration (it also features an anaconda installation and has the fast.ai and other required python libraries installed, I’m not discussing the setup here).&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# activate the anaconda environment&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;source &lt;/span&gt;activate fastai

&lt;span class=&quot;c&quot;&gt;# start a jupyter instance &lt;/span&gt;
jupyter lab &lt;span class=&quot;nt&quot;&gt;--port&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;9000 &lt;span class=&quot;nt&quot;&gt;--no-browser&lt;/span&gt; &amp;amp;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On my machine I open a ssh tunnel and bind the local port 8888 to port 9000 of the remote machine:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# connect ports&lt;/span&gt;
ssh &lt;span class=&quot;nt&quot;&gt;-N&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-f&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-L&lt;/span&gt; 8888:localhost:9000 cwerner@MY_GPU_SERVER
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now I simply access the notebook via my browser at https://localhost:8888 (I also set it to be password protected).&lt;/p&gt;

&lt;h3 id=&quot;data-setup&quot;&gt;Data setup&lt;/h3&gt;

&lt;p&gt;“First, there was data…” Well, there needs to be anyways. So one convenient way of getting hold of image data is to use &lt;a href=&quot;https://images.google.com&quot;&gt;Google Image Search&lt;/a&gt;. There is a neat &lt;a href=&quot;https://chrome.google.com/webstore/detail/fatkun-batch-download-ima/nnjjahlikiabnchcpehcpkdeckfgnohf?hl=en&quot;&gt;Chrome extension&lt;/a&gt; for image harvesting - but I find that Chrome still plain sucks on a Mac so I went for another python library called &lt;a href=&quot;https://github.com/hardikvasa/google-images-download&quot;&gt;google_images_download&lt;/a&gt; that does the same job.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# install google image downloader and pull images&lt;/span&gt;
pip install google_images_download

&lt;span class=&quot;c&quot;&gt;# get two batches of 1000 images (Gibson Les Pauls, Fender Stratocasters)&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# I had to specify the location of chromedriver, too&lt;/span&gt;
googleimagesdownload &lt;span class=&quot;nt&quot;&gt;-k&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;gibson les paul&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-pr&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;gibson_lp&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-th&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; gibson_lp &lt;span class=&quot;nt&quot;&gt;-l&lt;/span&gt; 1000 &lt;span class=&quot;nt&quot;&gt;--chromedriver&lt;/span&gt; /usr/local/bin/chromedriver
googleimagesdownload &lt;span class=&quot;nt&quot;&gt;-k&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;fender stratocaster&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-pr&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;fender_strat&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-th&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; fender_strat &lt;span class=&quot;nt&quot;&gt;-l&lt;/span&gt; 1000 &lt;span class=&quot;nt&quot;&gt;--chromedriver&lt;/span&gt; /usr/local/bin/chromedriver
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For some reason the script only managed to pull ~500 images (which should still be enough for the exercise), but to get a better dataset I found that I had to manually weed through the files and delete files with missing suffixes, wrong classification, or only showing guitar parts. A neat way to do this on the Mac is simply to use the quick view in Finder, scroll through the directory and delete as necessary. I also only used the thumbnail images as the script currently only uses 224x224px images anyways.&lt;/p&gt;

&lt;p&gt;Finally, I wrote some quick lines of code that created a file structure suitable for the fast.ai &lt;em&gt;ImageClassifierData&lt;/em&gt; object.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;glob&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;math&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;random&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;shutil&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# path structure: &lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# guitars_small/gibson_lp/gibson_lp.1.imagedescription.jpg&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;gibson_files&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'guitars_small/gibson*'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;fender_files&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'guitars_small/fender*'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dir&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gibson_files&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fender_files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# create fast.ai data folder structure&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dname&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;basename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;npath1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'guitars_small'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'train'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dname&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;npath2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'guitars_small'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'valid'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dname&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;makedirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;npath1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exist_ok&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;makedirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;npath2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exist_ok&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;pass&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# split files into train and validation sets (80/20)   &lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;all_files&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'*.jpg'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;idx&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;all_files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shuffle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;idx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cut&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;math&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ceil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.8&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;idx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;train_files&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;all_files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;idx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cut&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;valid_files&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;all_files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;idx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cut&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:]]&lt;/span&gt;
		
    &lt;span class=&quot;c&quot;&gt;# copy files into appropriate folders&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;train_files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;shutil&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;copy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;npath1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
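
&lt;p&gt;The shuffle-and-cut logic above can also be written as a small standalone function. This is only a sketch:
the function name &lt;em&gt;split_files&lt;/em&gt;, the &lt;em&gt;seed&lt;/em&gt; parameter and the file names are assumptions for
illustration and not part of the original script. Fixing the seed makes the 80/20 split reproducible between runs.&lt;/p&gt;

```python
import math
import random


def split_files(files, train_fraction=0.8, seed=None):
    """Shuffle the files and cut them into a training and a validation list."""
    rng = random.Random(seed)      # hypothetical seed: makes the split repeatable
    shuffled = list(files)         # copy, so the caller's list stays untouched
    rng.shuffle(shuffled)
    cut = math.ceil(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# made-up file names, only for illustration
files = ["guitar_{:03d}.jpg".format(i) for i in range(10)]
train_files, valid_files = split_files(files, seed=42)
print(len(train_files), len(valid_files))
```

&lt;p&gt;Every file ends up in exactly one of the two lists, so no image can leak from the training into the validation set.&lt;/p&gt;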

&lt;h3 id=&quot;the-model&quot;&gt;The model&lt;/h3&gt;

&lt;p&gt;Once the files are copied to the remote server we can create the model.
First let’s import libraries and define some defaults.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;torch&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;fastai.transforms&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;fastai.conv_learner&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;fastai.model&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;fastai.dataset&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;fastai.sgdr&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;fastai.plots&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# some constants&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;PATH&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;data/guitars_small&quot;&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# the data path&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;224&lt;/span&gt;                    &lt;span class=&quot;c&quot;&gt;# the image size &lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;                     &lt;span class=&quot;c&quot;&gt;# the batch size&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now we load a pre-trained model (&lt;a href=&quot;https://github.com/KaimingHe/deep-residual-networks&quot;&gt;ResNet-34&lt;/a&gt;)
and fit it to our dataset. The line containing &lt;em&gt;learn.fit()&lt;/em&gt; runs the model training for two epochs
with a learning rate of 0.01.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;arch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;resnet34&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ImageClassifierData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_paths&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PATH&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tfms&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tfms_from_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ConvLearner&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pretrained&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;precompute&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.01&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/jupyter_image1.png&quot; alt=&quot;First model run&quot; /&gt;
  
    &lt;figcaption&gt;
      The output of the first model training using our Les Paul vs. Strat dataset

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;This already gives us an accuracy of 92.9%. Pretty remarkable. Now, let’s see what the model
identifies correctly and where it fails. I’m using some helper functions from the course for this, which
I do not show in this post to save space; see the GitHub repository for the full code listing
(0 = Fender Strat, 1 = Gibson Les Paul).&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/jupyter_image2_correct_incorrect.jpg&quot; alt=&quot;First classification results&quot; /&gt;
  
    &lt;figcaption&gt;
      Some examples of correctly and incorrectly classified images

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;h3 id=&quot;improving-the-model&quot;&gt;Improving the model&lt;/h3&gt;

&lt;p&gt;In essence, the lesson suggests the following improvements to get even better results:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;adding data augmentation (varying the training data by scaling, flipping and tilting images; 
essentially adding more labeled data)&lt;/li&gt;
  &lt;li&gt;fine-tuning the model layers (unfreezing the early layers)&lt;/li&gt;
  &lt;li&gt;adding learning rate annealing&lt;/li&gt;
  &lt;li&gt;adding data augmentation at inference time&lt;/li&gt;
&lt;/ol&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# define data augmentation (we use transforms_top_down since guitars&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# can be depicted from all kinds of angles; the other choice would be&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# transforms_side_on)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;tfms&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tfms_from_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resnet34&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aug_tfms&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;transforms_top_down&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;max_zoom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# new data object with transforms&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ImageClassifierData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_paths&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PATH&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tfms&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tfms&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# start the training&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ConvLearner&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pretrained&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;precompute&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1e-2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# the learner was created with precompute=True, so data augmentation has no effect yet.&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# To enable it we need to switch precompute to False&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;precompute&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# we now add some more epochs of training (using stochastic gradient descent with restarts)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1e-2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cycle_len&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We now have a model where the last layer was trained while all previous layers are still
frozen at the original ImageNet weights. To give the model some wiggle room to fine-tune 
the network to our classification domain we can unfreeze the earlier layers, too, and provide
a separate learning rate for the early, central and late layers in the model. The idea is to use
small learning rates for the early layers, as they should be rather generic, and larger learning
rates for the late layers, as they characterise more specific concepts.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unfreeze&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;lr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1e-4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1e-3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1e-2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cycle_len&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cycle_mult&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
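
&lt;p&gt;To make the &lt;em&gt;cycle_len&lt;/em&gt; and &lt;em&gt;cycle_mult&lt;/em&gt; arguments more tangible, here is a minimal
sketch of a restart schedule. It is not the fastai implementation: the two helper functions are hypothetical, and
I assume that each cycle is &lt;em&gt;cycle_mult&lt;/em&gt; times as long as the previous one and that the learning rate
follows a cosine curve from its maximum down to zero within a cycle before it is reset.&lt;/p&gt;

```python
import math


def cycle_lengths(n_cycles, cycle_len=1, cycle_mult=1):
    """Length in epochs of each restart cycle (hypothetical helper)."""
    return [cycle_len * cycle_mult ** i for i in range(n_cycles)]


def cosine_annealed_lr(lr_max, progress):
    """Learning rate at a given progress (0.0 to 1.0) through one cycle."""
    return lr_max * 0.5 * (1.0 + math.cos(math.pi * progress))


# three cycles with cycle_len=1 and cycle_mult=2, as in the call above
lengths = cycle_lengths(3, cycle_len=1, cycle_mult=2)
print(lengths, sum(lengths))   # cycle lengths and total number of epochs

# learning rate at the start, middle and end of a cycle
for progress in (0.0, 0.5, 1.0):
    print(progress, cosine_annealed_lr(1e-2, progress))
```

&lt;p&gt;Under these assumptions the three cycles last one, two and four epochs, so the call above trains for seven
epochs in total, with a restart at the beginning of each cycle.&lt;/p&gt;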

&lt;p&gt;&lt;strong&gt;Test Time Augmentation (TTA)&lt;/strong&gt; was something I had never heard of, but apparently it can further
improve results quite a bit. For each image, TTA makes predictions on the original plus four augmented
versions and, as in the code below, averages the five predictions, which makes the final prediction more robust.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;log_preds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TTA&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;probs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;log_preds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;accuracy_np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;probs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
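
&lt;p&gt;The averaging step in the snippet above can be reproduced with plain NumPy. The numbers below are made up for
illustration, and I assume that the first axis of &lt;em&gt;log_preds&lt;/em&gt; indexes the five versions of each image
(the original plus four augmented ones), followed by the image and the class axis.&lt;/p&gt;

```python
import numpy as np

# made-up class probabilities with shape (n_versions, n_images, n_classes):
# 5 versions (original + 4 augmented) of 2 images, 2 classes
probs_per_version = np.array([
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.8, 0.2], [0.6, 0.4]],
    [[0.7, 0.3], [0.3, 0.7]],
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.5, 0.5]],
])
log_preds = np.log(probs_per_version)       # stand-in for the TTA output
y = np.array([0, 1])                        # made-up true labels

probs = np.mean(np.exp(log_preds), axis=0)  # average over the 5 versions
preds = np.argmax(probs, axis=1)            # most probable class per image
accuracy = float(np.mean(preds == y))
print(probs, preds, accuracy)
```

&lt;p&gt;Note that the second image is misclassified by one of its five versions, but the averaged probabilities still
favour the correct class.&lt;/p&gt;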

&lt;p&gt;In my setup this final model now achieves an accuracy of 95.4%. Given the diverse input data and
 relatively small sample set I find that quite amazing.&lt;/p&gt;

&lt;h3 id=&quot;some-evaluation&quot;&gt;Some evaluation&lt;/h3&gt;

&lt;p&gt;First, let’s look at the confusion matrix. This illustrates the accuracy of the model for the
individual classes (the diagonal holds the correct predictions for each class). In total there 
were 102 Strat and 94 Les Paul images in the validation dataset (the split was 80/20 of the
total images).&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/jupyter_image3_confusion_matrix.png&quot; alt=&quot;Confusion matrix&quot; /&gt;
  
    &lt;figcaption&gt;
      Confusion matrix of model predictions

    &lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;As we can see, the model incorrectly predicted three Les Pauls to be Strats and six Strats to be
Les Pauls. Now let’s look again at some images (top: most confident Fenders, middle: most confident Gibsons,
bottom: most uncertain images).&lt;/p&gt;

&lt;figure class=&quot;&quot;&gt;
  &lt;img src=&quot;/assets/images/posts/jupyter_image4_finalimages.jpg&quot; alt=&quot;Final images&quot; /&gt;
  
    &lt;figcaption&gt;
      Best classification results and results where the model is most uncertain about the class.

    &lt;/figcaption&gt;&lt;/figure&gt;
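
&lt;p&gt;As a quick sanity check, the overall accuracy can be recomputed from the confusion matrix counts given above.
The orientation of the matrix (rows = actual class, columns = predicted class) is my assumption for this sketch;
the counts themselves are the ones stated in the text.&lt;/p&gt;

```python
import numpy as np

# rows = actual class, columns = predicted class (0 = Strat, 1 = Les Paul)
confusion = np.array([
    [102 - 6, 6],   # 102 Strats, six of them predicted as Les Pauls
    [3, 94 - 3],    # 94 Les Pauls, three of them predicted as Strats
])

accuracy = np.trace(confusion) / confusion.sum()               # correct / all
per_class_recall = np.diag(confusion) / confusion.sum(axis=1)  # per actual class
print(round(float(accuracy), 3))
print(per_class_recall.round(3))
```

&lt;p&gt;With 187 of 196 images on the diagonal this gives 95.4%, which matches the accuracy reported after TTA.&lt;/p&gt;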

&lt;h3 id=&quot;next-steps&quot;&gt;Next steps&lt;/h3&gt;

&lt;p&gt;Now, while the results are not as great as those of the dog vs. cat classifier in the fast.ai lesson, which was
trained on a much larger dataset, I still believe the results are quite neat. I am currently thinking about the following
steps for further experiments:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Get more images: the number of images is still small (400 training, 100 validation images per class)&lt;/li&gt;
  &lt;li&gt;Try the same exercise with the full resolution images (I have the feeling that the thumbnails are too 
small for the network and the &lt;em&gt;sz&lt;/em&gt; setting)&lt;/li&gt;
  &lt;li&gt;Filter the images further by removing those that a) are too small, b) show multiple guitars, c) show too little
 of the guitar (some only had the fingerboard or the headstock) or d) are backside shots.&lt;/li&gt;
  &lt;li&gt;I also want to extend this to multi-class classification: Gibson Les Paul, SG, Firebird, Explorer and Fender
 Stratocaster, Jaguar, Mustang and Telecaster. This should be interesting!&lt;/li&gt;
&lt;/ol&gt;</content><author><name>Christian Werner</name></author><category term="CNN" /><category term="fastai" /><category term="Python" /><category term="Guitar" /><category term="Deep Learning" /><summary type="html">How to build a convolutional neural net that can discriminate between guitar models.</summary></entry><entry><title type="html">Getting things started for 2018…</title><link href="https://www.christianwerner.net/misc/Hello-World/" rel="alternate" type="text/html" title="Getting things started for 2018..." /><published>2018-01-02T00:00:00+01:00</published><updated>2018-01-02T00:00:00+01:00</updated><id>https://www.christianwerner.net/misc/Hello-World</id><content type="html" xml:base="https://www.christianwerner.net/misc/Hello-World/">&lt;p&gt;New years’ resolutions everyone.&lt;/p&gt;

&lt;p&gt;After a couple of trials and failed attempts this year will be different! A new blog (Jekyll), a new style (Minimal Mistakes) and new content. &lt;strong&gt;This should be good.&lt;/strong&gt; Expect loads of modifications in the next weeks…&lt;/p&gt;</content><author><name>Christian Werner</name></author><summary type="html">New years’ resolutions everyone.</summary></entry></feed>