---
title: "Custom Image Classification"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Custom Image Classification}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = FALSE)
```

## Intro

The [fastai](https://github.com/fastai/fastai) library simplifies training fast and accurate neural nets using modern best practices. See the fastai website to get started. The library is based on research into deep learning best practices undertaken at `fast.ai`, and includes "out of the box" support for `vision`, `text`, `tabular`, and `collab` (collaborative filtering) models.

## Preparation

With the `rvest` package we can search for and scrape images of any category. In this example we will classify four cat breeds:

```{r}
# cat categories: https://www.purina.com/cats/cat-breeds
f_n = 'cats'

# create the download folder if it does not exist yet
if(!dir.exists(f_n)) {
  dir.create(f_n)
}
```

A function to download cat images:

```{r}
library(rvest)

download_pet = function(name, dest) {
  # URL-encode whitespace in the search query
  query = gsub('\\s', '%20', name)
  search <- read_html(paste("https://www.google.com/search?site=&tbm=isch&q", query, sep = "="))

  # collect the image sources; the first <img> is dropped
  urls <- search %>% html_nodes("img") %>% html_attr("src") %>% .[-1]

  # build a file-system friendly prefix, e.g. 'Chartreux_Cat_Breed'
  fixed_name = gsub('\\s|[[:punct:]]', '_', name)

  # seq_along() is safe when no image URLs are found
  for (i in seq_along(urls)) {
    # file name: <fixed_name>_<random number>.jpg
    file_name = paste0(fixed_name, '_', round(runif(1) * 10000), '.jpg')
    download.file(urls[i], destfile = file.path(dest, file_name), mode = 'wb')
  }
}
```

## Get data

Let's define the cat breeds:

```{r}
cat_names = c('Balinese-Javanese Cat Breed', 'Chartreux Cat Breed',
              'Norwegian Forest Cat Breed', 'Turkish Angora Cat Breed')
```

And iterate through the vector:

```{r}
for (i in seq_along(cat_names)) {
  download_pet(cat_names[i], f_n)
  print(paste('Done', cat_names[i]))
}
```

## Dataloaders

Load the libraries and import the dataset:

```{r}
library(fastai)
library(magrittr)

path = 'cats'
fnames = get_image_files(path)

fnames[1]
# cats/Turkish_Angora_Cat_Breed_8583.jpg
```

Create the dataloaders and view a batch:

```{r}
dls = ImageDataLoaders_from_name_re(
  path, fnames, pat='(.+)_\\d+.jpg$',
  item_tfms = Resize(size = 200), bs = 15,
  batch_tfms = list(aug_transforms(size = 224, min_scale = 0.75),
                    Normalize_from_stats( imagenet_stats() )
  )
)

dls %>% show_batch(dpi = 200)
```
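
The `pat` argument decides which part of each file name becomes the class label. The sketch below reproduces that extraction in base R, so the labels can be checked before building the dataloaders. It assumes that the pattern is applied to the file name only (not the full path) and that the first capture group is used as the label. The file names in `fnames_demo` are hypothetical examples in the shape produced by `download_pet()`.

```{r}
# Hypothetical file names, shaped like the output of download_pet()
fnames_demo <- c(
  "cats/Turkish_Angora_Cat_Breed_8583.jpg",
  "cats/Chartreux_Cat_Breed_12.jpg",
  "cats/Balinese_Javanese_Cat_Breed_407.jpg"
)

# Same pattern as the one passed to ImageDataLoaders_from_name_re()
pat <- "(.+)_\\d+.jpg$"

# Assumption: the label is the first capture group, matched on the file name only
labels_demo <- sub(pat, "\\1", basename(fnames_demo), perl = TRUE)

print(labels_demo)

# Number of images per label; a quick check for misspelled or missing classes
print(table(labels_demo))
```

If a label comes back as the whole file name, the pattern did not match that file, which usually means the file was not saved in the `<name>_<number>.jpg` form.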