This repository has been archived by the owner on Oct 28, 2019. It is now read-only.

Bug bash instructions

B. W. Lewis edited this page Nov 18, 2015 · 18 revisions

Getting started

1. Installing the package

To install the package directly from GitHub, use:

# Install devtools
if(!require("devtools")) install.packages("devtools")
devtools::install_github("RevolutionAnalytics/AzureML")

2. Find your AzureML credentials

To use any of the functions, you need your AzureML credentials. To find these, read the vignette Getting Started with the AzureML Package.

You will need these credentials to use the function workspace(). This function sets up your credentials in R and allows you to use all of the other functions in the package.

3. Create a JSON file with your credentials

The easiest way to use the workspace() function is to create a JSON file at ~/.azureml/settings.json

Copy the following and modify with your own credentials:

{"workspace":{
"id"                  : "Add your id here",
"authorization_token" : "Add your authorisation token here"
}}

Then save the file at ~/.azureml/settings.json. On Windows, that is C:\Users\<yourname>\Documents\.azureml\settings.json

If you have any doubt as to the location of "~/", try:

> path.expand("~/")
[1] "C:/Users/adevries/Documents/"
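
Before calling workspace(), it can help to confirm that the settings file exists and parses as JSON with both expected fields. The following is a minimal sketch, assuming the jsonlite package is installed; check_settings() is a hypothetical helper written for this page, not a function of the AzureML package. The usage lines at the end exercise it on a temporary file so that your real credentials are not touched.

```r
library(jsonlite)

# Hypothetical helper (not part of AzureML): verify that a settings file
# exists, parses as JSON, and contains the two fields shown above.
check_settings <- function(path = "~/.azureml/settings.json") {
  path <- path.expand(path)
  if (!file.exists(path)) {
    stop("Settings file not found: ", path)
  }
  settings <- fromJSON(path)
  required <- c("id", "authorization_token")
  absent <- setdiff(required, names(settings$workspace))
  if (length(absent) > 0) {
    stop("Missing field(s) in settings file: ", paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

# Try the helper on a temporary file with placeholder values
tmp <- tempfile(fileext = ".json")
writeLines('{"workspace": {"id": "x", "authorization_token": "y"}}', tmp)
ok <- check_settings(tmp)
```

Once this works on the temporary file, calling check_settings() with no arguments checks the default location.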

Work with AzureML datasets

4. Download an AzureML dataset to your workspace

Try:

  • Read the help for ?workspace, ?datasets and ?download.datasets
  • Create a workspace object
  • Get a listing of available datasets in your workspace
  • Download a specific dataset from AzureML as a data frame
ws <- workspace()
d <- datasets(ws)
dat <- download.datasets(d, "Movie Ratings")
head(dat)

Publish an R script as an AzureML Web Service

5. Publish a Web Service

You can publish almost any R function as a web service in AzureML, subject to some input/output constraints.

  • Read the help for ?publish
  • Try some of the examples in ?publish or ?consume

Here is a more complicated example showing how to create a function that takes ordered factors as input:

# Train a model using diamonds in ggplot2

library(ggplot2)
library(rpart)

set.seed(1)
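# Set aside a pool of 30000 rows for testing; draw 500 training rows from the rest
test_idx  <- sample.int(nrow(diamonds), 30000)
train_idx <- sample(setdiff(seq(1, nrow(diamonds)), test_idx), 500)
train <- diamonds[train_idx, ]
# Score a random subset of 500 rows from the test pool
test  <- diamonds[sample(test_idx, 500), ]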

model <- glm(price ~ carat + clarity + color + cut - 1, data = train, 
             family = Gamma(link = "log"))

diamondLevels <- diamonds[1, ]

# The model works reasonably well, except for some outliers

plot(exp(predict(model, test)) ~ test$price)

# Create a function to publish. The function takes care of converting characters correctly to factors

predictDiamonds <- function(x){
  x$cut     <- factor(x$cut,     
                      levels = levels(diamondLevels$cut), ordered = TRUE)
  x$clarity <- factor(x$clarity, 
                      levels = levels(diamondLevels$clarity), ordered = TRUE)
  x$color   <- factor(x$color,   
                      levels = levels(diamondLevels$color), ordered = TRUE)
  exp(predict(model, newdata = x))
}

# Publish the service

ws <- workspace()
ep <- publishWebService(ws, fun = predictDiamonds, name = "diamonds",
                  inputSchema = test)
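
The factor conversion inside predictDiamonds() matters because a bare factor() call on character data sorts the levels alphabetically and drops the ordering the model was trained with. The sketch below illustrates this with base R only. The level order of cut is written out by hand to match ggplot2's diamonds data, and the incoming vector is a made-up example of values arriving as plain strings.

```r
# Level order of ggplot2::diamonds$cut, written out so the block needs no packages
cut_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")

# Made-up input as it might arrive: plain character strings
incoming <- c("Ideal", "Fair", "Premium")

# A bare factor() call keeps only the observed values, sorted alphabetically,
# and the result is unordered
naive <- factor(incoming)

# Supplying the full level set restores the original ordered factor
restored <- factor(incoming, levels = cut_levels, ordered = TRUE)

as.integer(restored)        # 5 1 4: positions within cut_levels
restored[1] > restored[2]   # TRUE: "Ideal" ranks above "Fair"
```

Keeping one row of the training data around, as diamondLevels does above, is a compact way to carry those level sets into the published function.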

6. Consume the model from R

Now that you've published an API, you can send data for scoring by using the function consume().

  • Read the help for ?consume
  • Try some of the examples

To consume the model you published in the previous section, try:

results <- consume(ep, test)$ans
plot(results ~ test$price)

# Compare the AzureML results with locally computed ones:
crossprod(predictDiamonds(test) - results)
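
For a single vector, crossprod() returns a 1 x 1 matrix holding the sum of its squared elements, so the comparison above is the sum of squared differences between local and remote predictions; a value near zero means the two agree. A minimal sketch with made-up numbers (local_pred and remote_pred are illustrative stand-ins, not real AzureML output):

```r
# Made-up stand-ins for locally computed and remotely scored predictions
local_pred  <- c(326.0, 410.5, 1520.25)
remote_pred <- local_pred + c(0, 1e-9, -1e-9)

# crossprod(v) computes t(v) %*% v: a 1 x 1 matrix with the sum of squares
ssd <- crossprod(local_pred - remote_pred)

# The same quantity written out explicitly
ssd_explicit <- sum((local_pred - remote_pred)^2)

# all.equal() is an alternative that compares within a numeric tolerance
agree <- isTRUE(all.equal(local_pred, remote_pred))
```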

If you wish, delete this example web service when you're done:

deleteWebService(ws, "diamonds")

Reporting issues and problems

To report issues or problems, use the issue log or send me a direct message:
