Bug bash instructions
To install the package directly from GitHub, use:
# Install devtools
if(!require("devtools")) install.packages("devtools")
devtools::install_github("RevolutionAnalytics/AzureML")
To use any of the functions, you need your AzureML credentials. To find these, read the vignette Getting Started with the AzureML Package.
You pass these credentials to the function workspace(), which sets them up in R and allows you to use all of the other functions in the package.
The easiest way to use the workspace() function is to create a JSON file at ~/.azureml/settings.json.
Copy the following and replace the placeholders with your own credentials:
{"workspace":{
"id" : "Add your id here",
"authorization_token" : "Add your authorisation token here"
}}
Then save the file at ~/.azureml/settings.json. On Windows, save the file in C:\Users\<yourname>\Documents\.azureml
If you have any doubt as to the location of ~/, try:
> path.expand("~/")
[1] "C:/Users/adevries/Documents/"
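If you prefer to create the settings file from R rather than by hand, the following is a minimal sketch using only base R. The helper name write_azureml_settings and its arguments are hypothetical and not part of the AzureML package. It assumes the id and token contain no quotes or backslashes that would need JSON escaping.

```r
# Hypothetical helper (not part of the AzureML package): writes the
# settings file in the format shown above, using base R only.
# Assumes id and token contain no quotes or backslashes.
write_azureml_settings <- function(id, token,
                                   dir = file.path(path.expand("~"), ".azureml")) {
  if (!dir.exists(dir)) dir.create(dir, recursive = TRUE)
  path <- file.path(dir, "settings.json")
  json <- sprintf(
    '{"workspace":{\n  "id": "%s",\n  "authorization_token": "%s"\n}}',
    id, token
  )
  writeLines(json, path)
  invisible(path)
}

# Example: write to a temporary directory so nothing real is overwritten
tmp <- file.path(tempdir(), ".azureml")
settings_path <- write_azureml_settings("my-id", "my-token", dir = tmp)
file.exists(settings_path)
readLines(settings_path)
```

Calling the helper without the dir argument would write to the .azureml folder under whatever path.expand("~") returns on your machine.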
Try:
- Read the help for ?workspace, ?datasets and ?download.datasets
- Create a workspace object
- Get a listing of available datasets in your workspace
- Download a specific dataset from AzureML as a data frame
ws <- workspace()
d <- datasets(ws)
dat <- download.datasets(d, "Movie Ratings")
head(dat)
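When a workspace holds many datasets, it can help to search the listing by name before downloading. The sketch below assumes the listing behaves like a data frame with a Name column. The helper find_datasets is hypothetical, and the listing used here is a stand-in built by hand so the example runs without credentials.

```r
# Hypothetical helper: filter a dataset listing by name.
# Assumes the listing is a data frame with a "Name" column.
find_datasets <- function(listing, pattern) {
  stopifnot(is.data.frame(listing), "Name" %in% names(listing))
  listing[grepl(pattern, listing$Name, ignore.case = TRUE), , drop = FALSE]
}

# Stand-in listing for illustration; with real credentials use datasets(ws)
listing <- data.frame(
  Name = c("Movie Ratings", "Example A", "Example Movies B"),
  stringsAsFactors = FALSE
)
matches <- find_datasets(listing, "movie")
matches$Name
```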
You can publish almost any R function as a web service in AzureML, subject to some input/output constraints.
- Read the help for ?publishWebService
- Try some of the examples in ?publishWebService or ?consume
Here is a more complicated example showing how to create a function that takes ordered factors as input:
# Train a model using diamonds in ggplot2
library(rpart)
data(diamonds, package="ggplot2")
set.seed(1)
test_idx = sample.int(nrow(diamonds), 30000)
train_idx = sample(setdiff(seq(1, nrow(diamonds)), test_idx), 500)
train <- diamonds[train_idx, ]
test <- diamonds[test_idx, ]
model <- glm(price ~ carat + clarity + color + cut - 1, data = train,
             family = Gamma(link = "log"))
diamondLevels <- diamonds[1, ]
# The model works reasonably well, except for some outliers
plot(exp(predict(model, test)) ~ test$price)
# Create a function to publish. The function takes care of converting characters correctly to factors
predictDiamonds <- function(x){
  x$cut <- factor(x$cut,
                  levels = levels(diamondLevels$cut), ordered = TRUE)
  x$clarity <- factor(x$clarity,
                      levels = levels(diamondLevels$clarity), ordered = TRUE)
  x$color <- factor(x$color,
                    levels = levels(diamondLevels$color), ordered = TRUE)
  exp(predict(model, newdata = x))
}
# Publish the service
ws <- workspace()
ep <- publishWebService(ws, fun = predictDiamonds, name = "diamonds",
                        inputSchema = test)
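The conversion step inside predictDiamonds can be tried on its own, without ggplot2 or a workspace. The following sketch writes out the levels of the cut column by hand and applies the same factor() pattern to plain character input. The value "Excellent" is a deliberately invalid example, included to show what happens to input that matches no known level.

```r
# Levels of diamonds$cut in ggplot2, lowest to highest
cut_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")

# Character input standing in for data that has lost its factor structure
incoming <- data.frame(cut = c("Ideal", "Good", "Excellent"),
                       stringsAsFactors = FALSE)

# Same conversion pattern as in predictDiamonds
incoming$cut <- factor(incoming$cut, levels = cut_levels, ordered = TRUE)

is.ordered(incoming$cut)            # the column is now an ordered factor
incoming$cut[1] > incoming$cut[2]   # comparisons respect the level order
is.na(incoming$cut)                 # values outside the known levels become NA
```

Because unknown values silently become NA, it may be worth checking for NA in the converted columns before calling predict().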
Now that you've published an API, you can send data for scoring by using the function consume().
- Read the help for ?consume
- Try some of the examples
To consume the model you published in the previous section, try:
results <- consume(ep, test)$ans
plot(results ~ test$price)
# Compare the AzureML results with locally computed ones:
crossprod(predictDiamonds(test) - results)
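Applied to a single vector, crossprod() returns a 1 x 1 matrix holding the sum of squared differences, so a value near zero means the two sets of predictions agree. As a rough sketch, the comparison can be wrapped in a small helper that also reports the largest absolute difference. The name compare_predictions and the default tolerance are assumptions made for illustration, and the numbers below are stand-in values, not real service output.

```r
# Hypothetical helper: summarise agreement between two numeric vectors
compare_predictions <- function(local, remote, tolerance = 1e-8) {
  stopifnot(length(local) == length(remote))
  local  <- as.numeric(local)
  remote <- as.numeric(remote)
  diff <- local - remote
  list(
    sum_sq    = drop(crossprod(diff)),  # same quantity as crossprod() above
    max_abs   = max(abs(diff)),         # largest single disagreement
    all_close = isTRUE(all.equal(local, remote, tolerance = tolerance))
  )
}

# Stand-in values for illustration only
local  <- c(100, 250.5, 4000)
remote <- c(100, 250.5, 4000 + 1e-9)
cmp <- compare_predictions(local, remote)
cmp$all_close
```

With the published service, the call would be compare_predictions(predictDiamonds(test), results).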
If you wish, delete this example web service when you're done:
deleteWebService(ws, "diamonds")
To report issues or problems, use the issue log or send me a direct message: