Skip to content
This repository has been archived by the owner on Jun 7, 2023. It is now read-only.

Error in rforcecom.query when querying many records #37

Closed
nelsonauner opened this issue Apr 12, 2016 · 8 comments
Closed

Error in rforcecom.query when querying many records #37

nelsonauner opened this issue Apr 12, 2016 · 8 comments

Comments

@nelsonauner
Copy link

This bug is (possibly) related to issue #23 and the fix in PR #31.

I was able to reproduce the error in issue #23 when trying to query all opportunities from salesforce.
I noticed that there was a PR with a fix, installed the latest version from github, and then saw a new error when trying to query all opportunities.

> sessionInfo()
     .... RForcecom_0.9

> SF_SESSION <- rforcecom.login(username = getOption("sf_cred")["user_email"],
                                password = paste0(getOption("sf_cred")["password"], 
                                                getOption("sf_cred")["token"]),
                                apiVersion = "36.0")

># get all opportunities
> get_salesforce_query <- "
    SELECT Account.name, .....
    FROM Opportunity"

> salesforce_merchants <- rforcecom.query(SF_SESSION, get_salesforce_query) 

#error here!
Error in t.default(unlist(x[setdiff(seq.int(length(x)), c(list_col, attr_col))])) : 
  argument is not a matrix

># get only some opportunities
> get_salesforce_query <- "
    SELECT Account.name, .....
    FROM Opportunity
    WHERE .....'"
> salesforce_merchants <- rforcecom.query(sf_connection, get_salesforce_query)
> #no errors
@nelsonauner
Copy link
Author

Not sure how to make an actual reproducible example since our salesforce data is proprietary...

@StevenMMortimer
Copy link
Contributor

I've tried a variety of queries which all seem to be working. Anything else unusual about your query or the objects it's referencing?

These work for me:

query <- "select Account.Name, Id from Opportunity"
query <- "select Account.Name, Id, Account.Id from Opportunity"
query <- "select Account.Name, Id from Opportunity where StageName='Prospecting'"

Because every instance is different, you may need to do a little debugging yourself and let us know what you find. It's hard to replicate all possible scenarios.

@nelsonauner
Copy link
Author

I'll go ahead and try to clearly define what queries cause this error, but first wanted to say a huge thank you 🙇 🎈 for responding so quickly and trying to help. Will post again once I've narrowed this down - sounds like its an odd error on my end

@jmeb
Copy link

jmeb commented Aug 29, 2016

You're not alone... just got this error for the first time on code that used to work. Not sure if it is RForcecom changing something, APIversions, or my instance (objects have been stable). Example code that fails:

SELECT Status__c, Parent__r.Date__c, Parent__r.Case__c, Parent__r.RecordType.Name, Parent__r.GrandParent__r.Name FROM Outcome__c WHERE Parent__r.RecordType.Name in ("Possible","AlsoPossible")

Wondering if this failure has something to do with traversing up multiple parents and the listing that may accompany it?

@StevenMMortimer
Copy link
Contributor

@jmeb You're probably right that it's an issue with traversing parent hierarchies. I wrote the function RForcecom:::query_parser() that's used when trying to pull out the data.frame structure from returned XML. It's hard to anticipate all the query varieties and instance configurations for a generic parsing function. If you're willing to run snippets of the functions line-by-line to see what's going on, then you can run RForcecom.query() up until the parsing function and see the returned XML and how it moves through the parsing function:

rforcecom.query
function (session, soqlQuery, queryAll = FALSE) 
{
    if (queryAll) {
        endpointPath <- rforcecom.api.getSoqlAllEndpoint(session["apiVersion"])
    }
    else {
        endpointPath <- rforcecom.api.getSoqlEndpoint(session["apiVersion"])
    }
    URL <- paste(session["instanceURL"], endpointPath, URLencode(soqlQuery), 
        sep = "")
    OAuthString <- paste("Bearer", session["sessionID"])
    httpHeader <- httr::add_headers(Authorization = OAuthString, 
        Accept = "application/xml")
    res <- httr::GET(url = URL, config = httpHeader)
    res.content = httr::content(res, as = "text", encoding = "UTF-8")
    if (exists("rforcecom.debug") && rforcecom.debug) {
        message(URL)
    }
    if (exists("rforcecom.debug") && rforcecom.debug) {
        message(res.content)
    }
    x.root <- xmlRoot(xmlParse(res.content, asText = T))
    errorcode <- NA
    errormessage <- NA
    try(errorcode <- iconv(xmlValue(x.root[["Error"]][["errorCode"]]), 
        from = "UTF-8", to = ""), TRUE)
    try(errormessage <- iconv(xmlValue(x.root[["Error"]][["message"]]), 
        from = "UTF-8", to = ""), TRUE)
    if (!is.na(errorcode) && !is.na(errormessage)) {
        stop(paste(errorcode, errormessage, sep = ": "))
    }
    resultset <- query_parser(x.root)

@vicnett
Copy link

vicnett commented Oct 26, 2016

@ReportMort Hi! I ran into the same issue and this page is the most helpful result I got from Google, so I went ahead and followed your advice by running the RForcecom code line-by-line.

A bit of context: I am working on refactoring a script to use the latest version of RForcecom. The person who wrote the script had decided to keep version 0.7 mostly because of this very issue, but now we need to finally move to the latest version, for various reasons.

I'm still rather new to R, but I'll do my best to provide helpful details. Let me know if I'm doing anything wrong.

From the poking around that I did, it seems that the issue (at least in my case) is that the XML from the query result is improperly parsed. Only about the last 1/4 of the data remains, and I'm guessing the argument is not a matrix error is caused by that improperly-parsed XML.

My best guess is that the switch from xmlTreeParse() to xmlParse() is to blame (again, at least in my case), based on the fact that the data returned by Salesforce is complete right up to the point where xmlParse() is called.

I tested this theory by running
xmlRoot(xmlTreeParse(res.content, asText = T))
Instead of
xmlRoot(xmlParse(res.content, asText = T))
which returned the entire data set as opposed to just the last 1/4. Running the rest of the code from that previous version of rforcecom.query() returned my complete data set in a data.frame as expected (obviously, since at this point I'm just running the old code which I already knew worked fine).

Unfortunately this is a work project and I can't disclose the data. However, I can provide the query itself:

SELECT Opportunity__r.Name, URL__c FROM Additional_URL__c

The resulting data.frame is 487 observations of 2 variables (so, nothing monstruous). I glanced at the raw XML before the parsing issue and did not notice anything strange about it around the "cutoff point," so I wouldn't blame the XML returned by Salesforce.

Again, I'm still new so I'm not sure what other info I can provide. Let me know if I can do anything else to help troubleshoot.

@vicnett
Copy link

vicnett commented Nov 8, 2016

Update: While working on another script where I was having the same issue, I just discovered that if the query returns at least four columns, everything works fine. When the query returns fewer than four columns, I get the "Argument is not a matrix" error.

Pulling additional columns of data until I had at least four fixed the issue in both scripts... So, for now, I just have to pull four or more columns and drop them in R. Go figure...

@BriBecker
Copy link

This saved me @vicnett !!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants