You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When using the openxlsx package to modify an Excel workbook, if there is a worksheet with a name that includes an ampersand (&), it results in a corrupted file. The workbook becomes unopenable in Excel due to invalid XML generated by unescaped special characters in the sheet name.
Excel reports that the file is corrupted or cannot be opened.
Additional context
The issue seems to occurs because the ampersand (&) is a special character in XML and must be escaped as & when included in XML attribute values. The openxlsx package seems to insert the sheetName directly into the XML without escaping special characters, leading to malformed XML and a corrupted Excel file.
Proposed Solution
I couldn't fully understand the full Workbook-Class, but I tried around a little bit and when you add a function, which changes the special characters, the files work again e.g. :
# Function to escape special XML charactersxmlEscape<-function(txt) {
txt<- gsub("&", "&", txt, fixed=TRUE)
txt<- gsub("<", "<", txt, fixed=TRUE)
txt<- gsub(">", ">", txt, fixed=TRUE)
txt<- gsub("'", "'", txt, fixed=TRUE)
txt<- gsub('"', """, txt, fixed=TRUE)
return(txt)
}
# Example modification in addWorksheet functionWorkbook$methods(addWorksheet=function(sheetName, ...) {
# Escape the sheet name before inserting into XMLsheetName<- xmlEscape(sheetName)
# ... existing code ...
I added the xmlEscape-Function, to the Methods: addChartSheet, setSheetName, and addWorksheet and this made opening existing Excel-Files with a & in the Sheet-Name possible again. When creating a new Sheet, the & gets displayed as &. So maybe someone, who understands the package better, can find a solution for that. :)
Best wishes,
Josh
The text was updated successfully, but these errors were encountered:
Describe the bug
When using the
openxlsx
package to modify an Excel workbook, if there is a worksheet with a name that includes an ampersand (&
), it results in a corrupted file. The workbook becomes unopenable in Excel due to invalid XML generated by unescaped special characters in the sheet name.To Reproduce
A minimal reproducible example:
Steps:
test.xlsx
in Excel.Additional context
The issue seems to occurs because the ampersand (
&
) is a special character in XML and must be escaped as&
when included in XML attribute values. Theopenxlsx
package seems to insert thesheetName
directly into the XML without escaping special characters, leading to malformed XML and a corrupted Excel file.Proposed Solution
I couldn't fully understand the full Workbook-Class, but I tried around a little bit and when you add a function, which changes the special characters, the files work again e.g. :
I added the
xmlEscape
-Function, to the Methods:addChartSheet
,setSheetName
, andaddWorksheet
and this made opening existing Excel-Files with a&
in the Sheet-Name possible again. When creating a new Sheet, the&
gets displayed as&
. So maybe someone, who understands the package better, can find a solution for that. :)Best wishes,
Josh
The text was updated successfully, but these errors were encountered: