XLConnect on github

Mirai Solutions GmbH (http://www.mirai-solutions.com) is pleased to announce the availability of XLConnect on github. Whether you want to browse the code or simply want access to the latest development version of XLConnect, visit us on github.

XLConnect can be directly installed from github using the excellent devtools package:

require(devtools)

# Installs the master branch of XLConnect
# (= current development version)
install_github("xlconnect", username = "miraisolutions",
               ref = "master")

# Installs XLConnect 0.2-5
install_github("xlconnect", username = "miraisolutions",
               ref = "0.2-5")

XLConnect is a comprehensive and cross-platform R package for manipulating Microsoft Excel files from within R. XLConnect differs from other related R packages in that it is completely cross-platform and as such runs under Windows, Unix/Linux and Mac (32- and 64-bit). Moreover, it does not require any installation of Microsoft Excel or any other special drivers to be able to read & write Excel files. The only requirement is a recent version of a Java Runtime Environment (JRE).

Posted in R, XLConnect | 3 Comments

XLConnect 0.2-4

Mirai Solutions GmbH (http://www.mirai-solutions.com) is very pleased to announce the release of XLConnect 0.2-4, which is available from CRAN.

This newest release comes along with a number of new features:

  • Ability to read cached cell values. There is a new argument useCachedValues to methods readWorksheet and readNamedRegion. This is especially useful for cells with complex formulas that XLConnect (actually the underlying Apache POI library) cannot evaluate and so far would have resulted in NAs being returned. With this new option, XLConnect can be instructed to read the cached cell values instead of evaluating the cells – which should also result in improved runtime performance.
  • There are new arguments keep and drop to readWorksheet and readNamedRegion which allow filtering of columns before the data is imported to R. This can result in much better performance compared to importing the full data to R and then doing the subsetting in R.
  • There is a new vignette called XLConnect for the Impatient which provides a one page overview of XLConnect’s read/write methods – to get new users started quickly.
  • The bounding box detection algorithm for readWorksheet has been reworked to better deal with leading and trailing rows and columns of blank (empty) cells. In addition, new arguments autofitRow and autofitCol have been introduced that allow removal of leading and trailing rows and columns even in case boundaries have been specified by the user. This is useful in situations where the data is expected within certain given boundaries but the exact location is unknown.
  • String conversions when reading in data from Excel now respect the data format of the cells.
  • Support for numeric missing value identifiers has been added. See method setMissingValue for more information.
  • XLConnect 0.2-4 comes along with Apache POI 3.9 (see package dependency XLConnectJars)

A summary of these news can be found in the NEWS file on CRAN.

To see some of the new features in action, please find below some examples:

# Load the XLConnect package
require(XLConnect)

# Download sample data
download.file("http://miraisolutions.files.wordpress.com/2013/01/example.xlsx", destfile = "example.xlsx")

# Load the workbook
wb = loadWorkbook("example.xlsx")

# Read the data on the sheet named "data"
data = readWorksheet(wb, sheet = "data")
# Note all the NA values due to the cells with text containing
# 'some stuff'
print(head(data))

# Let's assume we only roughly know the data is located within some
# given boundaries (exact location would be startRow = 10,
# startCol = 8, endRow = 42, endCol = 20)
data = readWorksheet(wb, sheet = "data", startRow = 7, startCol = 7,
                     endRow = 46, endCol = 23)
# However, XLConnect is capable to find the actual data region
print(head(data))
# --> This is due to the arguments autofitRow and autofitCol
# which default to TRUE

# Let's see what would happen if we put them to FALSE
data = readWorksheet(wb, sheet = "data", startRow = 7, startCol = 7,
                     endRow = 46, endCol = 23, autofitRow = FALSE,
                     autofitCol = FALSE)
# --> Well, it reads the region exactly as specified above,
# so we get NAs obviously.

# Going back to the previous version
data = readWorksheet(wb, sheet = "data", startRow = 7, startCol = 7,
                     endRow = 46, endCol = 23)

# Did you note the column named 'complex' which is all NAs?
# This is due to the fact that the 'complex' column is computed
# using the BESSELJ function in Excel which is not yet implemented
# in Apache POI. However, we can still read the values by instructing
# XLConnect to read the _cached_ values rather than trying to evaluate
# the formula
data = readWorksheet(wb, sheet = "data", startRow = 7, startCol = 7,
                     endRow = 46, endCol = 23, useCachedValues = TRUE)
print(head(data))

# Let's get on to filtering. Let's assume we are only interested
# in the columns "car", "mpg" and "wt". Generally, we could
# do something like this:
data = readWorksheet(wb, sheet = "data", startRow = 7, startCol = 7,
                     endRow = 46, endCol = 23, useCachedValues = TRUE)
dataSub = data[, c("car", "mpg", "wt")]
print(head(dataSub))
# What we do here is import ALL the data and subset the data
# in R. This is fine and no problem for small data sets. For
# large data sets, however, there might be a performance penalty.
# It may be better to the subsetting on the Java side directly:
data = readWorksheet(wb, sheet = "data", startRow = 7, startCol = 7,
                     endRow = 46, endCol = 23, useCachedValues = TRUE,
                     keep = c("car", "mpg", "wt"))
print(head(data))

# Similarly, instead of 'keeping' columns, you can also drop
# them:
data = readWorksheet(wb, sheet = "data", startRow = 7, startCol = 7,
                     endRow = 46, endCol = 23, useCachedValues = TRUE,
                     drop = c("disp", "hp", "drat", "carb"))
print(head(data))

# Now on to missing values. Noted the values 999999 in column
# "gear"? We might would want those to read in as NAs. How can
# we achieve that?

# Set the missing value identifier
setMissingValue(wb, value = 999999)
# Read in the data
data = readWorksheet(wb, sheet = "data", startRow = 7, startCol = 7,
                     endRow = 46, endCol = 23, useCachedValues = TRUE)
# Here we go ...
print(head(data))

XLConnect is a comprehensive and cross-platform R package for manipulating Microsoft Excel files from within R. XLConnect differs from other related R packages in that it is completely cross-platform and as such runs under Windows, Unix/Linux and Mac (32- and 64-bit). Moreover, it does not require any installation of Microsoft Excel or any other special drivers to be able to read & write Excel files. The only requirement is a recent version of a Java Runtime Environment (JRE).

The package can easily be installed from CRAN via install.packages("XLConnect") (use install.packages("XLConnect", type = "source") on Mac OS X). In order to get started have a look at the XLConnect and XLConnect for the Impatient package vignettes, the numerous demos available via demo(package = "XLConnect") or browse through the comprehensive reference manual.

We are always happy to hear from you! For general feedback write us an email to xlconnect@mirai-solutions.com. If you find any bugs or hope to see some currently missing features in the future, just send us an email to xlconnect-bugs@mirai-solutions.com. You may also want to check out our XLConnect FAQ.

Posted in R, XLConnect | 6 Comments

XLConnect 0.2-0

Mirai Solutions GmbH (http://www.mirai-solutions.com) is very pleased to announce the release of XLConnect 0.2-0, which can be found at CRAN.

As one of the updates, XLConnect has moved to the newest release of Apache POI: 3.8. Also, the lazy evaluation issues with S4 generics are now fixed: generic methods now fully expand the argument list in order to have the arguments immediately evaluated.
Furthermore, we have added an XLConnect.R script file to the top level library directory, which contains all code examples presented in the vignette, so that it’s easier to reuse the code.

In addition, we introduced the following improvements to existing XLConnect methods:

  • readWorksheet now returns a named list if more than one worksheet is read. This provides convenient access to the corresponding data by list subsetting.
  • readWorksheet and readNamedRegion now provide a new argument named check.names that controls whether data.frame column names are checked for syntactically valid variable names.

Inspired by some great feedback we received from XLConnect users, we have introduced the following new features to the package:

  • Support for setting the color of worksheet tabs via the setSheetColor method (only for .xlsx files)
  • Support for freeze panes via the createFreezePane method
  • Support for split panes via the createSplitPane method

A summary of news in XLConnect 0.2-0 can be found in the NEWS file on CRAN.

The following is a simple example of how the new XLConnect 0.2-0 methods can be used to produce colored tabs, freeze panes and split panes:

# Loading XLConnect package
require(XLConnect)

# Creating a new workbook
wb <- loadWorkbook("testwb.xlsx", create = TRUE)

# Creating a new sheet called "Red Sheet"
createSheet(wb, name = "Red Sheet")

# Setting the sheet tab color to red
setSheetColor(wb, "Red Sheet", XLC$COLOR.RED)

# Creating a new sheet called "Green Sheet"
createSheet(wb, name = "Green Sheet")

# Setting the sheet tab color to green
setSheetColor(wb, "Green Sheet", XLC$COLOR.GREEN)

# Writing the mtcars data frame on the Red Sheet, with header and row names
writeWorksheet (wb, data=mtcars, sheet="Red Sheet", header = TRUE, rownames="Cars")

# Autosizing the first column in the Red Sheet
setColumnWidth(wb, sheet="Red Sheet", column=1)

# Writing the USJudgeRatings data frame on the Green Sheet, with header and row names
writeWorksheet (wb, data=USJudgeRatings, sheet="Green Sheet", header = TRUE, rownames="Judge")

# Autosizing the first column in the Green Sheet
setColumnWidth(wb, sheet="Green Sheet", column=1)

# Creating a freeze pane in the Red Sheet. The top row and the leftmost column are frozen.
createFreezePane(wb, "Red Sheet", "B", 2)

# Creating a split pane on the Green Sheet, with coordinates (5000, 5000) expressed as 1/20th of a point,
# "J" as the left column visible in the right pane and 10 as the top row visible in the bottom pane 
createSplitPane(wb, "Green Sheet", 5000, 5000, "J", 10)

# Saving the workbook
saveWorkbook(wb)

We would also like to mention that we were very pleased to discover that XLConnect has been mentioned in two books about R programming:

  1. R for Dummies by Joris Meys and Andrie de Vries, page 211
  2. Grundlagen der Datenanalyse mit R: Eine anwendungsorientierte Einführung by Daniel Wollschläger, page 513

XLConnect is a comprehensive and cross-platform R package for manipulating Microsoft Excel files from within R. XLConnect differs from other related R packages in that it is completely cross-platform and as such runs under Windows, Unix/Linux and Mac (32- and 64-bit). Moreover, it does not require any installation of Microsoft Excel or any other special drivers to be able to read & write Excel files. The only requirement is a recent version of a Java Runtime Environment (JRE).

The package can easily be installed from CRAN via install.packages("XLConnect"). In order to get started have a look at the package vignette, the numerous demos available via demo(package = "XLConnect") or browse through the comprehensive reference manual.

We are always happy to hear from you! For general feedback write us an email to xlconnect@mirai-solutions.com. If you find any bugs or hope to see some currently missing features in the future, just send us an email to xlconnect-bugs@mirai-solutions.com. You may also want to check out our XLConnect FAQ.

Posted in R, XLConnect | 18 Comments

XLConnect 0.1-8

Mirai Solutions GmbH (http://www.mirai-solutions.com) is pleased to announce the availability of XLConnect 0.1-8 on CRAN.

The most prominent new features/extensions are an improved package vignette which provides more information and examples to get people started with the package. A new type conversion mechanism was introduced which allows users to pre-specify column types/classes and even force conversions between types (e.g. from character to numeric) when reading in data using readNamedRegion or readWorksheet. Instead of relying completely on cell types (which is the default mechanism), this allows to specify column classes of the data to be read in upfront. Possible type conversion issues will be reported back to the user/developer via warnings referring to the actual cell that is causing problems. Another new feature that was introduced is the function xlcEdit. It allows to edit plain data.frames in an Excel file editor (e.g. MS Excel). Changes will be applied to the underlying data object after closing the editor. Note that this functionality is only available under Windows and Mac OS X. For more information please see the reference manual.

A summary of news in XLConnect 0.1-8 can be found in the NEWS file on CRAN.

The following is a simple example showing the type conversion mechanism introduced with XLConnect 0.1-8:

# Load the XLConnect package
require(XLConnect)

# conversion xlsx file from demoFiles subfolder of package XLConnect
excelFile <- system.file("demoFiles/conversion.xlsx", package = "XLConnect")

# Load workbook
wb <- loadWorkbook(excelFile)

# Read named region 'conversion' with pre-specified column types
# Note: in the worksheet all data was entered as strings!
# forceConversion = TRUE is used to force conversion from String
# into the less generic data types numeric, POSIXt & logical
df <- readNamedRegion(wb, name = "conversion", header = TRUE,
		colTypes = c("numeric", "POSIXt", "logical"),
		forceConversion = TRUE,
		dateTimeFormat = "%Y-%m-%d %H:%M:%S")

# print classes
print(sapply(df, class))

XLConnect is a comprehensive and cross-platform R package for manipulating Microsoft Excel files from within R. XLConnect differs from other related R packages in that it is completely cross-platform and as such runs under Windows, Unix/Linux and Mac (32- and 64-bit). Moreover, it does not require any installation of Microsoft Excel or any other special drivers to be able to read & write Excel files. The only requirement is a recent version of a Java Runtime Environment (JRE).

The package can easily be installed from CRAN via install.packages("XLConnect"). In order to get started have a look at the package vignette, the numerous demos available via demo(package = "XLConnect") or browse through the comprehensive reference manual.

We are always happy to hear from you! For general feedback write us an email to xlconnect@mirai-solutions.com. If you find any bugs or hope to see some currently missing features in the future, just send us an email to xlconnect-bugs@mirai-solutions.com. You may also want to check out our XLConnect FAQ.

Posted in R, Uncategorized, XLConnect | 6 Comments

XLConnect 0.1-7

Mirai Solutions GmbH (http://www.mirai-solutions.com) is pleased to announce the availability of XLConnect 0.1-7. This release includes a number of improvements and new features:

  • Performance improvements when writing large xlsx files
  • New workbook data extraction & replacement operators [, [<-, [[, [[<-. These are "syntactic sugar" for the already existing methods readWorksheet, writeWorksheet, readNamedRegion and writeNamedRegion.
  • Append data to named regions and worksheets: appendNamedRegion & appendWorksheet
  • Dumping and restoring data.frames to/from Excel files (xlcDump, xlcRestore)
  • Added a package vignette
  • Support for saving workbooks to a new file ("save as")
  • Support for row names
  • Extended use of setMissingValue. setMissingValue allows to define a set of missing value identifiers that are recognized as missing values when reading in data. The first element of this set is used as missing value identifier when writing data.
  • Auto-sizing of columns

A full listing of changes can be found here.

The following is a simple example showing the use of the new extraction/replacement operators introduced with XLConnect 0.1-7:

# Load the XLConnect package
require(XLConnect)

# Load workbook (create if not existing)
wb <- loadWorkbook("extraction.xlsx", create = TRUE)

# Write mtcars data set on a worksheet named ’mtcars1’.
# Note: The ’mtcars1’ sheet will be created automatically if it does
# not exist yet. Also, default values for other writeWorksheet arguments
# hold, i.e. the data set is written starting at the top left corner.
wb["mtcars1"] = mtcars

# Write mtcars data set on a worksheet named ’mtcars2’.
# Again, the ’mtcars2’ worksheet is created automatically.
# Additionally specify arguments passed to the underlying method
# writeWorksheet.
wb["mtcars2", startRow = 6, startCol = 11, header = FALSE] = mtcars

# Read worksheets ’mtcars1’ and ’mtcars2’.
# Note: The default arguments hold for the underlying method
# readWorksheet.
wb["mtcars1"]
wb["mtcars2"]

# Write mtcars data set to a named region named ’mtcars3’. Since
# it doesn’t exist yet we also need to specify the formula to
# define it. Also note that the sheet ’mtcars3’ referenced in the
# formula does not yet exist – it will be created automatically!
# Moreover, default values for other writeNamedRegion arguments hold.
wb[["mtcars3", "mtcars3!$B$7"]] = mtcars

# Redefine named region ’mtcars3’. Note that no formula specification
# is required since named region is already defined (see above example).
wb[["mtcars3"]] = mtcars

# Write mtcars data set to a named region ’mtcars4’. Since the named
# region does not yet exist a formula specification is required. Also,
# additional arguments are specified that are passed to the underlying
# method writeNamedRegion.
wb[["mtcars4", "mtcars4!$D$8", rownames = "Car"]] = mtcars

# Read the named regions ’mtcars3’ and ’mtcars4’.
# Note: Default values hold for the underlying method readNamedRegion.
wb[["mtcars3"]]
wb[["mtcars4"]]

# Save workbook (this actually writes the file to disk)
saveWorkbook(wb)

XLConnect is a comprehensive and cross-platform R package for manipulating Microsoft Excel files from within R. XLConnect differs from other related R packages in that it is completely cross-platform and as such runs under Windows, Unix/Linux and Mac (32- and 64-bit). Moreover, it does not require any installation of Microsoft Excel or any other special drivers to be able to read & write Excel files. The only requirement is a recent version of a Java Runtime Environment (JRE).

The package can easily be installed from CRAN via install.packages("XLConnect"). In order to get started have a look at the numerous demos available via demo(package = "XLConnect") or have a look into the reference manual.

We are always happy to hear from you! For general feedback write us an email to xlconnect@mirai-solutions.com. If you find any bugs or hope to see some currently missing features in the future, just send us an email to xlconnect-bugs@mirai-solutions.com. You may also want to check out our new XLConnect FAQ.

Posted in R, XLConnect | 2 Comments

XLConnect – A platform-independent interface to Excel

XLConnect is a comprehensive and platform-independent R package for manipulating Microsoft Excel files from within R. XLConnect differs from other related R packages in that it is completely cross-platform and as such runs under Windows, Unix/Linux and Mac (32- and 64-bit). Moreover, it does not require any installation of Microsoft Excel or any other special drivers to be able to read & write Excel files. The only requirement is a recent version of a Java Runtime Environment (JRE). Also, XLConnect can deal with the old *.xls (BIFF) and the new *.xlsx (Office Open XML) file formats. Under the hood, XLConnect uses Apache POI (http://poi.apache.org) – a Java API to manipulate Microsoft Office documents.

XLConnect‘s main features include (as of version 0.1-5):

  • Reading & writing of worksheets (via data.frames)
  • Reading & writing of named regions/ranges (via data.frames)
  • Creating/removing/renaming/cloning worksheets
  • Adding graphics
  • (Named) cellstyles: data formats, borders, back- and foreground fill color, fill pattern, text wrapping
  • Controlling sheet visibility
  • Defining column width and row height
  • Merging/unmerging cells
  • Setting/getting cell formulas
  • Defining formula recalculation behavior (when workbooks are opened)
  • Setting auto-filters
  • Style actions: controlling application of cell styles when writing (e.g. when using templates)
  • Defining behavior when error cells are encountered

In order to make the XLConnect package a bit more accessible, this post will run through an example whose use case is the generation of a very simple Excel report showing the recent development of currency exchange rates (EUR, USD, GBP, JPY vs Swiss Franc (CHF)). Note that this post does not intend to give any investment advise!

##################################################
# Installation of Packages Required by this Demo #
##################################################

# XLConnect: Excel Connector for R (that's what this demo is about)

# fImport: Rmetrics - Economical and Financial Data Import
# forecast: Forecasting functions for time series
# zoo: S3 Infrastructure for Regular and Irregular Time Series
# ggplot2: An implementation of the Grammar of Graphics

install.packages(c("XLConnect", "fImport", "forecast", "zoo", "ggplot2"))

#################
# Load Packages #
#################

require(XLConnect)
require(fImport)
require(forecast)
require(zoo)
require(ggplot2)

####################
# Data Preparation #
####################

# Currencies we're interested in compared to CHF
currencies = c("EUR", "USD", "GBP", "JPY")

# Fetch currency exchange rates (currency to CHF) from OANDA (last 366 days)
curr = do.call("cbind", args = lapply(currencies,
               function(cur) oandaSeries(paste(cur, "CHF", sep = "/"))))
# Make a copy for later use
curr.orig = curr
# Scale currencies to exchange rate on first day in the series (baseline)
curr = curr * matrix(1/curr[1,], nrow = nrow(curr),
                     ncol = ncol(curr), byrow = TRUE) - 1
# Some data transformations to bring the data into a simple data.frame
curr = transform(curr, Time = time(curr)@Data)
names(curr) = c(currencies, "Time")
# Cyclic shift to bring the Time column to the front
curr = curr[(seq(along = curr) - 2) %% ncol(curr) + 1]

# Let's do some predictions ...

# Number of days to predict
predictDays = 20
# For each currency ...
currFit = sapply(curr[, -1], function(cur) {
  as.numeric(forecast(cur, h = predictDays)$mean)
  # Note: Normally one would obviously do some model diagnostics first...
})
# Add Time column to predictions
currFit = cbind(
  Time = seq(from = curr[nrow(curr), "Time"],
             length.out = predictDays + 1, by = "days")[-1],
  as.data.frame(currFit))

# Bind actual data with predictions
curr = rbind(curr, currFit)

###########################################
# Writing Data to Excel / Excel Reporting #
###########################################

## Let's start simple!

# Workbook filename
wbFilename = "swiss_franc.xlsx"
# Create a new workbook
wb = loadWorkbook(wbFilename, create = TRUE)

# Create a new sheet named 'Swiss_Franc'
sheet = "Swiss_Franc"
createSheet(wb, name = sheet)
# Alternatively, in a more object-oriented style:
# wb$createSheet(name = sheet)

# Create a new Excel name referring to the top left corner
# of the sheet 'Swiss_Franc' - this name is going to hold
# our currency data
dataName = "currency"
createName(wb, name = dataName, formula = paste(sheet, "$A$1", sep = "!"))
# Alternatively:
# wb$createName(name = dataName, formula = paste(sheet, "$A$1", sep = "!"))

# Write the currency data to the named region created above
# Note: the named region will be automatically redefined to encompass all
# written data
writeNamedRegion(wb, data = curr, name = dataName, header = TRUE)
# Alternatively:
# wb$writeNamedRegion(data = curr, name = dataName, header = TRUE)

# Save the workbook (this actually writes the file to disk)
saveWorkbook(wb)

# Just dumping out some data to Excel is a very common case.
# Therefore, there is a simpler way of doing this:
if(file.exists(wbFilename))
  file.remove(wbFilename)
writeNamedRegionToFile(wbFilename, data = curr, name = dataName,
                       formula = paste(sheet, "$A$1", sep = "!"),
                       header = TRUE)
# Note: writeNamedRegionToFile automatically generates the required sheet
# and Excel name!

## In the following steps we will make the currency report a bit nicer...

# In addition to writing out the data, for each currency we want to
# highlight the points in time when there was a change of more than 2%
# compared to the previous day.

# Load the workbook created above
wb = loadWorkbook(wbFilename)

# Create a date cell style with a custom format for the Time column
# (only show year, month and day without any time fields)
csDate = createCellStyle(wb, name = "date")
setDataFormat(csDate, format = "yyyy-mm-dd")
# Create a time/date cell style for the prediction records
csPrediction = createCellStyle(wb, name = "prediction")
setDataFormat(csPrediction, format = "yyyy-mm-dd")
setFillPattern(csPrediction, fill = XLC$FILL.SOLID_FOREGROUND)
setFillForegroundColor(csPrediction, color = XLC$COLOR.GREY_25_PERCENT)
# Create a percentage cell style
# Number format: 2 digits after decimal point
csPercentage = createCellStyle(wb, name = "currency")
setDataFormat(csPercentage, format = "0.00%")
# Create a highlighting cell style
csHlight = createCellStyle(wb, name = "highlight")
setFillPattern(csHlight, fill = XLC$FILL.SOLID_FOREGROUND)
setFillForegroundColor(csHlight, color = XLC$COLOR.CORNFLOWER_BLUE)
setDataFormat(csHlight, format = "0.00%")

# Index for all rows except header row
allRows = seq(length = nrow(curr)) + 1

# Apply date cell style to the Time column
setCellStyle(wb, sheet = sheet, row = allRows, col = 1, cellstyle = csDate)
# Set column width such that the full date column is visible
setColumnWidth(wb, sheet = sheet, column = 1, width = 2800)
# Apply prediction cell style
setCellStyle(wb, sheet = sheet, row = tail(allRows, n = predictDays), col = 1,
             cellstyle = csPrediction)
# Apply number format to the currency columns
currencyColumns = seq(along = currencies) + 1
for(col in currencyColumns) {
  setCellStyle(wb, sheet = sheet, row = allRows, col = col,
               cellstyle = csPercentage)
}

# Check if there was a change of more than 2% compared to the previous day
# (per currency)
idx = rollapply(curr.orig, width = 2, FUN = function(x) abs(x[2] / x[1] - 1),
                by.column = TRUE) > 0.02
idx = rbind(rep(FALSE, ncol(idx)), idx)
widx = apply(as.data.frame(idx), which)
# Apply highlighting cell style
for(i in seq(along = currencies)) {
  if(length(widx[[i]]) > 0) {
    setCellStyle(wb, sheet = sheet, row = widx[[i]] + 1, col = i + 1,
                 cellstyle = csHlight)
  }
  # Note:
  # +1 for row since there is a header row
  # +1 for column since the first column is the time column
}

saveWorkbook(wb)

## In a next step we are going to add a currency graph to our report

wb = loadWorkbook(wbFilename)

# Stack currencies into a currency variable (for use with ggplot2 below)
gcurr = reshape(curr, varying = currencies, direction = "long",
                v.names = "Value", times = currencies, timevar = "Currency")
# Also add a discriminator column to differentiate between actual and
# prediction values
gcurr[["Type"]] = ifelse(gcurr$Time %in% currFit$Time, "prediction", "actual")

# Create a png graph showing the currencies in the context of the Swiss Franc
png(filename = "swiss_franc.png", width = 800, height = 600)
ggplot(gcurr, aes(Time, Value, colour = Currency, linetype = Type)) +
  geom_line() + stat_smooth(method = "loess") + xlab("") +
  scale_y_continuous("Change to baseline", formatter = "percent") +
  opts(title = "Currencies vs Swiss Franc",
       axis.title.y = theme_text(size = 10, angle = 90, vjust = 0.3))
dev.off()

# Define where the image should be placed via a named region;
# let's put the image two columns left to the data starting in the 5th row
createName(wb, name = "graph",
           formula = paste(sheet, idx2cref(c(5, ncol(curr) + 2)), sep = "!"))
# Note: idx2cref converts indices (row, col) to Excel cell references

# Put the image created above at the corresponding location
addImage(wb, filename = "swiss_franc.png", name = "graph",
         originalSize = TRUE)

# As a short side-note: summary also works on workbooks!
summary(wb)

saveWorkbook(wb)

## Want to add formulas to your reports? Here we go...

wb = loadWorkbook(wbFilename)

# Let's add the maximum downturn (= minimum) fur the currencies at the bottom
# of the data (offset by three rows)

# Get reference coordinates (corners) of the named region containing our data
corners = getReferenceCoordinates(wb, dataName)
# Excel area references to calculate the average for
areaRefs = idx2aref(t(sapply(currencyColumns,
                    function(col) c(corners[1,1] + 1, col,
                                    corners[2,1], col))))
# Note: corners[1,1] + 1 as we do not want to include the header row in
# the formula

# Construct the required formulas
avg = paste("MIN(", areaRefs,  ")", sep = "")

# Set cell formulas
setCellFormula(wb, sheet = sheet, row = corners[2,1] + 3,
               col = currencyColumns, formula = avg)
# Add some cell styling...
csAvg = createCellStyle(wb, name = "min")
setDataFormat(csAvg, format = "0.00%")
setBorder(csAvg, side = "top", type = XLC$BORDER.MEDIUM,
          color = XLC$COLOR.BLACK)
setCellStyle(wb, sheet = sheet, row = corners[2,1] + 3,
             col = currencyColumns, cellstyle = csAvg)

# And, as we are already at it - let's also add an Excel auto-filter to
# the columns
setAutoFilter(wb, sheet = sheet, reference = aref(corners[1,], corners[2,]))

saveWorkbook(wb)

# Depending on the size and complexity of reports to create, starting from
# scratch and coding everything in R can be cumbersome (especially cell styling).
# Therefore, XLConnect supports working with templates. The application of cell
# styles when writing data to Excel can be controlled via so-called "style
# actions" (see function setStyleAction).

#######################################
# Reading / Importing Data from Excel #
#######################################

# Reading (rectangular) data from Excel is straight forward.
# There is basically two ways:
# - reading data from worksheets via readWorksheet
# - reading data from named regions via readNamedRegion

# Generally, reading from named regions is the simplest and preferred way but
# requires a named region to be set up correspondingly. If this is not the case,
# readWorksheet can be used.

## Reading from named regions

wb = loadWorkbook(wbFilename)

# Read the named region that was created in the section above
data = readNamedRegion(wb, name = dataName, header = TRUE)
# Alternatively: wb$readNamedRegion(name = dataName, header = TRUE)
# Do some calculations ...
colMeans(data[, currencyColumns])

## Reading from worksheets

data = readWorksheet(wb, sheet = sheet, header = TRUE)
# Note: When not specifying any of the boundaries (startRow, startCol, endRow,
# endCol) XLConnect will try to determine the boundaries automatically.
# If there are several data regions on a worksheet, this may not give the desired
# result and you may need to specify some boundaries.
# Actually, in our example the above won't give the desired result as the data
# read in also includes the mean values computed via the Excel formulas.
tail(data)

# Therefore we need to do the following
data = readWorksheet(wb, sheet = sheet, startRow = -1, endRow = nrow(curr) + 1,
                     startCol = -1, endCol = -1, header = TRUE)
tail(data)
# Note: -1 means that the correspondingly values will be automatically
# determined. Currently it is still necessary to specify either none or all
# boundaries. This may be changed in the future to simplify the application of
# readWorksheet and to allow something like:
# data = readWorksheet(wb, sheet = sheet, endRow = nrow(curr) + 1,
#                      header = TRUE)

# Also note that for simple cases there are the functions
# readNamedRegionFromFile and readWorksheetFromFile that prevent you having to
# go via the sequence loadWorkbook & readNamedRegion/readWorksheet:
data = readNamedRegionFromFile(wbFilename, name = dataName, header = TRUE)

data = readWorksheetFromFile(wbFilename, sheet = sheet, startRow = -1,
                             endRow = nrow(curr + 1), startCol = -1,
                             endCol = -1, header = TRUE)

## Some additional syntactic sugar:

# In case a workbook contains only a few named data regions, you can use 'with'
# to reference them directly (via the name of the named region) without having
# to read them in manually first:
with(wb, {
  colMeans(currency[, currencyColumns])
})

# Note: 'with' should only be used if the number of named regions in the workbook
# is small, since 'with' basically reads in all named regions automatically.

###########################
# Handling of Error Cells #
###########################

# XLConnect supports two different schemes on how to deal with error cells
# (cells that produce an error when trying to be read):
# - immediately stop on the first occurrence on an error cell
# - produce warnings for erroring cells and assign NA as their value

# Read example workbook containing various forms of error cells
errorCell <- system.file("demoFiles/errorCell.xlsx", package = "XLConnect")
wb = loadWorkbook(errorCell)

# Define to produce warnings on erroring cells and assume error cells to be NA
# (this is the default behavior)
onErrorCell(wb, XLC$ERROR.WARN)
# Read named region containing error cells
data = readNamedRegion(wb, name = "MyData")
print(data)

# Define to stop immediately when encountering error cells
onErrorCell(wb, XLC$ERROR.STOP)
# Read named region containing error cells
data = readNamedRegion(wb, name = "MyData")

We hope that the above example succeeded in giving a better understanding of XLConnect and its capabilities. The package can easily be installed from CRAN via install.packages(“XLConnect”). You may also want to have a look at the numerous demos available via demo(package = “XLConnect”). E.g. in order to run the “writeNamedRegion” demo simply run demo(topic = “writeNamedRegion”, package = “XLConnect”).

We are always glad to receive feedback! For general feedback write us an email to xlconnect@mirai-solutions.com. If you find any bugs or hope to see some currently missing features in the future, just send us an email to xlconnect-bugs@mirai-solutions.com. You may also want to check out our XLConnect FAQ.

Further references & resources:

Posted in R, XLConnect | 3 Comments

XLConnect 0.1-5

Mirai Solutions GmbH (http://www.mirai-solutions.com) is pleased to announce the availability of XLConnect 0.1-5. This release adds the following new features:

  • Support for setting/getting cell formulas. See methods set/getCellFormula.
  • Support for setting/getting the force formula recalculation flag on worksheets. See methods set/getForceFormulaRecalculation.
  • Support for setting auto-filters via the setAutoFilter method.
  • There is a set of new utility functions:
    • aref2idx, idx2aref: Converts Excel area references (such as A5:D14) to row & column based cell references and vice versa.
    • aref: Constructs an Excel area reference from a specification of the top left corner and the dimensionality of an object.
    • getReferenceCoordinates: Queries the coordinates of an Excel name representing an Excel area reference (such as A5:D14)

See the following simple example on using cell formulas with XLConnect 0.1-5:

# Load the XLConnect package
require(XLConnect)

# Load workbook (create if not existing)
wb <- loadWorkbook("setCellFormula.xlsx", create = TRUE)

# Create a sheet named 'mtcars'
createSheet(wb, name = "mtcars")

# Create a named region called 'mtcars' referring to the sheet
# called 'mtcars'
createName(wb, name = "mtcars", formula = "mtcars!$A$1")

# Write built-in data set 'mtcars' to the above defined named region.
writeNamedRegion(wb, mtcars, name = "mtcars")

# Now, let us get Excel to calculate average weights.
# Where did we write the dataset?
corners <- getReferenceCoordinates(wb, "mtcars")
# Put the average under the wt column
colIndex <- which(names(mtcars) == "wt")
rowIndex <- corners[2,1] + 1

# Construct the input range & formula
input <- paste(idx2cref(c(corners[1,1], colIndex,
			  corners[2,1], colIndex)), collapse=":")
formula <- paste("AVERAGE(", input, ")", sep="")

setCellFormula(wb, "mtcars", rowIndex, colIndex, formula)

# Save workbook (this actually writes the file to disk)
saveWorkbook(wb)

XLConnect is a comprehensive and cross-platform R package for manipulating Microsoft Excel files from within R. XLConnect differs from other related R packages in that it is completely cross-platform and as such runs under Windows, Unix/Linux and Mac (32- and 64-bit). Moreover, it does not require any installation of Microsoft Excel or any other special drivers to be able to read & write Excel files. The only requirement is a recent version of a Java Runtime Environment (JRE).

The package can easily be installed from CRAN via install.packages(“XLConnect”). In order to get started have a look at the numerous demos available via demo(package = “XLConnect”) or have a look into the reference manual.

We would be glad to hear about how you use XLConnect. Do you mainly use it for data import/export or do you e.g. use it for reporting purposes? What are the features you most often make use of? Your feedback will help us to further improve XLConnect. For general feedback write us an email to xlconnect@mirai-solutions.com. If you find any bugs or hope to see some currently missing features in the future, just send us an email to xlconnect-bugs@mirai-solutions.com. You may also want to check out our new XLConnect FAQ.

Posted in R, XLConnect | Leave a comment