Category Archives: ExcelWriter

Setting PivotTable styles with ExcelApplication PivotTables

Problem

In Excel, there are numerous PivotTable settings available. These can be found by right clicking the PivotTable and selecting PivotTable options. Not all of these are accessible through the ExcelApplication PivotTable API yet.

Solution

For any properties that are not yet available in ExcelApplication, we recommend that you create a template file that contains a PivotTable and set the PivotTable options on that table, rather than creating a PivotTable from scratch.

Then use ExcelApplication to modify the existing PivotTable by changing the data source and adding PivotFields.

The settings will be preserved, even if the data source of the PivotTable is changed and new PivotFields are added.

To get a handle on a PivotTable:

 worksheet.PivotTables[0]; //By index in the worksheet worksheet.PivotTables["PivotTable1"]; //By name 

To change the data source of a PivotTable, use PivotTable.ChangeDataSource.

For more about building PivotTables, see Creating a Basic PivotTable.

How to export SSRS reports to XLSX, DOCX file formats

Problem

In Office 2007, Microsoft introduced the OOXML file formats (XLSX, XLSM), which come with benefits, such as an increase in the number of rows allowed in a file. Reporting Services reports designed for OfficeWriter allow for exports to the Office 2007 and 2003 file formats.

Solution

To export a report in a particular format, you need to explicitly save the report with the format you want to export it as:

  1. Open the report with the OfficeWriter Designer
  2. Click ‘Save As’ on the OfficeWriter Designer toolbar (for Office 2003) or ‘Save’ > ‘Save to disk’ (for Office 2007/2010)
  3. There will be three options for Reporting Definition files: Office 2007(exports to XLSX, DOCX), Office 2007 with macros (exports to XLSM, DOCM), Office 2003 and earlier (*exports to XLS, DOC)
  4. Select Office 2007 to save the RDL
  5. Publish the saved report to the server

Note: Office 2007 (XLSX, DOCX) is only available in OfficeWriter 4.0 and above. Office 2007 with macros (XLSM, DOCM) is only available in OfficeWriter 4.1 and above.

How to select a group of rows that have specific values in a cell or column

Problem

Often it necessary to filter the rows on one worksheet based on a certain criteria and copy over the affected rows to a separate worksheet. It is fairly easy to achieve this result with a little coding.

ExcelApplication object allows you to fully parse and design the Excel document from your code, therefore giving you the ability to provide all the conditional logic to structure your final report.

Solution

Given the following data in one worksheet:

Company Name
PRICES
Date Open High Low Close Volume Adj Close*
5-Jul-06 20.47 20.47 20.13 20.28 336,400 20.28
3-Jul-06 20.97 21.07 20.85 21.05 115,000 21.05
30-Jun-06 21.27 21.27 21.02 21.13 215,700 21.13
29-Jun-06 20.4 21.02 20.38 21.02 314,000 21.02
28-Jun-06 20.37 20.37 20.05 20.14 403,900 20.14
27-Jun-06 20.24 20.24 19.8 19.86 257,700 19.86
26-Jun-06 20.2 20.27 20.05 20.22 703,700 20.22
23-Jun-06 20.45 20.45 20.17 20.2 302,500 20.2
22-Jun-06 20.3 20.5 20.07 20.26 291,700 20.26
21-Jun-06 20.08 20.45 20.04 20.31 160,000 20.31
20-Jun-06 20.12 20.2 20 20.03 278,500 20.03
19-Jun-06 20.33 20.37 20.06 20.11 301,100 20.11
16-Jun-06 20.03 20.31 20.03 20.16 480,800 20.16
15-Jun-06 19.82 20.5 19.82 20.42 301,000 20.42
14-Jun-06 19.4 19.66 19.38 19.61 211,000 19.61
13-Jun-06 19.25 19.53 18.96 19.09 523,700 19.09
12-Jun-06 20.45 20.56 20.11 20.11 239,900 20.11
9-Jun-06 20.16 20.53 20.11 20.22 297,600 20.22
8-Jun-06 19.3 20.21 19.25 20.15 1,641,200 20.15
7-Jun-06 21.35 21.41 21.21 21.29 684,700 21.29
6-Jun-06 22.15 22.15 21.51 21.67 299,000 21.67
5-Jun-06 22.51 22.51 21.89 21.91 293,300 21.91
2-Jun-06 22.63 22.78 22.5 22.7 754,100 22.7
1-Jun-06 21.9 22.24 21.86 22.24 281,200 22.24

**Criteria for selected rows (blue highlight) to copy is volume < 300,000

Result sheet should contain:

3-Jul-06 20.97 21.07 20.85 21.05 115,000 21.05
30-Jun-06 21.27 21.27 21.02 21.13 215,700 21.13
27-Jun-06 20.24 20.24 19.8 19.86 257,700 19.86
22-Jun-06 20.3 20.5 20.07 20.26 291,700 20.26
21-Jun-06 20.08 20.45 20.04 20.31 160,000 20.31
20-Jun-06 20.12 20.2 20 20.03 278,500 20.03
14-Jun-06 19.4 19.66 19.38 19.61 211,000 19.61
12-Jun-06 20.45 20.56 20.11 20.11 239,900 20.11
9-Jun-06 20.16 20.53 20.11 20.22 297,600 20.22
6-Jun-06 22.15 22.15 21.51 21.67 299,000 21.67
5-Jun-06 22.51 22.51 21.89 21.91 293,300 21.91
1-Jun-06 21.9 22.24 21.86 22.24 281,200 22.24

Algorithm

  1. Loop through the cells and evaluate your test expression. In this case the cells to test are in the Volume column.
  2. If the expression evaluates to false, move on to the next row value in our test column.
  3. If the expression evaluates to true, call a helper function CopyWorksheetRow (sample below) to copy the values of current row from original worksheet to the destination worksheet.

Helper Method


/// /// Copies values of up to 20 columns for a given row from the original sheet
/// to the destination sheet
/// /// Worksheet object from which to copy from
/// Worksheet object to which to copy to
/// Row number to copy from the origin to the
/// destination
public static void CopyWorksheetRow(Worksheet origin, Worksheet destination, int row_from)
{
// create an area of 1 row / 20 columns (Note our sample only has 7 columns)
Area selected_row = origin.CreateArea(row_from, 0, 1, 20);
// out of that area extract only the cells
// with values (Note this will select that area of only 7 columns)
Area populated_cells = selected_row.PopulatedCells;
for(int x = 0; x < populated_cells.ColumnCount; x++)
{
destination.Cells[row_to_start_at,x].Value = populated_cells[0, x].Value;
destination.Cells[row_to_start_at,x].Style = populated_cells[0, x].Style;
}
row_to_start_at++;
}

Code to test the condition


ExcelApplication xap = new ExcelApplication();
Workbook wb = xap.Open(Page.MapPath("datadoc.xls"));


Worksheet data_sheet = wb.Worksheets[0];
Worksheet filtered_datasheet = wb.Worksheets.CreateWorksheet("mysheet", 1);


// set up a loop to look throught the cells in column 6 (indexed as 5) from
// rows 6 to 71
for(int x = 5; x < 71; x++)
{
int cellval = int.Parse(data_sheet.Cells[x, 5].Value.ToString());


// if the cells value matches our criteria
// call a function to copy this row onto a separate sheet.
// in my case I want to copy the rows where
// the volume of stock traded is under 300,000
if(cellval < 300000)
{
CopyWorksheetRow(data_sheet, filtered_datasheet, data_sheet.Cells[x, 5].RowNumber);
}
}
//Save to disk on the server
xap.Save(wb, Page.MapPath("C:\\MyReports\\output.xls"));

Saving ExcelWriter and WordWriter files

Problem

What options are available for saving an ExcelWriter or WordWriter generated file?

Solution

ExcelTemplate.SaveExcelApplication.SaveWordTemplate.Save, and WordApplication.Save all have the same four output options:

  • Save to disk – saves the generated file on the server
  • Save to an IO stream – streams the file to the specified IO stream or class derived from System.IO.Stream
  • Stream to the client as an attachment
  • Stream to the client as an inline file – If the user is using Internet Explorer, the file will be opened in the browser using IE’s inline browsing option. Otherwise, the file will be streamed to client.

How to use CopySheet with PivotTables

Problem

When a worksheet with a PivotTable is copied in Excel, the PivotTable data source is not updated. If the worksheet is copied within a workbook, then the data sources of the original and copied PivotTables point to the same area. If the worksheet is copied to a new workbook, the data source of the copied PivotTable will point to the area in the original workbook. For example: “[OldWorkbookName.xlsx]OriginalWorksheet!A1:C10″.

ExcelWriter’s CopySheet follows this behavior. If a worksheet that contains a PivotTable is copied with CopySheet, the data source will not be updated.

Solution

Starting in 8.4, ExcelApplication has the ability to change the data source of a PivotTable. Use PivotTable.ChangeDataSource to update the data source of a copied PivotTable.

Example:

PivotTable pt = ws.PivotTables["PivotTable1"];
Area data_area = ws_new_data.CreateArea("A1:G10");
pt.ChangeDataSource(data_area);

Note

PivotTables are sensitive to the column header names in the data source for a PivotTable. If a column header name is changed, the source field associated with that column will get updated, and any PivotTableFields that were created using that source field will be removed from the PivotTable.

This happens in Excel as well.

When changing the data source of the PivotTable, if the new data source is missing any of the columns from the original data source, then fields created from the missing columns will be removed from the PivotTable.

For example, if PivotTable1 has 3 columns: Col1, Col2, and Col3. Changing the data source to an area that has 3 columns: Col1, Col2, and ColC, will cause any fields that were created from Col3 to be removed from the PivotTable.

How to unmerge a group of cells with ExcelWriter

Problem

Starting in ExcelWriter 6.8.1, the ability to unmerge cells in a worksheet as introduced with the Cell.Unmerge method. The Cell.IsMerged has also been added to determine whether or not a given cell is merged.

This post covers some examples of unmerging an individual cell or an area of cells.

Solution

Unmerging a single cell


Cell cell = wb.Worksheets[0].Cells[0,0];
if(cell.IsMerged) cell.Unmerge();

Unmerging an Area of cells

Although the Merge method is on the Area object, the Unmerge method is only on the Cell object. This is because each cell knows whether it is part of a merged cell, but a given area can be defined to include both merged and unmerged cells.

There are two ways to unmerge an area of cells:

  1. If you know exactly where the merged cells are, you can unmerge any one cell and the entire merged area will be unmerged.
  2. Or, loop through all the cells in an area to find merged cells and unmerge them.

C# example:

Area a = wb.Worksheets[0].CreateArea("A1:C4");
for(int i=0; i<a.RowCount-1; i++)
{
for(int j=0;j<a.ColumnCount-1;j++)
Cell cell = wb.Worksheets[0].Cells[i,j];
if(cell.IsMerged) { cell.Unmerge(); }
}
}

For more information about merging cells, refer to the Area.MergeCells() documentation.

How to hide the secondary axis of a chart

Problem

When programatically adding a secondary x axis to a chart using ExcelApplication, a secondary y axis will automatically appear as well (and vice versa).

Solution

To hide a secondary axis in ASP.NET use the Visible property of the Axis element. Both the SecondaryCatagoryAxis property and the SecondaryValueAxis property, return an object that extends the Axis element.

The following code hides the secondary axes in ASP.NET:

Chart1.SecondaryCategoryAxis.Visible = false; //For the secondary X Axis
Chart1.SecondaryValueAxis.Visible = false; //For the secondary Y Axis

How to designate number formats in ExcelWriter across languages

Problem

While Microsoft Excel is available in different language versions, ExcelWriter is US-English based. What this means is that while ExcelWriter can generate spreadsheets in any language, there is no French, Russian, Chinese, etc. version of ExcelWriter. This requires special considerations when creating number formats for other languages in ExcelWriter.

You must understand what symbols are used as “separators,” “decimal placeholders,” and what remaining symbols will be interpreted as “literals.” This article will help you understand how ExcelWriter does this, and how this will be interpreted by a non-English versions of Excel (example, French, Chinese, Russian, etc). This post will use French as the non-English language example.

Solution

What is a number separator?

A number separator is a symbol or space that is used to group numbers so that they are easier to read. In English(US) and many other languages, separators occur between the thousands position and the hundreds position, and then again for every three numbers moving left of the decimal placeholder. The decimal placeholder may also change from language to language.

Compare these values for English (United States) and French:

Language Separator symbol Decimal Symbol Example using 1234567
English(US) Commas Period 1,234,567.00
French Spaces Comma 1 234 567,00

Specifying Number Formats for non-English Spreadsheets

Since ExcelWriter is only available in US-English, you must specify your number formats according to US-English standards. This will allow ExcelWriter to correctly identify the separators and decimal place holders. When the spreadsheet is opened in a non-English version of Microsoft Excel, those separators and placeholders will be correctly translated according to the language and regional settings in that version of Microsoft Excel.

ExcelWriter code sample


//--- Declare variables
ExcelApplication xla = new ExcelApplication();
Workbook wb = xla.Create();
Worksheet ws = wb.Worksheets[0];


ws.Cells["A1"].Value = 1234567
ws.Cells["A1"].Format.Number = "#,###.##;-#,###.##;;"
...[rest of code]

Here is how Cell A1 will display its number:

Language of Microsoft Excel: How the format will be translated:
English 1,234,567.00
French 1 234 567,00

How to populate XLS files with more than 65536 rows

Problem

This post describes 3 ways to populate a binary Excel template when there are more than 65536 rows in the data source. In order to accommodate a variable number of data rows, these approaches use a combination of ExcelApplication and ExcelTemplate. You can download a Visual Studio 2008 project containing a demo at the end of the article.

This post also discusses an alternative in the case that you do not have access to ExcelApplication.

Solution

For a binary Excel file (.xls), the number of rows of each worksheet is limited to 65536 rows (source: Excel specifications and limits). By comparison, an OOXML file (.xlsx or .xlsm) has a maximum of about 1 million (exactly 1048576) rows per worksheet. If the data source contains more than 65536 rows (and fewer than 1 million rows) and you are using ExcelWriter version 7.0 or higher, you can use ExcelTemplate with an OOXML template so that all data fit into a single worksheet.

If you want to use a binary template and the data source contains more than 65536 rows, the template must contain as many worksheets as necessary to accommodate all data rows. In order to generate the required number of worksheets, you can start with a template containing a single worksheet, then make copies of the original worksheet. Alternatively, you can start with the maximum number of worksheets required, then delete worksheets as necessary.

In this post, we use a binary template containing a single worksheet. Then we use ExcelApplication to make copies of the original worksheet so that the processed template contains exactly the number of worksheets required to accommodate all data rows. Depending on the approach, we will also need to modify the data marker(s) on each copy. Finally, we use ExcelTemplate to populate the processed template.

Support for OOXML files in ExcelApplication was introduced in ExcelWriter 8. In ExcelWriter v7.6.1 and earlier, ExcelApplication only supports binary files. If you are using an older version of ExcelWriter, you will not be able to dynamically insert or delete worksheets with an OOXML file.

If you cannot use ExcelApplication, you cannot dynamically insert or remove worksheets. In this case, you can have multiple templates, each with a different number of worksheets. See Using ExcelTemplate only.

As of ExceWriter 7.5, the following table summarizes the possible solutions for different scenarios. Note that more options would be available for ExcelWriter EE once ExcelApplication supports the OOXML format in a future version.

ExcelWriter edition ExcelWriter version Template format Available solution(s)
EE All Binary 1234
EE 7.0-7.6.1 OOXML Populate all data on single worksheet, 4
SE All Binary 4
SE 7.0+ OOXML Populate all data on single worksheet, 4

Using ExcelApplication and ExcelTemplate

1. Populate using a DataReader

When you use a forward-only data source such as a DataReader, the template must contain identical data markers on each worksheet. After the number of rows on a given worksheet reaches a limit, additional rows automatically overflow onto the next worksheet. This limit is specified in the MaxRows property of the DataBindingProperties parameter of the ExcelTemplate.BindData() method.

DataBindingProperties props = excelTemplate.CreateDataBindingProperties();
props.MaxRows = 65536;
excelTemplate.BindData(dataReader, "", props);

For illustration purpose, the above code snippet sets MaxRows to the maximum number of rows allowable on a worksheet. However, if the worksheet contains header or blank rows in addition to data rows, adjust this value accordingly.

If the template has more worksheets containing data markers than necessary to accommodate all rows, ExcelTemplate would attempt to populate the extra worksheets after DataReader reaches its end and consequently throw the following exception: SoftArtisans.OfficeWriter.ExcelWriter.SAException: Exhausted data marker at XX value: %%=YY.ZZ.

Note for ExcelWriter 3.9.x

In ExcelWriter 3.9.x, ExcelTemplate does not support the BindData method. Replace the data-binding code above with:

excelTemplate.SetDataSource(dataReader, "", 65536);

2. Populate using the (continue) data marker modifier

This approach is applicable when you use a scrollable data source such as a DataTable. The data markers on the overflow worksheets must have a (continue) modifier; e.g., %%=datasource.field1(continue). Except for the (continue) modifier, the data markers are otherwise the identical. In order to create the required number of worksheets, we use ExcelApplication to make copies of the first worksheet, then append (continue) to all data markers on each copy. See Data Marker Modifiers.

The (continue) modifier indicates that ExcelTemplate should continue to read the data source at the point where the previous worksheet leaves off. Again, set the DataBindingProperties.MaxRows property to limit the number of rows per worksheet.

DataBindingProperties props = excelTemplate.CreateDataBindingProperties();
props.MaxRows = 65536;
excelTemplate.BindData(dataTable, "", props);

For illustration purpose, the above code snippet sets MaxRows to the maximum number of rows allowable on a worksheet. However, if the worksheet contains header or blank rows in addition to data rows, adjust this value accordingly.

If the template has more worksheets containing data markers than necessary to accommodate all rows, the DataTable would rewind after reaching the last worksheet and the extra worksheets would contain repeated rows.

Note for ExcelWriter 3.9.x

In ExcelWriter 3.9.x, ExcelTemplate does not support the BindData method. Replace the data-binding code above with:

excelTemplate.SetDataSource(dataTable, "", 65536);

3. Populate individual worksheets

You must bind data to each individual worksheet using a different data source. Use the DataBindingProperties.WorksheetName property to specify the name of the worksheet you are targeting. Each worksheet’s data source must have only the rows you want on that worksheet. Because each worksheet has a different data source, it’s possible to have a different number of rows on each worksheet.

DataBindingProperties props = excelTemplate.CreateDataBindingProperties();
props.WorksheetName = workbook.Worksheets[0].Name;
excelTemplate.BindData(dataTable1, "datasource1", props);

The data markers on each worksheet must have a data source identifier matching its data source. If you use ordinal data source identifier, the first worksheet to be data-bound should have data markers like %%=#1.field1, the second data-bound worksheet, %%=#2.field2, and so on. If you use named data source identifier, the data marker should contain the name of the data source specified in ExcelTemplate.BindData. For example, if the data-binding call is BindData(dataTable, “datasourceN”, dataBindingProperties), the corresponding data markers should be %%=datasourceN.field1%%=datasourceN.field2, and so on.

If you start with a single data source, you can partition it into smaller data sources containing non-overlapping blocks of rows. In the sample, we partition the original DataTable into a set of smaller DataTables, each of which contains 65536 rows (or whichever value the row limit is set to). The last DataTable gets the remaining rows and can have fewer than 65536 rows. Each smaller DataTable is used as the data source for a different worksheet.

If you bind the same data source to more than one worksheet, even under different names, ExcelTemplate will return the following error: This binding source named xxx was already added under the name yyy.

Note for ExcelWriter 3.9.x

This approach is not compatible with ExcelWriter 3.9.x because it isn’t possible to bind data to a specific worksheet.

Using ExcelTemplate only

The following discussion applies to the following scenarios:

  • You do not have access to ExcelApplication
  • You are unable to use ExcelApplication with an OOXML template

4. Using multiple templates

It may occur to you to use a template with as many worksheets as necessary and append the (optional) modifier to all data markers on each worksheet. The assumption is that ExcelTemplate would ignore such data markers if there are no data bound to a worksheet. However, this assumption is not correct. The (optional) modifier has to do with the presence or absence of a column, not a row. ExcelTemplate ignores a data marker marked as optional if there is no corresponding column in the data source. But if a data marker is mapped to a column in the data source, ExcelTemplate would attempt to populate all instances of such a data marker. If ExcelTemplateencounters a data marker and there is no data row left, its behavior depends on the type of data source. For a forward-only data source such as DataReader, ExcelTemplate would throw an “exhausted data marker” error. For a scrollable data source such as DataTable, ExcelTemplate would rewind the data source to the beginning, resulting in duplicate data.

Consequently, it’s imperative that the template contains exactly the number of worksheets required to accommodate all data rows. If you cannot use ExcelApplication, you would not be able to dynamically insert or remove worksheets at run time. An alternative is to create multiple templates, each one containing a different number of worksheets. You can determine which template to use depending on the number of rows in the data source.

Conclusions

Using DataReader for data retrieval is the simplest approach and has performance benefits. Using the (continue) data marker modifier involves performing additional processing on a worksheet after copying it and therefore is less efficient. Binding data to individual worksheets, because the data source must be processed, can have a lower performance than the other approaches.

About the demo

The attachment contains a Visual Studio 2008 project illustrating the 3 approaches described in this article. Each of the approaches is contained in a separate method. You must install ExcelWriter EE before running the demo.

Attachments

Using Report Models as DataSources in OfficeWriter Designer

Problem

Prior to version 3.9.2, OfficeWriter designer did not support reports with queries based on Report Models. Publishing such a report from OfficeWriter Designer caused some fields to show gibberish or not show at all.

For more information about Report Models visit Microsoft web-site at http://technet.microsoft.com/en-us/library/cc678411.aspx.

Solution

From version 3.9.2 and above, OfficeWriter supports Report Model DataSources. Reports created using Visual Studio (2005 and 2008) or ReportBuilder version 2.0, that use a report model as a datasource are now parsed correctly by the OfficeWriter designer.

Note: Support for newer versions of Reporting Services (2008, 2008 R2) and Report Builder (3.0) have been added in later version of OfficeWriter. For more information, refer to the change log in the OfficeWriter Docs.

Reports may contain both regular SQL queries and semantic queries (Report Model-based queries). This support is seamless to the user with no change to the toolbar. However, a feature was added to improve the user experience.

In Report Models, all fields are mapped to entities (instead of tables). We added a feature that allows the user to know to which entity a certain field belongs. For example, if the user encounters a field First Name, it will have a way of knowing whether this is an employee or a customer first name. When a report model is the datasource, its fields will be displayed as a submenu of their entities (rather than the usual flat list of fields), as in the image below:

Figure 1.

Below is the same query, but in the unmapped view:

Figure 2.

The RDL alone does not contain sufficient information to map the fields to entities. Therefore, when a Report Model-based query is selected from the Select Query drop down, we attempt to retrieve the model (as an XML file) from the server. If the address to the model on the server is embedded in the RDL, we try to retrieve it immediately. If we are not able to get the model, or if there is no address embedded in the RDL, the following window will open:

Figure 3.

This dialog offers the following options:

  • Continue will continue without retrieving the model. This will not affect the functionality of the report, the only effect is that the fields will not be mapped to entities, but will be displayed in a single list (as in Figure 2).
  • Retry will attempt to retrieve the model from the server specified in the RDL if the address exists.
  • Browse… will open a window (as in Figure 4 below) that will allow the user to browse to a server and choose a model from a list of available models on the server.

Figure 4.

Note that each model has a unique ID which is not visible to the user. Two models, even if they are created based on the same database with the same entities and fields, will have different IDs. Therefore, mapping the fields of a report created with one model against another model will fail. However, each model preserves its ID when deployed to different servers, so fields can be mapped using a model from another server if it was deployed from the same source model.

Once the model is retrieved, the fields are mapped to entities under the Insert Fields drop-down list (as in Figure 1). Formulas created in Visual Studio or ReportBuider are not related to any entity and will show as regular fields at the bottom of the Insert Field drop-down.

This change applies to the client-side OfficeWriter designer. The server-side components of OfficeWriter require no special modification to work with Report Models. However, we recommend always using matching versions of the designer and the server-side installation of OfficeWriter.