All posts by Richard

How to Use PageSetup Options When Saving to a PDF Document

When saving Excel worksheets to a PDF document using the PDF rendering extension methods introduced in OfficeWriter 10.0, it is often useful to be able to specify details about the resulting PDF document, or about how the rendering should behave.  This is achieved by setting properties on the worksheet’s PageSetup property before calling the SavePdf method.  This tutorial will walk through that process.

Setting up your worksheet

For this example, we will open an existing workbook, in order to save the first worksheet of the workbook to a PDF document:

ExcelApplication xla = new ExcelApplication();

Workbook WB = xla.Open("input.xlsx");

Worksheet worksheetToSave = WB[0];

Specifying page properties

Using the worksheet’s PageSetup property, we can specify the page size, orientation, and margins.  These properties will be reflected in the final PDF document:

worksheetToSave.PageSetup.PaperSize = PageSetup.PagePaperSize.Legal;

worksheetToSave.PageSetup.Orientation = PageSetup.PageOrientation.Landscape;

worksheetToSave.PageSetup.TopMargin = 1.5; // Specified in inches

Setting a header and footer

The PageSetup property can also be used to set a header or footer.  The header and footer will be printed on each page of the PDF:

HeaderFooterSection.Section hfSection = HeaderFooterSection.Section.Center;

worksheetToSave.PageSetup.GetHeader(hfSection).SetContent("Header text");

Specifying rendering options

Other PageSetup properties are useful for specifying how the content should be rendered to the PDF document.  For example, we can make the content smaller than normal by using the Zoom property, specify the order in which pages should appear in the PDF document, and print all of the cell comments at the end of the document by setting the relevant properties on the worksheet:

worksheetToSave.PageSetup.Zoom = 50; // Specified as a percentage

worksheetToSave.PageSetup.UseZoom = true;

worksheetToSave.PageSetup.PrintOrder = PageSetup.PagePrintOrder.DownThenOver;

worksheetToSave.PageSetup.PrintComments = true;

worksheetToSave.PageSetup.PrintCommentsAtEnd = true;

There are additional PageSetup properties that may prove useful; for a complete list, see the PageSetup documentation.

Saving the PDF document

Once you have set all desired PageSetup options, you can then save the worksheet to a PDF document by calling the SavePdf method on the worksheet.

If you want to save multiple PDF documents from the same workbook, but with different options set, you can repeat this process.  For example, after saving one PDF document, you could then change the PaperSize property to a different size of paper, and then save a second document.  The updated properties will be reflected in any subsequent calls to SavePdf.

When saving an entire workbook to a PDF, each worksheet is rendered using its own PageSetup properties.

Save PDF file to HttpResponse

In a previous blog post we discussed how OfficeWriter 10.0 introduced the ability to save an Excel workbook to a PDF document. When working in a web environment it is common to want to send the generated file to the browser for your end user to download and view on their own machine.

Step 1:
Generate your workbook. A very simple example might be:

var xla = new ExcelApplication();
var wb = xla.Create(ExcelApplication.FileFormat.Xlsx);
var ws = wb[0];
ws.Cells[0, 0].Value = "Hello";
ws.Cells[0, 1].Value = "World!";

Step 2:
Define a helper method to write the file bytes to the current response stream:

public static void WriteFileToResponse(HttpContext context, byte[] bytes, string fileName)
{
    var bytesLength = bytes.Length.ToString(CultureInfo.InvariantCulture);
    var response = context.Response;
    response.Buffer = true;
    response.AddHeader("Content-Length", bytesLength);
    response.AddHeader("Content-Disposition", "attachment; filename=" + fileName);
    response.ContentType = MimeMapping.GetMimeMapping(fileName);
    // Write the file contents to the response body
    response.BinaryWrite(bytes);
}

Step 3:
Save the PDF to a memory stream and call the helper method we just defined. This avoids disk I/O, although you may want a different approach if your application needs to persist the generated PDF.

using (var memoryStream = new MemoryStream())
{
    var fileName = "generatedfile.pdf";
    wb.SavePdf(false, memoryStream);
    memoryStream.Seek(0, SeekOrigin.Begin);
    WriteFileToResponse(HttpContext.Current, memoryStream.ToArray(), fileName);
}

And that’s it!

How to Save an Excel Workbook to a PDF Document

OfficeWriter 10.0 introduces the ability to save a Workbook, Worksheet, or Area to a PDF document. This makes it possible to produce a searchable, vector-format rendering of your spreadsheet.

Using the Imaging Extension DLL

The PDF functionality is included in the rendering extensions DLL (SoftArtisans.OfficeWriter.ExcelWriter.Imaging.dll). The first thing you will need to do is include this DLL as a reference in your project in Visual Studio. You will also need to tell the compiler to use the imaging namespace in your source file. This can be accomplished by adding a using statement to the top of the file where you want to save a PDF document:

using SoftArtisans.OfficeWriter.ExcelWriter.Imaging;

Setting up your workbook

In order to save an Excel workbook to a PDF document, you first need a workbook with contents in it. For this example, let’s create a simple workbook with three worksheets:

ExcelApplication xla = new ExcelApplication();
Workbook WB = xla.Create(ExcelApplication.FileFormat.Xlsx);
Worksheet ws0 = WB[0];
Worksheet ws1 = WB.Worksheets.CreateWorksheet("Sheet2");
Worksheet ws2 = WB.Worksheets.CreateWorksheet("Sheet3");

ws0[0, 0].Value = "Sheet 1, Cell A1";
ws1[0, 0].Value = "Sheet 2, Cell A1";
ws2[0, 0].Value = "Sheet 3, Cell A1";

For this example, we are just exporting some cells with text in them. However, the rendering extensions support more dynamic content as well, such as cell formatting, charts, images, comments, or conditional formats. We could also use a workbook that we opened from a file, that already had contents and formatting applied.

Saving a PDF document

There are three ways to save a PDF document: through the workbook, through a worksheet, or through a specific area. Specific PDF rendering options can be specified by setting a worksheet’s PageSetup properties; this will be covered in a later tutorial. If you have not set any of the worksheet’s PageSetup properties, then default settings will be used.

You can save multiple PDF files from one workbook. First, let’s save multiple worksheets to a single PDF document. This can be achieved by using the Workbook.SavePdf method. The first parameter is a Boolean; if this is set to true, then only selected worksheets will be saved. Otherwise all visible worksheets will be saved.

For this example, let’s save the first and the third worksheet to a single PDF document. Continuing from the code above, we can achieve this with the following two lines of code:

WB.Worksheets.Select(new object[]{0, 2});
WB.SavePdf(true, "MultipleWorksheets.pdf");

We can also save the remaining worksheet to a separate PDF file containing only the contents of that worksheet, by calling SavePdf on that worksheet instead of on the workbook.

This allows you to generate either multiple PDF documents, each containing one worksheet, or a single PDF document containing multiple sheets.

Calls to SavePdf must be made before saving a workbook to an xlsx/xlsm file. Once you have exported all of the PDF documents that you want, you can then also save the workbook to an Excel file as you normally would:

xla.Save(WB, "workbook.xlsx");

Putting it all together

Here’s what the final code looks like:

using SoftArtisans.OfficeWriter.ExcelWriter;
using SoftArtisans.OfficeWriter.ExcelWriter.Imaging;

public class SampleProgram
{
    public static void Main(string[] args)
    {
        ExcelApplication xla = new ExcelApplication();
        Workbook WB = xla.Create(ExcelApplication.FileFormat.Xlsx);

        // Set up the workbook with some content
        Worksheet ws0 = WB[0];
        Worksheet ws1 = WB.Worksheets.CreateWorksheet("Sheet2");
        Worksheet ws2 = WB.Worksheets.CreateWorksheet("Sheet3");
        ws0[0, 0].Value = "Sheet 1, Cell A1";
        ws1[0, 0].Value = "Sheet 2, Cell A1";
        ws2[0, 0].Value = "Sheet 3, Cell A1";

        // Select two worksheets and save them to a single PDF document
        WB.Worksheets.Select(new object[]{0, 2});

        // The 'true' argument tells the rendering method to only save worksheets
        // that are currently selected. If it were 'false', then all visible
        // worksheets would be saved to the PDF.
        WB.SavePdf(true, "MultipleWorksheets.pdf");

        // Save one worksheet to a separate PDF

        // Finally, save the workbook to an Excel file in case we need to edit it later
        xla.Save(WB, "workbook.xlsx");
    }
}

Carpe Datum: How to Export Your GMail to Excel


[Crossposted from Riparian Data]

Straightforward title, straightforward goal, ugly and roundabout (but free!) method of achieving it.

For some time now, I’ve had this goal: download my gmail data, analyze it, and visualize it.

The last time I tried this, I glossed over the whole getting your gmail data into Excel part. This is because I wasn’t able to do all of it myself–Jim had to take my ugly mbox data and make it Excel-readable.

But now, thanks to the basic python skills acquired in my data science class, I can do everything myself! Kinda. The code in part 3 will probably make a real programmer scream, but for the most part, it works–though it’s not fond of commas in subject lines. And if you, like me, are not a programmer–don’t worry! You can still run the code, using my trusty copy/paste/pray methodology.

Alors, here goes:

Step 1: From Gmail to Apple Mail

You have Apple mail, right?  You can also do this with Outlook, and probably other desktop clients.

1) In your Gmail settings, go to the “Forwarding and POP/IMAP tab” and make sure POP is enabled.

2) Now, add your Gmail account to your desktop client o’choice. If it’s already there, add it again–you’re going to be removing this one.

Important: Do not check the “remove copy from server after retrieving a message” box!

Step 2: From Apple Mail to mbox

This part is easy. Just select your mailbox in the desktop client, and go to Mailbox->Export Mailbox, and choose a destination folder.

Step 3: From mbox to csv

If you try to save your pristine mbox file as a csv, you will get a one column csv. Don’t do that. Instead, use these python scripts (also up on github).

The first script opens a blank csv file, and fills it with the subject, from, and date lines for each message in your mbox. I called it

import csv
import mailbox

# Written for Python 3: newline="" lets the csv module handle line endings
with open("clean_mail.csv", "w", newline="") as csv_file:
    writer = csv.writer(csv_file)
    # One row per message: subject, sender, date
    for message in mailbox.mbox("your_mbox_name"):
        writer.writerow([message["subject"], message["from"], message["date"]])

If you don’t know what python is, you can still run this script. Here’s how:

1) copy the above code to a plain text file, and save it with a .py extension in the same folder you saved your mbox file to.

2) open your terminal (spotlight–>terminal)

3) type cd /Users/your_account_name/directory_where_you_saved_your_mbox

4) type python3 followed by the name of the script file you just saved

5) Voila! In your directory, you should see a new file, clean_mail.csv.
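
A quick way to check that the export worked is to read the csv back and count messages per sender. This is only a sketch: it assumes the three-column layout (subject, from, date) described above, and the function name count_senders is made up for illustration.

```python
import csv
from collections import Counter


def count_senders(csv_path):
    """Count rows per 'from' value in a subject/from/date csv (assumed layout)."""
    counts = Counter()
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < 3:
                continue  # skip blank or malformed rows
            counts[row[1]] += 1  # column 1 is assumed to hold the sender
    return counts
```

Calling count_senders on your csv file and then .most_common(10) on the result should list your ten most frequent correspondents. If the totals look far too low, the mbox path in the export script is the first thing to check.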

You’ll notice that the ‘date’ column is a long, jam-packed date string. It’ll be much easier to…

Continue reading Carpe Datum: How to Export Your GMail to Excel
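
One possible way to break up those jam-packed date strings is Python's standard email.utils module. This is a sketch rather than the script from the full post: the helper name split_date and the choice of output fields (date, time, weekday) are assumptions made for illustration.

```python
from email.utils import parsedate_to_datetime


def split_date(date_header):
    """Turn an email Date header into (date, time, weekday) strings."""
    parsed = parsedate_to_datetime(date_header)
    return (
        parsed.strftime("%Y-%m-%d"),  # e.g. 2012-03-20
        parsed.strftime("%H:%M:%S"),  # time in the header's own UTC offset
        parsed.strftime("%A"),        # weekday name
    )
```

For example, split_date("Tue, 20 Mar 2012 14:05:09 -0400") gives ("2012-03-20", "14:05:09", "Tuesday"). The time is left in whatever offset the header carries, so messages sent from different time zones are not normalized to a common zone.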

Stuff Tech Blogs Do That Bother Me

[cross posted from Riparian Data]

Credit: Business Insider

Some people thought Gourmet’s demise was a nail in good journalism’s coffin. Others said no, it’s just another sign that the web is the future of journalism, good and bad. Today, the consensus seems to be that the latter group was right. And, happily, there is quite a bit of good journalism on the web. Short form, long form, data-based, image-based, crowd-sourced… all can be found, relished, and easily shared.

Unhappily, there is also quite a bit of drecky journalism on the web. I can’t tell you if technology really does take up a lion’s share of drecky journalism in general, or just a lion’s share of the drecky journalism I read. Regardless, there’s an awful lot of it, fueled by both the traffic-winner-takes-all maxim and tech companies’ willingness to stroke the egos of tech reporters in exchange for headlines. The following 12 tics are the icing on my insufferable cake. If you have any of your own, or just want to tell me to shove it and stop reading these sites if I despise them so much, feel free to let me know in the comments!

1. Slideshows. Especially slideshows that are one image/page. If gddamn Buzzfeed doesn’t use them, you don’t have to.

2. Attributions listed below the post. This is shady and shoddy journalism, for it at best de-emphasizes and at worst obfuscates the source. (1)

3. Headlines that are two sentences of keywords, strung together with a minimum of prepositions.

4. Headlines that follow this formula: [adjective] data startup [startup name] lands/gets $[number] Million in Series A/B/C to disrupt [noble cause like social network for cats] market.

5. Headlines that follow this formula: “I’m quitting/Why I quit [currently cusping or widely-used technology]”

Continue reading Stuff Tech Blogs Do That Bother Me

#Undertime BostonFest Contest: Fight Email Overload, Win a POP Phone

*Cross-posted from Riparian Data, a startup incubated out of SoftArtisans.

What’s the single best part of your workday? Work night? Weekend? It’s checking your inbox, right? Oh. It’s not? But you spend 13 hours a week dealing with email! You and your ‘box should be besties by now!

Pardon? “Enemies?” Did you really say that?

Way harsh, Ty.

But, seeing as it’s your work/life and I’m just interrupting it, Imma throw you a bone. Or, rather, a Pop phone.

Here’s what you need to do to get it:

Continue reading #Undertime BostonFest Contest: Fight Email Overload, Win a POP Phone

Spring/Summer 2012 Conference Wishlist: DevCon, SQLBits, TechEd, and More

[Image via Huffington Post]

It’s 81 degrees out right now. In March. In Boston. We’re racing the kayaks out on the Charles this afternoon. El Niño, te amo. Nice weather tends to give me Magellan syndrome, and what better way to (semi) productively harness that than by researching cool sessions at upcoming conferences? The following were selected for their edgy subject matter. You may translate “edgy” any way you like.

I would love to see a conference devoted to SQL Azure Labs projects like Data Explorer, but that aside, I think this list sums up my current MSFT-related interests. If I’ve skipped yours, do chime in in the comments!

  • Topic: BI Reports
  • Conference: DevConnections
  • Session: Data Visualization Choices
  • By: Paul Turley (b | t)
  • Hook: Get the scoop behind Reporting Services, PowerPivot, Tabular Semantic Models, Report Builder, BIDS, SharePoint, PerformancePoint, Excel, Excel Services and the new Power View reporting tool.
  • Where: Las Vegas, NV
  • When: 3/26-3/29 (better book those plane tickets now!)
  • How much: $1595

  • Topic: Data Quality Services and Master Data Services
  • Conference: SQLBits X
  • Session: Take Good care of your Data: Introducing SQL 2012 DQS & MDS
  • By: David Faibish (in)
  • Hook: “The new exciting Data Quality Services and the improved Master Data Services in conjunction with SSIS provides the IT and IW with an attractive solution that allows full lifecycle data management.”
  • Where: Novotel London West, London
  • When: 3/29 (yep, same “buy those tickets stat” admonishment!)
  • How much: £350.00

  • Topic: Azure
  • Event: Windows Azure Kickstart
  • Hook: “You have chance to spend a day with some of the nation’s leading cloud experts and to learn how to build a web application that runs in Windows Azure.”
  • By: Microsoft Azure team
  • Where: Minneapolis, Independence, Columbus, Overland Park, Omaha, Mason, Southfield, Houston, Creve Coeur, Downers Grove, Franklin, and Chicago
  • When: 3/30-5/8 (check the listing for your city’s date)
  • How much: Free

  • Topic: Hadoop on Azure
  • Conference: Hadoop Summit
  • Session: Unleash Insights on All Data with Microsoft Big Data*
  • By: Alexander Stojanovic (b | t)
  • Hook: “Accelerate your analytics with a Hadoop service that offers deep integration with Microsoft BI and the ability to enrich your models with publicly available data from outside your firewall.”
  • Where: San Jose Convention Center
  • When: 6/13-6/14
  • How much: $499 (until 3/31), $599 (4/1-6/3), $700 (on-site)

  • Topic: Exchange 2010 SP2
  • Conference: TechEd Europe
  • Session: Microsoft Exchange Server 2010 SP2: In’s and Out’s
  • By: Exchange Product Team
  • Hook: “Come and see how we have tamed this beast and turned it into something even your own mother could understand.”
  • Where: Amsterdam RAI
  • When: 6/26-6/29
  • How much: €1,695 + VAT (before 3/31), €1,995 + VAT (after 3/31)


*Sessions for Hadoop Summit are selected by the community. Results coming soon.

Stories from the WIT Trenches: Ann Millspaugh

[This is the seventh in a series of posts exploring the personal stories of real women in technology. Every woman in tech overcame at the very least statistical odds to be here; this blog series aims to find out why, and what they found along the way. Like a number of our interviewees, Ann Millspaugh (t|ln) entered the tech world after college. In less than two years, the former Luddite went from reluctant Drupal admin to passionate advocate of STEM education for girls. She’s currently co-organizer of the Columbia Heights Community Wireless Network and the Online Community Manager for the EdLab Group. If reading her story inspires you to share yours, please feel free to email me.]

1)      Can you take us back to your “eureka!” moment—a particular instance or event that got you interested in technology?

To be honest, I don’t think I can claim to be a “woman in technology”. At best, I’m a woman learning technology, and probably more importantly, how to think about technology. For a lot of people, especially “Millennials” and “digital natives,” there’s something almost noble about being averse to technology – there’s an attitude that “I haven’t submitted myself to this trend yet” or “I’m grounding myself outside of this consumer-driven society.” I’m not saying this as a condescending outsider – I used to feel that way.

Do I feel like I’m now a tech guru who is going to invent the next Linux system? No. But, I do feel like I can be a contributor, and for me, that feeling of empowerment is critical to the way people use and adapt to technology. It’s not about seeing technology as old or new, good or bad, but comprehensively seeing technology for what it is– the resources creating the product, the labor assembling the product, the ingenuity and creativity in software development, and the behavioral trends in the actual usage of these products rather than a cold, static piece of materialism. For me, it’s been fascinating to begin thinking about technology as a tool to improve, analyze and assess behavioral patterns, and that’s what began to get me interested in technology.

2)      Growing up, did you have any preconceived perceptions of the tech world and the kinds of people who lived in it?

Yes, I undoubtedly had preconceptions about the tech world. I started out as one of those people who saw technology as an inhibitor of real-world interaction. Computers were draining, for those anti-social types. I was pretty extreme – I even had a phase in college where I refused to be in pictures because I thought they were too distracting. I think technology can be seen this way – as a way to be self-indulgent and unnecessarily inconvenienced, a byproduct of a consumer-driven society.

It becomes an either-or: either I’m a technology person or I’m not. I think it’s important to realize that just because you don’t dream about coding or you don’t want to eat, sleep, and breathe at a computer doesn’t mean you can’t enjoy computer science. Somehow technology never enters into a realm of moderation; it’s a binary of hacking 24/7 or waiting in line for the Geek Squad. Science and technology fields are like any career – there are people who are obsessed, but there are also plenty of people who live a balanced life.

3)      When did you first start working with tech? Was it by choice?

I was always interested in writing, and over the course of several jobs, realized that writing (as well as many of the arts) is now completely intertwined with technology; it’s almost impossible to pursue those fields without having at least a basic technological background. For me, it was a begrudgingly slow progression over to the tech-side. But, that mindset ultimately came from a lack of understanding. For example, I’ve always liked learning languages, and learning HTML and CSS was just like learning another language. It never occurred to me that the skills I already had could be translated into a STEM field, and that I would actually like it!

4)      Did you experience any personal or systemic setbacks at any point of your academic or professional career?

Like I said before, I started working with technology by accident – I never saw myself as someone interested in technology, or even particularly apt in technology. In fact, when I was in college, computer science classes were at the bottom of my list, for no particular reason except for my perceptions about computer science. I read an interesting book, Unlocking the Clubhouse: Women in Computing, that talked about the implicit socialization processes that drive women away from CS, and technology at large (having a computer in your son’s room versus your daughter’s room; taking your son to fix the car with you). These small actions create superficial gender associations that build and become a heavily weighted reality over time. In a lot of ways I feel like the epitome of those socialization processes – I was never bad at science or math, and in retrospect, I’d have to say it was the accumulation of unconscious decisions and stereotypes that drew me away from the field. I would say that was my biggest setback, that I didn’t explore the field until after college.

5)      Whom do you look to as mentors and/or sources of inspiration in your field?

The open source development communities have been incredibly inspiring to me. Everyone is so authentically collaborative: people work together for the sole purpose of making software easier and more accessible to people – for free. And most people do this in their spare time! You can post a question and have a response within seconds, find tutorials and rank suggestions. It’s this incredible network that continually expands through connective curiosity; you rarely see anyone pitching their company or bragging about their latest contribution. There’s a “we want to keep making this better” attitude that drives people to recognize how much more powerful collaboration is than siloed, individual production. No copyrights here!

6)      Why do you think the rate of attrition for women in software engineering is higher than that of women in most other tech fields?

The perception of computer science and programming. There are lots of studies suggesting that women tend to be more emotionally-driven; technology, particularly software engineering, can have the perception of being cold, isolating, and distant from immediate applicability. I think it’s important to stop thinking about technology as a new, revolutionary entity. In my opinion, technology doesn’t revolutionize the way people behave. Fundamentally, people want the same things they’ve wanted for hundreds of years – to communicate, connect, and understand – and technology enables these things to happen at an increasingly accelerated rate. If we start to think about technology through this lens, I think many more people, men and women, will be drawn to the field.

7)      Do you have any suggestions for how to get more girls interested in computers and computer science? Is this important to you?

Hopefully by now, it’s evident that yes – this is important to me! Working with the EdLab Group, I’ve been reading and researching how to make STEM fields more appealing to girls. There are a lot of ways to pursue this, one of the most cited examples being that girls enjoy contextualizing information in real-world examples. Rather than solving for a variable in an algorithm, ask girls, “How can this algorithm be applied to make Georgia’s healthcare system more efficient?”

While this is a successful strategy, I also think attributing certain characteristics to STEM competency can be a slippery slope. Bart Franke, a teacher at the Chicago Laboratory High School who boasts a female enrollment of 50% in his computer science classes, recently gave a presentation about his success, citing, “I teach girls, I don’t teach to girls.” As soon as you make distinctions as a woman, a minority, a socio-economically disadvantaged person, etc… you cause people to self-identify in a way that can perpetuate certain stereotypes. Even though gender, ethnicity or socio-economic status is undoubtedly a significant individual and collective characteristic, there are times where this emphasis is appropriate and then there are times where it’s irrelevant and distracting.

How to Export a SharePoint List to Word Using Word Export Plus

We asked EMC’s Paul Forsthoff (b|t) to give us his honest opinion of OfficeWriter’s Word Export Plus solution. IOHO, he did a masterful job. The full review is available on his Everything SharePoint blog.

I recently had the opportunity to check out SoftArtisans OfficeWriter product. The OfficeWriter product exposes an API that allows information from custom ASP.NET applications to be consumed and used to dynamically and programmatically build Microsoft Word documents and Microsoft Excel spreadsheets.

The OfficeWriter API is a .NET library that allows you to read, manipulate and generate Microsoft Word and Microsoft Excel documents from your own applications. The OfficeWriter product can integrate with Sharepoint 2010 allowing you to export Sharepoint list data into Microsoft Word and Excel documents.

SoftArtisans provides easy to understand sample code, videos and pre-built Sharepoint solutions that make getting started with the product very trivial.

For this tutorial I’ll demonstrate deploying, configuring and testing Word Export Plus in a Sharepoint 2010 environment. Word Export Plus is a SharePoint solution that demonstrates the usage of the OfficeWriter API in SharePoint 2010. This solution adds a new context menu (custom action) button to list items, allowing you to export the list data to a pre-formatted Word template that can be designed yourself in Word, or automatically generated by Word Export Plus. [Read more…]

Boston’s Big Datascape, Part 3: StreamBase, Attivio, InsightSquared, Paradigm4, Localytics

[Excerpted from the Riparian Data blog]

This ongoing series examines some of the key, exciting players in Boston’s emerging Big Data arena. The companies I’m highlighting differ in growth stages, target markets and revenue models, but converge around their belief that the data is the castle, and their tools the keys. You can read about the first ten companies here and here.

11) StreamBase

  • Products: StreamBase Complex Event Processing Platform lets you build applications for analyzing real-time streaming data alongside historical data. StreamBase LiveView adds an in-memory data warehouse and a BI front-end to the equation, essentially giving you live (well, a few milliseconds behind) BI.
  • Founders: Richard Tibbetts (t | ln), Michael Stonebraker
  • Technologies used: Complex Event Processing, StreamSQL, cloud storage, pattern-matching, in-memory data warehouse, end-user query interface
  • Target Industries: Capital Markets, Intelligence and Security, MMO, Internet and Mobile Commerce, Telecommunications and Networking
  • Location: Lexington, MA

[read the full post at the Riparian Data blog]