Simply export HTML content to a doc / docx Word document: even simpler than PDF generation

You'd like to export an HTML document to a .doc / .docx MS Word file using simple HTML (+ PHP or another server-side language)? It sounds quite complicated, but it is even simpler than PDF export! Learn how it works...

Drupal specific example

In our specific case we had the requirement to export a Drupal View to a .doc file for further offline processing and printing.
We decided to use views_data_export.module for that. In Drupal it is handy both for exporting a single piece of content (simply return one result) and for exporting sets of results. This documentation explains how to configure a views data export:

The module already provides a .doc export for views, which simply sends the following headers:
Content-Type: application/msword
Content-Disposition: attachment; filename=myfile.doc

It also provides HTML template files for the export header, body and footer. We'll use (and modify) them later in this blog post.
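Outside Drupal, the same two headers are all a server has to send. Here is a minimal sketch in Python using a plain WSGI callable. The names doc_headers and export_app and the sample HTML are my own assumptions for illustration and are not part of views_data_export.module:

```python
def doc_headers(filename):
    """Return the response headers that make the browser hand the HTML to Word."""
    return [
        ("Content-Type", "application/msword"),
        ("Content-Disposition", f"attachment; filename={filename}"),
    ]


def export_app(environ, start_response):
    """Hypothetical WSGI app: serves a tiny HTML page as myfile.doc."""
    body = "<html><body><p>Hello Word</p></body></html>".encode("utf-8")
    headers = doc_headers("myfile.doc")
    headers.append(("Content-Length", str(len(body))))
    start_response("200 OK", headers)
    return [body]


# To try it locally (not executed here):
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, export_app).serve_forever()
```

Requesting any URL from that server should trigger a download of myfile.doc, which Word then opens as a document.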

However, I ran into the problem that the exported documents were not set up the way I needed. For example, I was missing:

  • Paper size settings
  • Paper format / orientation settings
  • Paper margin settings
  • Font family settings
  • etc...

It seems I'm not the only developer looking for a solution in views_data_export.module, as the module's issue queue shows (I posted references to related issues there).
I'll also post a patch for views_data_export.module there to set these styles as default.

So I started searching for general solutions and especially ways to set these settings via HTML / CSS. And YES, there are good solutions:

General HTML to Word doc know-how

I finally found the following interesting sources describing the options for page layout and page sections, which you should read to understand how HTML to Word export works:

In the end I wrote my own views-data-export-doc-header.tpl.php file in my custom template, which is attached to this blog post.

It contains the following style settings in the header section to set the document page style as required:

  <!--[if gte mso 9]>
  <xml>
    <w:WordDocument>
      <w:View>Print</w:View>
      <w:Zoom>100</w:Zoom>
      <w:DoNotOptimizeForBrowser/>
    </w:WordDocument>
  </xml>
  <![endif]-->
  <style>
    v\:* {behavior:url(#default#VML);}
    o\:* {behavior:url(#default#VML);}
    w\:* {behavior:url(#default#VML);}
    .shape {behavior:url(#default#VML);}
  </style>
  <style>
    @page
    {
      size: 21cm 29.7cm;  /* A4 */
      margin: 1cm 1cm 1cm 1cm;
      mso-page-orientation: portrait;
    }
    @page WordSection1
    {
      mso-title-page: no;
      mso-paper-source: 0;
      mso-header-margin: 0;
      mso-footer-margin: 0;
    }
    div.WordSection1 {
      page: WordSection1;
      mso-header-margin: 0;
      mso-footer-margin: 0;
    }
  </style>
All of this goes in the HTML <head> section.
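If you need more than one paper format, the @page rules can be generated instead of hard-coded. The following is only a sketch: page_css and its parameters are hypothetical names of mine, and the landscape handling (swapping width and height and setting mso-page-orientation to landscape) is my assumption, since my template only uses portrait:

```python
def page_css(width_cm=21.0, height_cm=29.7, margin_cm=1.0, landscape=False):
    """Build the @page rules for a paper size and margin given in centimetres."""
    if landscape:
        # Assumption: for landscape the longer edge is given first.
        width_cm, height_cm = max(width_cm, height_cm), min(width_cm, height_cm)
    orientation = "landscape" if landscape else "portrait"
    margin = f"{margin_cm:g}cm"
    return (
        "@page {\n"
        f"  size: {width_cm:g}cm {height_cm:g}cm;\n"
        f"  margin: {margin} {margin} {margin} {margin};\n"
        f"  mso-page-orientation: {orientation};\n"
        "}\n"
        "@page WordSection1 {\n"
        "  mso-title-page: no;\n"
        "  mso-paper-source: 0;\n"
        "  mso-header-margin: 0;\n"
        "  mso-footer-margin: 0;\n"
        "}\n"
        "div.WordSection1 {\n"
        "  page: WordSection1;\n"
        "}\n"
    )


# The defaults reproduce the A4 portrait page with 1cm margins used in this post.
print(page_css())
```

The returned string is meant to be placed inside a <style> element in the <head>.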

Furthermore, I set the font-family, font-size, etc. in the body CSS:

  body {
    font-family: Arial, Verdana, Helvetica, Courier, sans-serif;
    font-size: 10pt;
  }

... and YES, it works! The page layout is set as desired and everything works fine. Try it yourself and simply export HTML documents to Word .doc, with or without Drupal / PHP or other languages.
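To try this without Drupal, you can write such an HTML file to disk with a .doc extension and open it in Word. Here is a minimal sketch in Python. The function names, the xmlns attributes on the <html> element and the charset meta tag are assumptions of mine (they are common in Word HTML but not taken from my template), and the body rule simply sits in the same <style> element in the <head> as the page rules:

```python
from html import escape
from pathlib import Path

# Page setup and body font, as described in this post.
PAGE_CSS = """\
@page { size: 21cm 29.7cm; margin: 1cm 1cm 1cm 1cm; mso-page-orientation: portrait; }
@page WordSection1 { mso-title-page: no; mso-paper-source: 0; mso-header-margin: 0; mso-footer-margin: 0; }
div.WordSection1 { page: WordSection1; }
body { font-family: Arial, Verdana, Helvetica, Courier, sans-serif; font-size: 10pt; }
"""

# Ask Word to open the document in print layout at 100% zoom.
WORD_VIEW_XML = """\
<!--[if gte mso 9]>
<xml>
  <w:WordDocument>
    <w:View>Print</w:View>
    <w:Zoom>100</w:Zoom>
    <w:DoNotOptimizeForBrowser/>
  </w:WordDocument>
</xml>
<![endif]-->
"""


def build_word_html(title, body_html):
    """Wrap body_html in a page whose <head> carries the Word page settings."""
    return (
        '<html xmlns:o="urn:schemas-microsoft-com:office:office" '
        'xmlns:w="urn:schemas-microsoft-com:office:word">\n'
        "<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        f"{WORD_VIEW_XML}"
        f"<style>\n{PAGE_CSS}</style>\n"
        "</head>\n"
        "<body>\n"
        # The div class must match the div.WordSection1 rule in the CSS.
        f'<div class="WordSection1">\n{body_html}\n</div>\n'
        "</body>\n"
        "</html>\n"
    )


def write_doc(path, title, body_html):
    """Write the document to path (e.g. 'export.doc') and return the Path."""
    target = Path(path)
    target.write_text(build_word_html(title, body_html), encoding="utf-8")
    return target
```

For example, write_doc("export.doc", "My export", "<p>Hello Word</p>") produces a file that Word should open with the page settings applied.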

If this blog entry helped you, please leave a comment :)

views-data-export-doc-header.tpl_.php_.txt (1.28 KB)
views-data-export-doc-footer.tpl_.php_.txt (49 Bytes)
views_data_export-doc_formatting-1959640-3.patch (4.32 KB)


body css

Where do I put the body.css?
