Android - Implementing DocX Templates

Microsoft Word is, for better or for worse, still an important part of getting work done around an office, so we often need to build applications that can interoperate with it. Below I'd like to share my approach. Because a docx file is a Zip archive of plain-text XML files, we can make changes to MS Word documents without too much trouble and without relying on third-party code.

More specifically, the use case here is a predefined document into which you need to insert text, as in mail merging.

The Docx File Format

Docx files are, in fact, Zip files. If you rename the file extension of a docx document from ".docx" to ".zip" you'll be able to open the file with any zip utility and see the individual XML files that go into making a docx document.

In our case, the most important file is named "word/document.xml" and is a simple text file containing the core XML that represents the text of our document. As expected, we can view this file in any text editor once we've extracted it from the zip file. If we were to make any changes to this file and compress it back into the Zip file, we should be able to see these changes if we then open the docx file in something like Microsoft Word.
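As a quick check that a docx really is an ordinary Zip archive, the sketch below lists the entry names in such an archive using only the standard java.util.zip classes. The class name DocxEntryLister and the helper sampleArchive are made up for illustration, and the archive built in memory only mimics a docx layout; it is not a valid Word document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class DocxEntryLister {

    // Returns the name of every entry found in the given Zip (or docx) stream.
    static List<String> listEntries(InputStream source) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(source)) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    // Hypothetical stand-in for a real docx: a tiny archive with docx-like entry names.
    static byte[] sampleArchive() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (String name : new String[] {"[Content_Types].xml", "word/document.xml"}) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<placeholder/>".getBytes("UTF-8"));
                zip.closeEntry();
            }
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        for (String name : listEntries(new ByteArrayInputStream(sampleArchive()))) {
            System.out.println(name);
        }
    }
}
```

To try it against a real document, pass a java.io.FileInputStream for the docx file to listEntries instead of the in-memory sample.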

Bookmarks

One way to designate the areas within a docx document that we want to replace with text is through bookmarks. These can be created in a word processor and usually amount to assigning a name to a particular cursor location. If a docx document contains any bookmarks, they appear in "word/document.xml" as XML similar to the following. To insert text at the bookmark location, all we have to do is replace the bookmark tags with an appropriate text tag.

<w:bookmarkStart w:id="0" w:name="date_bookmark"/><w:bookmarkEnd w:id="0"/>

More details regarding the exact XML format can be found here
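The replacement described above can be sketched as a small helper. This is only an illustration under a few assumptions: the names BookmarkReplacer, escapeXml and replaceBookmark are hypothetical, the bookmark is assumed to be empty (the end tag directly follows the start tag), and the attributes are assumed to appear in exactly the order shown above. Unlike a literal string match, the pattern accepts any numeric w:id, and the inserted text is escaped so that characters such as "&" do not break the XML.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BookmarkReplacer {

    // Escapes the characters that cannot appear verbatim in XML text content.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Replaces an empty bookmark with the given name by a run of text.
    // The back-reference \1 requires the end tag to carry the same id as the start tag.
    static String replaceBookmark(String xml, String name, String text) {
        Pattern bookmark = Pattern.compile(
            "<w:bookmarkStart w:id=\"(\\d+)\" w:name=\"" + Pattern.quote(name) + "\"/>"
            + "<w:bookmarkEnd w:id=\"\\1\"/>");
        String run = "<w:r><w:t>" + escapeXml(text) + "</w:t></w:r>";
        // quoteReplacement keeps '$' and '\' in the text from being read as group references.
        return bookmark.matcher(xml).replaceAll(Matcher.quoteReplacement(run));
    }

    public static void main(String[] args) {
        String xml = "<w:p><w:bookmarkStart w:id=\"0\" w:name=\"date_bookmark\"/>"
            + "<w:bookmarkEnd w:id=\"0\"/></w:p>";
        System.out.println(replaceBookmark(xml, "date_bookmark", "Fish & Chips"));
    }
}
```

If a word processor writes the attributes in a different order or adds extra ones, the pattern will simply not match and the XML is returned unchanged.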

Zip Files

Fortunately for us, all the APIs we'll need to read and write Zip files are already available in the standard java.util.zip package and resemble the usual IO stream classes. We can read Zip files with the ZipInputStream class and write them with the ZipOutputStream class. The only new concept when dealing with Zip files is that of an Entry: each Entry within a Zip archive represents a single file.
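To illustrate the stream-like feel of these classes, here is a minimal sketch that reads the content of one named entry. It copies the data through a small buffer until read returns -1, which avoids relying on ZipEntry.getSize() (that method returns -1 when the size is not known). The names EntryReader, readCurrentEntry and readNamedEntry are hypothetical, and the archive in main is built in memory purely so the example needs no files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class EntryReader {

    // Reads the remaining bytes of the entry the stream is currently positioned on.
    static byte[] readCurrentEntry(ZipInputStream zip) throws IOException {
        ByteArrayOutputStream data = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int count;
        while ((count = zip.read(chunk, 0, chunk.length)) != -1) {
            data.write(chunk, 0, count);
        }
        return data.toByteArray();
    }

    // Returns the content of the named entry, or null if the archive has no such entry.
    static byte[] readNamedEntry(byte[] archive, String name) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                if (entry.getName().equals(name)) {
                    return readCurrentEntry(zip);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Build a one-entry archive in memory, then read the entry back.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write("<w:document/>".getBytes("UTF-8"));
            zip.closeEntry();
        }
        byte[] content = readNamedEntry(buffer.toByteArray(), "word/document.xml");
        System.out.println(new String(content, "UTF-8"));
    }
}
```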

Template Document

The template document used in this example consists of a line of plain text, followed by a table, followed by another line of plain text. There are four bookmarks in total. The first bookmark is named "date_bookmark", is placed at the end of the first line of text, and will be replaced with the current date and time. The second bookmark is named "name_bookmark", is placed on the first row of the table, and will be replaced with a name. The third and fourth bookmarks are named "address_bookmark" and "age_bookmark", are placed on the second and third rows of the table, and will be replaced with the address and age of our fictitious character.

The application assumes that the template document will be placed in the default Download folder (in my case it's "sdcard/Download") and will be named "in.docx".

The template document can be found here and an image is provided below.

Main Activity

The main activity is very simple and consists of a text view on which the application displays whether or not processing of the document has succeeded. Processing of the template begins immediately, so there is no interaction with the user. The real work occurs in the "Insert_Text" function (detailed later); the "Read_Entry" function simply supports it by reading a Zip entry into a byte array. We also need to add the "WRITE_EXTERNAL_STORAGE" permission to our "AndroidManifest.xml" file so that the app can read and write the Docx file. The appropriate permission entry is provided below.

<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>

public class DocxActivity
extends android.app.Activity
{
    android.widget.TextView text_view;

    @Override
    public void onCreate(android.os.Bundle savedInstanceState)
    {
        super.onCreate(savedInstanceState);

        // The only UI element: a centred text view that shows the result message.
        text_view = new android.widget.TextView(this);
        text_view.setGravity(android.view.Gravity.CENTER);
        setContentView(text_view);

        this.Insert_Text();
    }

    public void Insert_Text()
    {
        ...
    }

    // Fills in_data with the bytes of the current Zip entry.
    public void Read_Entry(java.util.zip.ZipInputStream in_doc, byte[] in_data)
    throws java.io.IOException
    {
        int bytes_read, offset=0;

        // Stop once the buffer is full or the entry ends. Asking read() for
        // zero bytes returns 0 rather than -1, so testing only for -1 could loop forever.
        while (offset < in_data.length)
        {
            bytes_read = in_doc.read(in_data, offset, in_data.length - offset);
            if (bytes_read == -1)
                break;
            offset += bytes_read;
        }
    }
}

Algorithm

The algorithm is relatively simple: it iterates through the file entries of the Docx Zip file. If the current entry is the "word/document.xml" file we need, its bookmarks are replaced with the appropriate text and the resulting text is written to the output Zip file. Every other entry is copied verbatim.

public void Insert_Text()
{
    java.util.zip.ZipInputStream in_doc;
    java.util.zip.ZipOutputStream out_doc;
    String state, data_str;
    java.io.File download_dir;
    java.util.zip.ZipEntry in_entry, out_entry;
    byte[] in_data, out_data;
    String msg="";
    java.text.DateFormat f;

    f=java.text.DateFormat.getInstance();
    state = android.os.Environment.getExternalStorageState();
    if (android.os.Environment.MEDIA_MOUNTED.equals(state))
    {
        download_dir = android.os.Environment.getExternalStoragePublicDirectory(android.os.Environment.DIRECTORY_DOWNLOADS);
        download_dir.mkdirs();

        try
        {
            in_doc = new java.util.zip.ZipInputStream(
                new java.io.FileInputStream(new java.io.File(download_dir, "in.docx")));
            out_doc = new java.util.zip.ZipOutputStream(
                new java.io.FileOutputStream(new java.io.File(download_dir, "out.docx")));

            for (in_entry = in_doc.getNextEntry(); in_entry != null; in_entry = in_doc.getNextEntry())
            {
                // getSize() returns -1 when the size is not known, so fail with a clear message.
                if (in_entry.getSize() < 0)
                    throw new java.io.IOException("Unknown size for entry " + in_entry.getName());

                in_data = new byte[(int)in_entry.getSize()];
                this.Read_Entry(in_doc, in_data);
                in_doc.closeEntry();

                if (in_entry.getName().equals("word/document.xml"))
                {
                    // Word writes document.xml as UTF-8, so name the charset explicitly.
                    data_str = new String(in_data, "UTF-8");

                    // Swap each empty bookmark for a run containing the replacement text.
                    data_str = data_str.replace(
                        "<w:bookmarkStart w:id=\"0\" w:name=\"date_bookmark\"/><w:bookmarkEnd w:id=\"0\"/>",
                        "<w:r><w:t>"+f.format(new java.util.Date())+"</w:t></w:r>");
                    data_str = data_str.replace(
                        "<w:bookmarkStart w:id=\"1\" w:name=\"name_bookmark\"/><w:bookmarkEnd w:id=\"1\"/>",
                        "<w:r><w:t>Roger Ramjet</w:t></w:r>");
                    data_str = data_str.replace(
                        "<w:bookmarkStart w:id=\"2\" w:name=\"address_bookmark\"/><w:bookmarkEnd w:id=\"2\"/>",
                        "<w:r><w:t>10 Something Street, Sydney, NSW, Australia, Earth</w:t></w:r>");
                    data_str = data_str.replace(
                        "<w:bookmarkStart w:id=\"3\" w:name=\"age_bookmark\"/><w:bookmarkEnd w:id=\"3\"/>",
                        "<w:r><w:t>32</w:t></w:r>");

                    out_data = data_str.getBytes("UTF-8");
                }
                else
                    out_data = in_data;

                // Create the output entry from the name only. Copying in_entry would also copy
                // its size, compressed size and CRC, which no longer match once the content changes.
                out_entry = new java.util.zip.ZipEntry(in_entry.getName());
                out_doc.putNextEntry(out_entry);
                out_doc.write(out_data);
                out_doc.closeEntry();
            }

            in_doc.close();
            out_doc.close();
            msg = "New document successfully created!";
        }
        catch (Exception e)
        {
            msg = e.getMessage();
        }
    }
    else
        msg = "External storage is not available.";

    this.text_view.setText(msg);
}

Output Document

The resulting document should appear in the same Download directory as the template and be named "out.docx". I will assume you already have a suitable application, such as Microsoft Office, with which to view the document. An image of the resulting document is shown below.
