A Microsoft Word, PDF and Lulu issue – STYLEREF and MERGEFORMAT

Nadderwater Rise Ghost needed to be in paperback. Easy as pie, no worries, I’ll just format it as I usually do, slap on the Table of Contents, Copyright and ISBN pages, create a paperback cover as a PDF and head on over to Lulu. What could be simpler? Given a fair wind the paperback will be printed, delivered and approved by the time the digital and audio books are launched.

Boy, I was wrong.

You see, it all went relatively smoothly. The front matter was formatted OK. I got an ISBN and added it into the front (checked it thrice). Did the barcode on the back (checked that thrice) and generally got the book in order. The only difference was that this time I used Microsoft Word rather than LibreOffice. The principle was the same, only the controls and terms were slightly different. No problems: page numbers were still added and the Table of Contents updated correctly, too (checked that thrice, also).

I double checked the boundaries, chapter headings and page numbers, skimmed through for any artefacts and exported to PDF. I uploaded to Lulu. Cover accepted, content accepted. I downloaded the final PDF package and skimmed through that, all good. Hit the publish button, bung a copy into my cart and wait for delivery. I’m fine up to this point.

So what was the issue? I’ll show you:

For those of you playing at home, the header on the last page looks completely mangled. Naturally, I went back to the Word doc and had a look. Nope, there was nothing there but the header and page number. I exported the PDF and it, too, had a simple header and page number.

Well, then, it has to be an issue with Lulu, right? Perhaps when they converted my cover and content PDFs to standard format, something went amiss. I looked closer at their final PDF and, sure enough, the mangle is there. Alright, so was this introduced by me or by Lulu? I shot out an email to their help centre to see if they had any ideas about it. The response that came back showed me that they could definitely see the mangle on their side and that, interestingly, the mangle appeared on the PDF uploaded.

This had me bamboozled. I was looking at the PDF I uploaded, and it was fine. Theirs was not. So it looks like a Lulu issue. However, being a software engineer, I come across a lot of ‘it looks like’ problems and one thing I can tell you is that a ‘looks like’ doesn’t mean ‘is’. OK, so how about I have a look at the words in the mangle. It seems to have four main bits. The heading ‘More by Jeremy Tyrrell’, the page number ‘PAGE157’ and two curious tokens ‘STYLEREF’ and ‘MERGEFORMAT’ with a /* in there.

A quick search on DuckDuckGo shows that these two tokens both belong to Microsoft Word field codes. Alright, now we’re getting somewhere. The PDF wouldn’t care about these fields, I reasoned, but MS Word would, so the problem is most likely NOT on Lulu’s end, but on my end. Back to Word.

For S&Gs I re-exported the PDF. The mangle was gone. I opened up the PDF in various viewers. No dice. I re-uploaded the PDF to Lulu, and it appeared. Then I remembered a small gotcha when exporting PDFs for Lulu: use the option ‘Export as PDF/A’. What’s PDF/A? In a nutshell, it’s a standardised, no-frills, ISO-based PDF version that won’t allow superfluous ‘extras’ in there. Think of it as a ‘just the facts, please’ version. When I exported the PDF as this version, suddenly the mangle appeared in my PDF viewer (which was a relief, because I didn’t want to have to upload to Lulu every time just to see if I fixed it). To be clear, exporting to PDF/A only revealed the issue; it didn’t fix it. My guess is that Lulu’s ingestor also converts incoming PDFs to PDF/A. Rock and roll. The good news was that I had the power to fix the issue and didn’t have to ask for a third party to do it for me.

I deleted the header and re-applied it. I copied the rear-matter again. I tried manually forcing a header on the last page. I revealed all formatting. Nothing worked. I won’t go through all the things I tried to fix it. Just know it was a matter of ‘try something, export it to PDF/A, view it’, rinse, repeat. I’m still not sure what the exact issue is, but the closest I can get is ‘Word inserted something in there when I copied the rear-matter from another book and I couldn’t get rid of it’.
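The ‘view it’ part of that loop could, in principle, be handed to a script instead of eyeballing every page. What follows is only a minimal sketch in Python, not the method used here. It assumes the text of each page has already been pulled out of the PDF/A export with whatever extraction tool is to hand (that step is not shown), and that a leaked field turns up as plain text the way it did in the mangled header. The function name `find_leaked_fields` and the sample strings are made up for illustration.

```python
import re

# Tokens seen in the mangled header; extend the tuple if others turn up.
FIELD_TOKENS = ("STYLEREF", "MERGEFORMAT")
_PATTERN = re.compile("|".join(re.escape(token) for token in FIELD_TOKENS))


def find_leaked_fields(pages):
    """Return (page_number, token) pairs for every field token found.

    `pages` is a list of strings, one per page. Page numbers in the
    result start at 1 so they line up with a PDF viewer.
    """
    hits = []
    for number, text in enumerate(pages, start=1):
        for match in _PATTERN.finditer(text):
            hits.append((number, match.group(0)))
    return hits


if __name__ == "__main__":
    # Hypothetical page text, loosely modelled on the header described above.
    sample = [
        "Chapter One 1",
        "More by Jeremy Tyrrell STYLEREF MERGEFORMAT PAGE157",
    ]
    for page, token in find_leaked_fields(sample):
        print(f"page {page}: {token}")
```

An empty result would suggest the header is clean, though it only covers the tokens listed, so a quick look at the last page is still worth it.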

In the end, I managed to work around the issue by deleting the rear-matter entirely, copying the rear-matter text from another book into Notepad++ (to remove all formatting), then copying that text back in and reapplying the styling. Poof! Now it looks all good.