Stephen Corwin

ZM Financial Systems Online Manual

ZM Financial Systems Inc has several manuals that are in either PDF or Microsoft Word format. Because many of these documents run over 200 pages, editing and styling the content had become difficult. I was tasked with converting two of these manuals into a structured database.

I started by analyzing the documents and noting which types of elements were present: tables, images, lists, and so on. The first document was a PDF. I used the automatic conversion tool that Adobe provides to export the PDF as an HTML document. This gave me access to all of the content, including the styling that was already in place and the basic structure.

We decided to use the SilverStripe content management system to organize our data. I went through the process of setting up a basic SilverStripe site, which included creating data models containing the arrays that SilverStripe uses to create a relational database.

Using Sublime Text, I searched through the exported document and found a few patterns. Each section always began and ended with the same sequences, and there was usually a bookmark ID associated with it as well. I used regular expressions to strip the inline styling that Adobe had automatically added, so that I could style all of the content uniformly later on. Once the content was reduced to text and structure, I developed a PHP script that used regular expressions to separate out the sections, then ran a second PHP script to add those sections to the database.

I then developed several SilverStripe templates that access the data and bring it to the presentation layer. I used Syntactically Awesome Style Sheets (SASS) to produce Cascading Style Sheets (CSS) that replicated the design mockup our graphic designer had produced. Because of the structure of the documents, I used jQuery to build a navigation menu that expands and collapses so the user can move through the manual.
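The style-stripping and section-splitting steps can be sketched roughly as follows. This is an illustrative JavaScript version, not the PHP scripts used in the project. The actual begin and end sequences in the Adobe export are not reproduced in this write-up, so the marker used here (an `<h2>` heading carrying the bookmark ID in its `id` attribute) is an assumption, as are the function names `stripInlineStyles` and `splitSections`.

```javascript
// Illustrative sketch only: the project itself used PHP, and the real
// section markers in the exported HTML are assumed, not known.

// Remove inline style="..." attributes so the content can be restyled uniformly.
function stripInlineStyles(html) {
  return html.replace(/\s+style\s*=\s*("[^"]*"|'[^']*')/gi, "");
}

// Split cleaned HTML into sections.
// Assumption: each section starts with <h2 id="BOOKMARK_ID">Title</h2>
// and runs until the next such heading (or the end of the document).
function splitSections(html) {
  const heading = /<h2\s+id="([^"]+)"[^>]*>(.*?)<\/h2>/gis;
  const matches = [...html.matchAll(heading)];
  return matches.map((m, i) => {
    const bodyStart = m.index + m[0].length;
    const bodyEnd = i + 1 < matches.length ? matches[i + 1].index : html.length;
    return {
      bookmarkId: m[1],
      title: m[2].trim(),
      body: html.slice(bodyStart, bodyEnd).trim(),
    };
  });
}

// Small made-up sample standing in for the exported manual.
const raw =
  '<h2 id="intro" style="color:red">Introduction</h2>' +
  '<p style="font-size:12px">Welcome.</p>' +
  '<h2 id="setup">Setup</h2>' +
  "<p>Install it.</p>";

const sections = splitSections(stripInlineStyles(raw));
console.log(sections);
```

Each resulting object holds a bookmark ID, a title, and a body, which is roughly the shape a second script would need in order to write one database record per section.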
When the user clicked on a chapter, an AJAX request retrieved the chapter data if it had not already been loaded. A method then used jQuery.ScrollTo to scroll through the content to the appropriate section. This produced a seamless effect that was easier on the eyes than simply jumping to the new section. I cached the results of the AJAX requests so that, after the first request, a chapter loaded much faster, and I displayed a loading spinner while a request was in progress.

The client also wanted a search feature. The search covered the entire manual, not just the current content, and highlighted both the matching sections and the words the user had specified. I then repeated the same process for the second document.

The website ended up with a nice feel to it. It was both intuitive and functional, which was exactly what the client wanted. This project was a great opportunity to practice using regular expressions to sift through data, and I can see myself using them often in future projects to increase productivity. It also serves as a reminder that for a product to be great, it doesn't need to be complex; it just needs to serve its purpose to the fullest extent it can.
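The chapter caching mentioned above can be illustrated with a minimal sketch. It is not the project's code: `createChapterLoader` and `fetchChapter` are hypothetical names, and `fetchChapter` stands in for whatever AJAX call returns a chapter's HTML. Passing it in as a parameter lets the cache be run and checked without a browser or server.

```javascript
// Sketch of a cached chapter loader (hypothetical names throughout).
// fetchChapter(chapterId) is assumed to return a Promise for the chapter HTML.
function createChapterLoader(fetchChapter) {
  const cache = new Map(); // chapter id -> Promise of chapter HTML

  return function loadChapter(chapterId) {
    if (!cache.has(chapterId)) {
      // First request for this chapter: start the fetch and remember the Promise.
      const request = Promise.resolve(fetchChapter(chapterId)).catch((err) => {
        cache.delete(chapterId); // do not keep failures, so a later retry can succeed
        throw err;
      });
      cache.set(chapterId, request);
    }
    // Later requests reuse the stored Promise instead of fetching again.
    return cache.get(chapterId);
  };
}

// Usage with a fake fetcher that counts how often it is called.
let calls = 0;
const fakeFetch = (id) => {
  calls += 1;
  return Promise.resolve(`<h2>Chapter ${id}</h2>`);
};
const loadChapter = createChapterLoader(fakeFetch);

loadChapter("3")
  .then(() => loadChapter("3"))
  .then((html) => console.log(html, "fetches:", calls));
```

Storing the Promise rather than the finished HTML means two quick clicks on the same chapter share a single request. A loading spinner would be shown before calling `loadChapter` and hidden once the returned Promise settles.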