Hi friends, thanks for all your support. You are always my encouragement to write new content.
Seeing the response, I decided to take a leap forward and start composing video tutorials.
I am yet to create a handful of them, but I am confident your subscriptions and likes will be my fuel to create
more feature-rich content.
If you feel it is worthwhile, please subscribe to my video channel below.
Generating a PPT from Java using Apache POI is cumbersome. On top of that, creating a PPT from scratch is a pain, and the end result will not look as intended. This is because the Apache POI libraries have limitations and cannot reproduce modern PPT templates one-to-one.
Luckily, there is a way around this. All you need to know is a few concepts about how an Office document is put together, plus a little bit of marshalling/un-marshalling of XML using POI. The advantage is that you don't have to create the entire PPT again, which is time-consuming and not the actual goal.
The actual goal is to add dynamic content to an existing PPT, using it as the base (template).
Our template is a fun fruit chart for person xyz. His/her consumption of fruits per day is displayed in a pie chart.
Note the placeholder texts here and the pie chart. We are going to replace these with actual data in our output PPT.
The secret inside an Office doc
Any Office doc is a collection of media and a bunch of XMLs. So any document you are viewing is nothing but a representation of these XMLs and media! If we get a good grip on how to navigate and correlate the XML tags, then we have the power to make any modification we want.
Rename a PPT (a .pptx file) to .zip and unzip it. You will find the folder structure below.
Our interest is in the ppt folder, so hop into that.
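You don't even need to rename the file to peek inside from code. As a rough sketch using only the JDK's `java.util.zip` (not POI), the following lists the entries under `ppt/`. The class name `PptxEntryLister` and the file name `template.pptx` are made up for illustration, and it assumes a zip-based `.pptx` file.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: lists the parts stored inside a .pptx archive.
class PptxEntryLister {

    /** Returns the names of all entries whose path starts with the given prefix. */
    static List<String> list(Path pptx, String prefix) throws IOException {
        List<String> names = new ArrayList<>();
        // A .pptx is an ordinary zip archive, so ZipFile can open it directly.
        try (ZipFile zip = new ZipFile(pptx.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith(prefix)) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // "template.pptx" is an assumed file name; pass your own as the first argument.
        Path pptx = Path.of(args.length > 0 ? args[0] : "template.pptx");
        for (String name : list(pptx, "ppt/")) {
            System.out.println(name);
        }
    }
}
```

Running it against your own template should print the slide, chart, media and embedding parts that the rest of this post talks about.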
Marshalling and Un-marshalling
Simply put, we need to read the XML into Java objects (un-marshalling) and then stream it back out as XML (marshalling).
So by now I think you have guessed how we are going to go about it: read the XMLs from the template PPT using POI, modify the objects live in the input stream, marshal them back to the output stream, and save the output PPT with a name and date.
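To make the read-modify-write cycle concrete, here is a minimal sketch that uses the JDK's built-in DOM parser and an identity `Transformer` as a stand-in for POI's XML objects. The class name `XmlRoundTrip`, the `<report>` document and the `{title}` placeholder are invented for the demo; the real slide XML is more involved.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical demo class: XML text -> object tree -> XML text.
class XmlRoundTrip {

    /** Un-marshalling: parse XML text into an in-memory DOM tree. */
    static Document read(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // Office XML relies heavily on namespaces
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Marshalling: serialize the (possibly modified) tree back to XML text. */
    static String write(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = read("<report><title>{title}</title></report>");
        // Modify the object in memory, then write it back out.
        doc.getElementsByTagName("title").item(0).setTextContent("Fruit chart");
        System.out.println(write(doc));
    }
}
```

The same three steps (parse, change, serialize) are what we do for every part of the template; only the element names differ.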
XML post-mortem
Let's tear apart the XMLs and see the places we need to update dynamically.
Go to the slides folder; you will find the list of slides there. You can use my example slide to experiment.
Of course there is a lot of other stuff too, which we won't bother with now. All we need to do is replace the placeholders.
Note on Slide Id
Each slide has a Slide Id, which is useful when you have a specific slide to edit or delete. Look into this file to find the correct Id for each slide. Note that the Slide Ids are not necessarily in ascending or descending order.
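If you would rather read the Ids from code, the sketch below may help. It assumes the standard PresentationML layout, where `ppt/presentation.xml` holds a `p:sldIdLst` whose `p:sldId` elements each carry an `id` and an `r:id`; the `r:id` is then resolved to a slide file through `ppt/_rels/presentation.xml.rels`. The class name `SlideIdReader` is hypothetical, and only JDK classes are used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads slide ids from the text of presentation.xml.
class SlideIdReader {

    static final String P_NS =
            "http://schemas.openxmlformats.org/presentationml/2006/main";
    static final String R_NS =
            "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    /** Maps each slide id to its relationship id, keeping the order of the slide list. */
    static Map<Long, String> read(String presentationXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(presentationXml.getBytes(StandardCharsets.UTF_8)));

        Map<Long, String> ids = new LinkedHashMap<>();
        NodeList slides = doc.getElementsByTagNameNS(P_NS, "sldId");
        for (int i = 0; i < slides.getLength(); i++) {
            Element slide = (Element) slides.item(i);
            // "id" is the slide id; the namespaced "r:id" points at the slide part.
            ids.put(Long.parseLong(slide.getAttribute("id")),
                    slide.getAttributeNS(R_NS, "id"));
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<p:presentation xmlns:p='" + P_NS + "' xmlns:r='" + R_NS + "'>"
                + "<p:sldIdLst><p:sldId id='257' r:id='rId3'/>"
                + "<p:sldId id='256' r:id='rId2'/></p:sldIdLst></p:presentation>";
        read(xml).forEach((id, rel) -> System.out.println(id + " -> " + rel));
    }
}
```

A `LinkedHashMap` is used so the result follows the order of the slide list rather than the numeric order of the Ids, since the two can differ.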
Chart objects are a little different. You will find them under
The data inside the chart is referenced from the embedded Excel.
The chart object references this embedded workbook for its data, so we are going to update the Excel first, then recreate the chart object and marshal it back to the slide.
Load all the pointers: the template PPT, the output PPT, the embedded Excel, and any logo image.
Load the data. I have hard-coded it here for the demo; your data source will probably be an existing application or DB.
Let's take care of the placeholder text first. We can traverse down to the depth of the text body as described in the XMLs above. From there, load the child txBody as a string buffer, replace the placeholder, and then marshal it back. Finally, put it back into the original slide object at the same index.
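For readers who want to try the idea without POI, here is a rough JDK-only sketch of the placeholder swap. It assumes the slide text lives in DrawingML `a:t` elements and that each placeholder sits entirely inside one `a:t` (PowerPoint sometimes splits text across several runs, which this sketch does not handle). The class name `PlaceholderReplacer` and the `{name}` placeholder are made up for the demo.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: swaps placeholder strings inside the text runs of a slide.
class PlaceholderReplacer {

    static final String A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main";

    /** Returns the slide XML with every placeholder key replaced by its value. */
    static String replace(String slideXml, Map<String, String> values) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(slideXml.getBytes(StandardCharsets.UTF_8)));

        // Every <a:t> element holds the text of one run.
        NodeList texts = doc.getElementsByTagNameNS(A_NS, "t");
        for (int i = 0; i < texts.getLength(); i++) {
            Node text = texts.item(i);
            String content = text.getTextContent();
            for (Map.Entry<String, String> entry : values.entrySet()) {
                content = content.replace(entry.getKey(), entry.getValue());
            }
            text.setTextContent(content);
        }

        // Serialize the modified tree back to XML text.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String slide = "<a:p xmlns:a='" + A_NS + "'><a:r><a:t>Fruit chart for {name}</a:t></a:r></a:p>";
        System.out.println(replace(slide, Map.of("{name}", "xyz")));
    }
}
```

Working on the parsed tree rather than on the raw string means values containing `<` or `&` are escaped for you when the XML is written back.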
Here we need to load the chart object and replace the pie chart object in the chart XML pictured above.
Note: we can also open the embedded Excel from PowerPoint to have a look at the Excel structure.
Right-click any graph object in the PPT and then click Edit Data.
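Besides pointing at the embedded workbook, the chart XML also keeps a cached copy of the values in `c:numCache`, with one `c:pt` per data point. The sketch below covers only that small piece: it overwrites the cached numbers of the first series using the JDK's DOM API. It does not touch the embedded Excel and is not the POI code used in this post; the class name `ChartCacheUpdater` and the sample values are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: rewrites the cached numbers of the first chart series.
class ChartCacheUpdater {

    static final String C_NS = "http://schemas.openxmlformats.org/drawingml/2006/chart";

    /** Replaces the value of each cached point whose idx is within the values array. */
    static String updateValues(String chartXml, double[] values) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(chartXml.getBytes(StandardCharsets.UTF_8)));

        NodeList caches = doc.getElementsByTagNameNS(C_NS, "numCache");
        if (caches.getLength() == 0) {
            throw new IllegalArgumentException("no c:numCache element found");
        }
        Element cache = (Element) caches.item(0); // first series only
        NodeList points = cache.getElementsByTagNameNS(C_NS, "pt");
        for (int i = 0; i < points.getLength(); i++) {
            Element point = (Element) points.item(i);
            int idx = Integer.parseInt(point.getAttribute("idx"));
            if (idx < values.length) {
                // <c:v> holds the number as text.
                point.getElementsByTagNameNS(C_NS, "v").item(0)
                        .setTextContent(Double.toString(values[idx]));
            }
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String chart = "<c:numCache xmlns:c='" + C_NS + "'><c:ptCount val='2'/>"
                + "<c:pt idx='0'><c:v>1</c:v></c:pt><c:pt idx='1'><c:v>2</c:v></c:pt></c:numCache>";
        System.out.println(updateValues(chart, new double[] {3, 5}));
    }
}
```

This only works when the number of data points stays the same as in the template; adding or removing points would also mean changing `c:ptCount` and the cell ranges in the chart formulas, which the sketch leaves alone.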
We may have a logo or image that we need to add dynamically. For example, different clients may share the same template format while the client logo changes.
We store the picture XML in the same way and dynamically add the picture into it.
We get a generated PPT like this.
Note: it takes about 3-4 seconds to generate one slide. That should be fine if your application offloads report generation to a batch job.
At first look this might sound like a lot of tweaking. But once you get hold of these few components, you can replicate the approach for almost all PowerPoint objects. Post here if you face any issue or have questions. Thanks for reading.
With 11+ years of IT experience and a wide range of skills, from architecting infrastructure on the cloud and building fault-tolerant, scalable microservices in Java, Scala, and Python to orchestrating their deployment through containers and managing their lifecycle through DevOps.