pdf - How to handle non-ASCII characters in Java while using PDPageContentStream/PDDocument


I am using PDFBox to create PDFs for a web application. The web application is built in Java and uses JSF. It takes the content from a web-based form and puts it into a PDF document.

Example: a user fills in an inputTextarea (JSF tag) in the form, and its content is converted to a PDF. I am unable to handle non-ASCII characters.

How should I handle non-ASCII characters, or at least strip them out before putting the text in the PDF? Please give me suggestions or point me to resources. Thanks!

Since you're using JSF on JSP instead of Facelets (which implicitly uses UTF-8), take the following steps to avoid the platform default charset being used (which is often ISO-8859-1, the wrong choice for handling the majority of "non-ASCII" characters):

  1. Add the following line to the top of your JSPs:

    <%@ page pageEncoding="UTF-8" %>

    This sets the response encoding to UTF-8 and sets the charset of the HTTP response Content-Type header to UTF-8. The latter instructs the client (web browser) to display the page and submit its form using UTF-8.

  2. Create a Filter that does the following in its doFilter() method, before passing the request on with chain.doFilter(request, response):

    request.setCharacterEncoding("UTF-8");

    Map it on the FacesServlet as follows:

    <filter-mapping>
        <filter-name>nameOfYourCharacterEncodingFilter</filter-name>
        <servlet-name>nameOfYourFacesServlet</servlet-name>
    </filter-mapping>

    This sets the request encoding of JSF POST requests to UTF-8.
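To see why the two steps matter, here is a small stand-alone sketch. It is not what the servlet container literally runs; it only uses standard JDK classes to imitate the effect. The class name CharsetMismatchDemo and both method names are made up for this illustration. The first method decodes a percent-encoded form value with a chosen charset, which is roughly what the request encoding controls for POST parameters. The second checks whether a charset can represent a given text at all.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class CharsetMismatchDemo {

    private CharsetMismatchDemo() {
    }

    // Decodes a percent-encoded form value using the given charset,
    // similar in spirit to how a container decodes POST parameters
    // once the request character encoding is known.
    static String decodeFormValue(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Returns true if every character of the text can be encoded
    // in the given charset without loss.
    static boolean canRepresent(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }
}
```

A browser that submits the word "cafe" with an accented e (U+00E9) as UTF-8 sends caf%C3%A9. Calling decodeFormValue("caf%C3%A9", "UTF-8") returns the original four characters, while decodeFormValue("caf%C3%A9", "ISO-8859-1") returns five characters, with the two UTF-8 bytes turned into U+00C3 and U+00A9. Likewise, canRepresent returns false for Japanese text with "ISO-8859-1" and true with "UTF-8".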

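This should fix the Unicode problem on the JSF side. I have never used PDFBox, so I can't vouch for that part. Note that PDFBox is an independent Apache library and is not built on iText, so whether a given character ends up in the PDF also depends on the font and encoding used when drawing the text. Let me know if it still doesn't work after doing the above fixes.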
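If stripping the characters is an acceptable fallback, as the question mentions, a minimal sketch using only the JDK could look like the following. The class name AsciiSanitizer and its method names are hypothetical. stripNonAscii simply drops everything outside the 7-bit ASCII range; foldToAscii first decomposes accented letters so that the base letter survives. Both lose information, so treat this as a last resort rather than a fix.

```java
import java.text.Normalizer;

// Hypothetical helper class, for illustration only.
final class AsciiSanitizer {

    private AsciiSanitizer() {
    }

    // Removes every character outside the 7-bit ASCII range.
    static String stripNonAscii(String input) {
        if (input == null) {
            return "";
        }
        return input.replaceAll("[^\\x00-\\x7F]", "");
    }

    // Decomposes accented letters (NFD), drops the combining marks,
    // then removes whatever non-ASCII characters remain.
    static String foldToAscii(String input) {
        if (input == null) {
            return "";
        }
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        return withoutMarks.replaceAll("[^\\x00-\\x7F]", "");
    }
}
```

You would call one of these on the submitted value just before handing the text to PDFBox. For an input with an accented e (U+00E9), stripNonAscii removes the letter entirely, while foldToAscii keeps a plain "e".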
