I need to replace the text of a paragraph in an HWPFDocument (a .doc file) using Java when the paragraph contains a particular string. The text does get replaced, but the output is written in a strange way: the replacement runs straight into the next paragraph's text, which then appears a second time. Please help me fix this. The code I used:

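import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;

public static HWPFDocument processChange(HWPFDocument doc) {
    try {
        Range range = doc.getRange();
        for (int i = 0; i < range.numParagraphs(); i++) {
            Paragraph paragraph = range.getParagraph(i);
            if (paragraph.text().contains("Place Holder")) {
                // Replaces the paragraph's entire text, not just the placeholder
                String text = paragraph.text();
                paragraph.replaceText(text, "*******");
            }
        }
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    return doc;
}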

Input:

Place Holder 
Textvalue1
Textvalue2
Textvalue3

Output:

*******Textvalue1
Textvalue1
Textvalue2
Textvalue3

The HWPF library is not in a good state for changing or writing .doc files, at least not the last time I looked. (Some time ago I developed a custom variant of HWPF for a client which, among many other things, provides correct replace and save operations, but that library is not publicly available.)

If you absolutely must use .doc files and Java, you may get away with replacing strings by strings of exactly the same length, for instance "12345" -> "abc__" (with _ being spaces or whatever works for you). It might make sense to find the absolute location of the string to be replaced in the .doc file (using HWPF) and then change it in the file directly (without using HWPF).
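To illustrate the same-length idea, here is a minimal sketch in plain Java that works on raw bytes rather than through HWPF. The class name `SameLengthReplace` and its two helpers are hypothetical names made up for this example. It also assumes that the text is stored in the file as one contiguous run in a known encoding (UTF-16LE in the demo), which is not guaranteed for a real .doc file, so treat it as a starting point and try it on a copy only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SameLengthReplace {

    /** Pads the replacement with spaces so it has exactly the given length. */
    public static String padToLength(String replacement, int length) {
        if (replacement.length() > length) {
            throw new IllegalArgumentException("replacement is longer than the target");
        }
        StringBuilder sb = new StringBuilder(replacement);
        while (sb.length() < length) {
            sb.append(' ');
        }
        return sb.toString();
    }

    /**
     * Overwrites the first occurrence of target in data, in place, with the
     * padded replacement. Returns the byte offset of the match, or -1 if the
     * target was not found. The array length never changes.
     */
    public static int replaceInPlace(byte[] data, String target,
                                     String replacement, Charset charset) {
        byte[] needle = target.getBytes(charset);
        byte[] patch = padToLength(replacement, target.length()).getBytes(charset);
        if (patch.length != needle.length) {
            throw new IllegalArgumentException("encoded lengths differ");
        }
        for (int i = 0; i + needle.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < needle.length; j++) {
                if (data[i + j] != needle[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                System.arraycopy(patch, 0, data, i, patch.length);
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for file content; a real .doc has binary structures around the text.
        Charset cs = StandardCharsets.UTF_16LE;
        byte[] data = "Place Holder\rTextvalue1\r".getBytes(cs);
        int offset = replaceInPlace(data, "Place Holder", "*******", cs);
        System.out.println("patched at byte offset " + offset);
        System.out.println(new String(data, cs).replace('\r', '\n'));
    }
}
```

Because the byte count stays the same, none of the offsets stored elsewhere in the file need to change, which is the whole point of the trick. A replacement that is longer than the placeholder is rejected rather than truncated.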

The Word file format is very complicated, and "doing it right" is not a trivial task. Unless you are willing to spend many person-months, it will also not be possible to fix just part of the library so that saving alone works. Many data structures must be handled very precisely, and a single slip-up makes Word crash on the generated output file.

Thanks for your valuable reply! Changing the string in the .doc file directly without using HWPF? How would that be possible? Could you please explain a bit more? – Sherin Apr 30, 2015 at 17:20

You need to dive into the HWPF source code. There are two levels of the data model: usermodel and model (both in package org.apache.poi.hwpf). When you have a text run in the "usermodel", you can look at how it references the data in the "model". Eventually you will probably end up at class CHPBinTable. There, look for things having FC in their name, which are already very close to a file location. Then descend down to the POIFS package, which represents the underlying OLE2 data format. You may have to customize HWPF a little to make private classes/methods/fields accessible. – Rainer Schwarze Apr 30, 2015 at 20:11

I'm trying the code change as you suggested. I'd welcome your suggestions if I run into any difficulty. Thanks for your help. – Sherin May 2, 2015 at 5:50
