GetAllText Problem

dc3_dcfl
Posts: 33
Joined: Sat Jan 30, 2010 12:45 am

Post by dc3_dcfl »

I am using the GetAllText function from the RVGetTextW unit to retrieve just the text of a document (from a DBRichViewEdit). In almost all cases this works without problems, but for a few documents the returned string contains nothing but formatting codes, not the text.

I cannot find anything different about the documents themselves. I have also tried RVGetText instead of RVGetTextW, but I get the same result in both cases.
Sergey Tkachenko
Site Admin
Posts: 17559
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko »

Please send me a sample where the function returns a wrong result.
dc3_dcfl

Post by dc3_dcfl »

Sergey,

I was preparing a document to send to you (I cannot send an original because of its sensitive content).

I noticed that when I removed the graphic from one of the documents that produced the error, the error no longer occurred. I thought GetAllText returned only the text of a document, with all formatting and graphics removed.

How do I manually remove the graphics from the document? What I want is a string with nothing more than the text that I can pass through my regular expression parser.
Sergey Tkachenko
Site Admin

Post by Sergey Tkachenko »

Please send an example first.
Graphics in the document should not affect the result.
dc3_dcfl

Post by dc3_dcfl »

I sent an example document to richview{at}gmail{dot}com.

Thanks
Sergey Tkachenko
Site Admin

Post by Sergey Tkachenko »

Received.

The problem is not in GetAllText. The problem is an error that occurs when this document is loaded from the DB record.
When loading a document, TDBRichView first tries to load it as RVF. If that fails, it tries to load it as RTF. If that fails too, it loads it as plain text.

In your case, the record contains an RVF document, but DBRichView fails to read it, so it loads the document as plain text. As a result, DBRichView displays all the RVF codes, and GetAllText returns them.

Why does RVF reading fail? The document contains an image of the TPNGObject class.
To load it, TPNGObject must be registered.
Call RegisterClass(TPNGObject) once, before the first read from the database.
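
One possible way to make sure this happens in every program that reads these records is to put the call in the initialization section of a small shared unit. This is only a sketch: the unit name RvfPngRegistration is hypothetical, and it assumes that TPNGObject is declared in a unit named pngimage in your installation (adjust the uses clause if it lives elsewhere).

```pascal
unit RvfPngRegistration; // hypothetical unit name, used for illustration only

interface

implementation

uses
  Classes,   // declares RegisterClass
  pngimage;  // assumed to declare TPNGObject; adjust to your installation

initialization
  // Runs once at program startup, before any form can read a record,
  // so an RVF document containing a TPNGObject image can be loaded as RVF.
  RegisterClass(TPNGObject);

end.
```

Adding this unit to the uses clause of each project that opens the dataset (the main application and any separate tool that calls GetAllText) keeps the registration in one place.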
dc3_dcfl

Post by dc3_dcfl »

Thank you. I do that in my main application, which loads the document, but not in the regular expression parser I'm now writing to read it.

I neglected to do it there. Now it all makes sense.

Thanks again.