GetAllText Problem
Posted: Mon Nov 08, 2010 11:17 pm
by dc3_dcfl
I am using the GetAllText function from the RVGetTextW unit to retrieve just the text of a document (from a DBRichViewEdit). In almost all cases this works without problems, but for a few documents the returned string contains nothing but formatting codes, not the text.
I cannot find anything different about the documents themselves, and I have tried both RVGetText and RVGetTextW; I get the same result with each.
Posted: Tue Nov 09, 2010 9:54 am
by Sergey Tkachenko
Please send me a sample where the function returns a wrong result.
Posted: Tue Nov 09, 2010 2:36 pm
by dc3_dcfl
Sergey,
I was preparing a document to send to you (I cannot send an original because of the sensitive nature of its content).
I noticed that when I removed the graphic from one of the documents that was producing the error, the error no longer occurred. I thought GetAllText returned only the text of a document, stripping all formatting and graphics.
How do I manually remove the graphics from the document? What I want is a string containing nothing but the text, which I can pass through my regular expression parser.
Posted: Tue Nov 09, 2010 3:48 pm
by Sergey Tkachenko
Please send an example first.
Graphics in the document should not affect the result.
Posted: Tue Nov 09, 2010 4:44 pm
by dc3_dcfl
I sent an example document to richview{at}gmail{dot}com
Thanks
Posted: Tue Nov 09, 2010 5:26 pm
by Sergey Tkachenko
Received.
The problem is not in GetAllText; the problem is an error that occurs when this document is loaded from the DB record.
When loading a document, TDBRichView first tries to load it as RVF. If that fails, it tries to load it as RTF. If that fails too, it loads it as plain text.
In your case, the record contains an RVF document, but DBRichView fails to read it, so it loads the document as plain text. As a result, DBRichView displays all the RVF codes, and they are returned by GetAllText.
Why does RVF reading fail? The document contains an image of the TPNGObject class.
To load it, TPNGObject must be registered.
Call RegisterClass(TPNGObject) once, before the first read from the database.
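To make the failure mode above easier to see, here is a small toy model of the fallback chain, written in Python. It is only a sketch under stated assumptions: the "-RVF" header, the "@image" line syntax, and every function name are hypothetical and have nothing to do with the real RVF format or the TRichView API. It only shows how an unregistered image class can make the first loader fail, so that the raw markup ends up being treated as plain text.

```python
# Toy model only: the file format and all names below are hypothetical,
# not the real RVF format or any TRichView function.

REGISTERED_CLASSES = set()


def register_class(name):
    """Stand-in for registering a graphic class before loading."""
    REGISTERED_CLASSES.add(name)


class UnknownClassError(ValueError):
    """Raised when a document references a class that is not registered."""


def load_rvf(data):
    # Pretend native format: a "-RVF" header line, then text lines;
    # a line "@image <ClassName>" stands for an embedded picture.
    if not data.startswith("-RVF"):
        raise ValueError("not a native document")
    text = []
    for line in data.splitlines()[1:]:
        if line.startswith("@image "):
            cls = line.split()[1]
            if cls not in REGISTERED_CLASSES:
                raise UnknownClassError(cls)
            continue  # pictures contribute no text
        text.append(line)
    return "\n".join(text)


def load_rtf(data):
    # Crude stand-in for an RTF reader: only checks the signature.
    if not data.startswith("{\\rtf"):
        raise ValueError("not RTF")
    return data[len("{\\rtf"):].rstrip("}").strip()


def load_document(data):
    """Try the native format, then RTF, then fall back to plain text."""
    for loader in (load_rvf, load_rtf):
        try:
            return loader(data)
        except ValueError:
            continue
    return data  # plain-text fallback: markup is kept verbatim


doc = "-RVF\nHello\n@image TPNGObject\nworld"

before = load_document(doc)   # class not registered: markup leaks through
register_class("TPNGObject")
after = load_document(doc)    # class registered: only the text remains
```

In this model, `before` is the whole document including the markup lines, while `after` contains only the text. That mirrors the symptom in this thread: the text extraction itself is fine, it is simply handed a document that was loaded through the plain-text fallback.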
Posted: Tue Nov 09, 2010 6:00 pm
by dc3_dcfl
Thank you. I do that in my main application, which loads the document, but not in the regular expression parser I'm now working on to read it.
I neglected to do it there; now it all makes sense.
Thanks again.