
SGMLReader - Convert any HTML to valid XML

    This page's content is now located at https://github.com/MindTouch/SGMLReader.
    The SGML parser does not correctly handle end-of-line characters that are followed by indentation.
    When parsing an OFX message, one of the servers sent a formatted response in which the nested sections were indented with spaces.
    The parser (reasonably) treats end-of-line characters as whitespace; they are appended to the text field and must then be filtered out.
    However, once the parser moves on to the next line, it starts consuming the indenting spaces and appends them to the text field of the item above it!

    For example, instead of the desired XML <VER>1</VER>, I get <VER>1 </VER>.
    The extra spaces are the indentation of the following line.
    This is an error in the way text fields are parsed, and fixing it "properly" would require a bit of work.
    A quick hack is to trim the string.
    In SgmlReader.cs, near the end of the ParseText() function, on or near line 2096, change the code
    string value = this.m_sb.ToString();
    to
    string value = this.m_sb.ToString().Trim();
    Posted 19:29, 26 Jan 2012 (edited 19:30, 26 Jan 2012)
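
To make the effect of the quick hack concrete, here is a minimal, self-contained sketch. It does not use the real SgmlReader code: the class TrimSketch and the method BuildValue are hypothetical stand-ins that mimic how a text buffer might accumulate the field value, the end-of-line characters, and the next line's indentation, as the comment describes.

```csharp
using System;
using System.Text;

public static class TrimSketch
{
    // Hypothetical stand-in for the buffer that ParseText() fills;
    // this is an illustration, not the actual SgmlReader implementation.
    public static string BuildValue(bool trim)
    {
        var sb = new StringBuilder();
        sb.Append("1");      // the actual field text, as in <VER>1
        sb.Append("\r\n");   // end-of-line characters after the value
        sb.Append("    ");   // indentation of the following line
        string value = sb.ToString();
        // String.Trim() removes leading and trailing whitespace,
        // including the line break and the indentation.
        return trim ? value.Trim() : value;
    }

    public static void Main()
    {
        Console.WriteLine("[" + BuildValue(true) + "]");   // prints "[1]"
        Console.WriteLine(BuildValue(false).Length);       // prints "7"
    }
}
```

Note that Trim() also strips leading whitespace and any trailing whitespace that was meant to be kept, so this workaround probably suits data-style formats such as OFX better than HTML with mixed content, where spaces between inline elements can be significant.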

    Copyright © 2011 MindTouch, Inc.