Using C# to convert an invalid HTML string to valid HTML

My original problem is that I am trying to serialize a string containing HTML tags into an XML element.

Serializing HTML to XML: I did not succeed in declaring the Serializable class so that it serializes properly with XmlSerializer, so I decided that using CDATA sections might be the better way. However, this is not correctly deserialized by the target tool (which I have no influence on).

The string looks e.g. as above, but is not fully correct HTML (no <p> tags, no <br> tags). Now I want to replace the newlines with a p or br tag. I have looked here and used the suggested solution:

That is server-side markup that gets rendered to HTML later. You need to programmatically create instances of the controls instead of building them via a string. Alternatively, use real HTML markup such as the table element instead of the asp:Table control.

What you are trying to do is implement a "tag soup parser", which takes text that may or may not be HTML as input and converts it into a valid DOM that an HTML parser can handle.

The XPath selector above will select all a elements that have an href attribute and are child nodes of a div element with a class of 'photoBox'. You can then iterate over this collection and get the href attribute value of each element.
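A minimal sketch of that XPath approach, assuming the HtmlAgilityPack NuGet package is referenced. The selector is reconstructed from the description rather than copied from the original answer, and the `PhotoLinks` class and `GetHrefs` method are hypothetical names used only for illustration:

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class PhotoLinks
{
    // Returns the href values of all <a href> elements that are direct
    // children of a <div class="photoBox">.
    public static List<string> GetHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumed selector: matches only when the class attribute is exactly "photoBox".
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//div[@class='photoBox']/a[@href]");

        var hrefs = new List<string>();
        if (nodes == null)
        {
            return hrefs; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode a in nodes)
        {
            hrefs.Add(a.GetAttributeValue("href", string.Empty));
        }
        return hrefs;
    }
}
```

Note that `@class='photoBox'` compares the whole attribute value, so a div with several classes would need a `contains(...)` test instead.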

The intended result would look like the following (line breaks are only for better readability and don't matter here).

I have no influence on deserialization; the XML output is read by a tool I have no influence on, and it needs to be UTF-8 encoded.

However, this does not produce valid HTML in all cases. In the example above, it would produce <br />s between the <li> tags or put <ul> tags inside <p> tags, neither of which is allowed.

For some reason my code isn't being rendered as a normal table; it is just outputting a block of text instead.

HtmlAgilityPack now supports Linq (since 1.4), so just getting a specific attribute value can be done much more easily (imo) with a Linq query.
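The original snippet is not reproduced here, so the following is only a sketch of what such a Linq query might look like, assuming the HtmlAgilityPack NuGet package is referenced. `LinkQueries` and `GetHrefs` are hypothetical names:

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class LinkQueries
{
    // Collects the href value of every <a> element in the document.
    public static List<string> GetHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("a")                                          // every <a> at any depth
            .Select(a => a.GetAttributeValue("href", string.Empty))    // empty string if missing
            .Where(href => href.Length > 0)                            // drops missing and empty hrefs
            .ToList();
    }
}
```

Unlike `SelectNodes`, `Descendants` returns an empty sequence rather than null when nothing matches, so no null check is needed.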

You don't want to reinvent this wheel, most certainly not with simple string replacements. See "How to parse bad HTML?" for some hints.

The problem is that you had attribute and element selectors mixed up. From your question it's unclear whether you really meant to ask for a collection.

Or you can simply encode the input HTML in such a way that it doesn't interfere with the XML you're trying to put it in; a CDATA section or base64-encoding the input would also suffice. Don't use "entity encoding", as your XML parser is going to complain about HTML entities that aren't XML entities.
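Both options can be sketched with `System.Xml.Linq` from the standard library. This is an illustration under assumptions, not the original answer's code: the `HtmlInXml` class and its method names are hypothetical, and the base64 variant assumes the markup should be stored as UTF-8 bytes.

```csharp
using System;
using System.Text;
using System.Xml.Linq;

public static class HtmlInXml
{
    // Stores the markup verbatim inside a CDATA section, so entities such as
    // &nbsp; are never seen by the XML parser.
    public static XElement AsCData(string elementName, string html) =>
        new XElement(elementName, new XCData(html));

    // Stores the markup as base64 of its UTF-8 bytes; the consumer has to decode it.
    public static XElement AsBase64(string elementName, string html) =>
        new XElement(elementName, Convert.ToBase64String(Encoding.UTF8.GetBytes(html)));

    // Reverses AsBase64.
    public static string FromBase64(XElement element) =>
        Encoding.UTF8.GetString(Convert.FromBase64String(element.Value));
}
```

The CDATA form keeps the HTML readable in the XML file, while the base64 form only works if the tool reading the XML knows to decode it.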

I'm looking for a way to use a constant (string) within HTML <p> tags.
