chevy9294@monero.town to Rust@programming.dev · English · 2 months ago

XHTML 1.0 Transitional parser?

So I’m trying to parse my school’s website for some info, pulling values out with XPath. I found an HTML5 parser, and it can’t properly parse the first line. Then I figured out it’s actually XHTML and not HTML. After a quick Google search I found out XHTML can be parsed properly with any XML parser, so I found one and… it can’t parse the first line either. So I asked Llama 3.1 (like a real programmer) why I can’t parse the first line with any parser. It explained so nicely that I did not destroy my keyboard when I was told that this document is “XHTML 1.0 Transitional”, a mix of HTML 4 and XHTML that can’t be parsed with either an HTML or an XML parser. I hate the guy that invented that so much…

I can’t find a crate to parse XHTML 1.0 Transitional. Is there one, or a crate to convert XHTML to something else? Any advice?


taladar@sh.itjust.works · 2 months ago
Have you tried a tag soup parser? That should work as a last resort even when the parsers that build a tree structure don’t.
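
To illustrate what “tag soup” parsing means here, below is a minimal sketch using only the standard library. It is not taken from any crate: `Token`, `tokenize`, `push_text`, and `push_tag` are hypothetical names made up for this example. It assumes you only need tag names and text, so it throws attributes away, and it assumes no attribute value contains a `>` character. The point is that it never rejects input: an XML declaration or a DOCTYPE on the first line is simply skipped instead of causing an error.

```rust
/// One lexical item; no tree is built, so unbalanced tags are harmless.
#[derive(Debug, PartialEq)]
enum Token {
    Start(String), // tag name, lowercased; attributes are discarded
    End(String),
    Text(String),
}

/// Lenient tokenizer: never fails, skips whatever it cannot classify.
fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut rest = input;
    while !rest.is_empty() {
        match rest.find('<') {
            // No more markup: everything left is text.
            None => {
                push_text(&mut tokens, rest);
                break;
            }
            Some(lt) => {
                push_text(&mut tokens, &rest[..lt]);
                let after = &rest[lt + 1..];
                // Comments may contain '>', so look for the full "-->" terminator.
                let (body, next) = if let Some(comment) = after.strip_prefix("!--") {
                    match comment.find("-->") {
                        Some(end) => ("!", &comment[end + 3..]),
                        None => ("!", ""),
                    }
                } else {
                    match after.find('>') {
                        Some(gt) => (&after[..gt], &after[gt + 1..]),
                        None => (after, ""), // unterminated tag: consume the rest
                    }
                };
                push_tag(&mut tokens, body);
                rest = next;
            }
        }
    }
    tokens
}

/// Emits a Text token unless the run is only whitespace.
fn push_text(tokens: &mut Vec<Token>, raw: &str) {
    let text = raw.trim();
    if !text.is_empty() {
        tokens.push(Token::Text(text.to_string()));
    }
}

/// Classifies the text found between '<' and '>'.
fn push_tag(tokens: &mut Vec<Token>, body: &str) {
    let body = body.trim();
    // Skip <!DOCTYPE ...>, comments, and <?xml ... ?> instead of failing on them.
    if body.starts_with('!') || body.starts_with('?') {
        return;
    }
    let (is_end, body) = match body.strip_prefix('/') {
        Some(stripped) => (true, stripped),
        None => (false, body),
    };
    // The tag name runs up to the first whitespace or '/' (as in <br/>).
    let name = body
        .chars()
        .take_while(|c| !c.is_whitespace() && *c != '/')
        .collect::<String>()
        .to_ascii_lowercase();
    if name.is_empty() {
        return;
    }
    tokens.push(if is_end { Token::End(name) } else { Token::Start(name) });
}
```

Calling `tokenize(&page)` on the downloaded page text gives a flat `Vec<Token>` you can scan for the values you need. You lose XPath this way, since there is no tree to query, which is why it is a last resort.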