Author Topic: How does a web browser work? (Read 3335 times)

Waccoon · « **on:** January 21, 2007, 01:58:45 AM »

In general, how a browser interprets code varies. The whole point of HTML (at least, as it is applied today) is to simply describe content, rather than to present it, so techniques may be wildly different. Almost all browsers these days dynamically re-render the page over and over as more information is downloaded. Further refreshes may occur if JavaScript, DOM, or other dynamic HTML techniques are used.

All browsers have to parse the HTML into a tree, which is constantly expanded and updated as code is parsed. The code is "upscaled" into a more modern or otherwise more fundamental language, such as SGML or XML. Hardly any web browsers use HTML properties internally. Various proprietary techniques are used to fix broken code. I believe most tag attributes are parsed right-to-left.
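To make the "tree that keeps growing as code is parsed" idea concrete, here is a minimal sketch in Python using the standard library's `html.parser`. The `Node`, `TreeBuilder`, and `tags` names are made up for illustration, and no real browser engine works exactly like this; it only shows a tree being extended each time another chunk of markup arrives.

```python
from html.parser import HTMLParser


class Node:
    """One element in the (hypothetical) document tree."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Grows a tree incrementally as chunks of HTML are fed in."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root  # element that new nodes attach to

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this name.
        # A stray end tag that matches nothing is simply ignored.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data


def tags(node):
    """Depth-first list of tag names, handy for inspecting the tree."""
    found = [node.tag]
    for child in node.children:
        found.extend(tags(child))
    return found


if __name__ == "__main__":
    builder = TreeBuilder()
    # Pretend these chunks arrive from the network one at a time.
    for chunk in ["<html><body><h1>Title</h1>", "<p>Second", " paragraph.</p></body></html>"]:
        builder.feed(chunk)
        print(tags(builder.root))  # the tree as it stands after this chunk
    builder.close()
```

A browser could redraw after each `feed()` call, which is roughly why pages get rendered over and over while they download.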

Rendering depends on the engine. They usually start with text, then tables and sections, CSS (if supported), and then objects, such as images. Everything has a priority set to it -- some browsers respect the priorities set by the W3C, some don't. Basically, the stuff that's easiest to draw will be rendered first. This improves responsiveness, at the expense of having to draw the page more often.
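The "everything has a priority" idea can be sketched with a priority queue. The numbers in `PRIORITY` and the `paint_order` function are assumptions invented for this example; they follow the rough order described above (text, then tables and sections, then CSS, then images) and are not taken from any real engine or from the W3C.

```python
import heapq
from itertools import count

# Hypothetical priorities: lower number = drawn earlier.
PRIORITY = {"text": 0, "table": 1, "section": 1, "css": 2, "image": 3}


def paint_order(items):
    """Take (kind, name) pairs in document order, return names in draw order."""
    heap = []
    seq = count()  # tie-breaker so equal priorities keep document order
    for kind, name in items:
        heapq.heappush(heap, (PRIORITY[kind], next(seq), name))
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


if __name__ == "__main__":
    page = [
        ("image", "logo.png"),
        ("text", "intro"),
        ("css", "site.css"),
        ("table", "prices"),
        ("text", "footer"),
    ]
    print(paint_order(page))
```

The cheap stuff comes out of the queue first, so the reader sees something quickly, and the page gets redrawn as the more expensive items are handled.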

Older browsers used to start drawing stuff right away in chunks, usually split at a handful of tags, header tags among them. That allowed people to start reading (sometimes ugly) content while they were waiting for the rest of the page to load.

These days, such rendering techniques cause all kinds of visual havoc, such as the infamous Flash of Unstyled Content. Widths and other layout properties are also rarely described in the HTML source, so browsers tend to download more content before rendering a page.

Waccoon · « **Reply #1 on:** January 22, 2007, 05:16:01 AM »

Only if the person who writes the code doesn't do it properly. You know, graceful degradation, and all that.

The problem is that HTML isn't a standard. It was supposed to be an easy language based on SGML, but broke a huge number of SGML rules, such as the lovely concept that paragraph tags did not have to be closed, which is completely illegal in SGML. HTML was severely crippled early on, but to appease the masses of people who already used it, the standard was changed again and again in an attempt to "fix" it. Standards aren't supposed to change in ways that break backwards compatibility. Major syntax changes tend to do that.
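As a rough illustration of what unclosed paragraph tags mean for a parser, here is a small Python sketch that treats a new `<p>` as implicitly ending any paragraph that is still open. The `ParagraphCollector` class and `paragraphs` function are hypothetical names for this example only, and real browsers apply many more recovery rules than this one.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collects paragraph text, tolerating <p> tags that are never closed."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self.in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A new <p> implicitly ends whatever paragraph was open.
            self.paragraphs.append("")
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self.paragraphs[-1] += data


def paragraphs(html):
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    return [p.strip() for p in collector.paragraphs]


if __name__ == "__main__":
    # Sloppy and tidy markup end up giving the same result.
    print(paragraphs("<p>One<p>Two</p><p>Three"))
    print(paragraphs("<p>One</p><p>Two</p><p>Three</p>"))
```

Every rule like this is extra code the parser has to carry around, which is where the bloat comes from.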

Parsing and rendering HTML is actually very easy. Basic, embedded web browsers may be as small as 300K. But, due to the inconsistency (not all of which is Netscape's and Microsoft's fault), web browsers end up being these 10-20MB behemoths. Today, for example, it is illegal in HTML 4.01 to terminate META tags. That's the complete opposite of what's supposed to happen.
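Here is a small Python sketch of what the META rule means for a parser: elements whose end tag is forbidden in HTML 4.01 must never be left "open", or everything after them ends up nested inside them. The `EMPTY_ELEMENTS` set lists the HTML 4.01 elements declared with a forbidden end tag; the `DepthTracker` class is a made-up name for this example.

```python
from html.parser import HTMLParser

# HTML 4.01 elements whose end tag is forbidden.
EMPTY_ELEMENTS = {
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
}


class DepthTracker(HTMLParser):
    """Records the nesting depth at which each start tag appears."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open elements
        self.seen = []   # (depth, tag) for every start tag

    def handle_starttag(self, tag, attrs):
        self.seen.append((len(self.stack), tag))
        if tag not in EMPTY_ELEMENTS:
            self.stack.append(tag)  # only real containers stay open

    def handle_endtag(self, tag):
        # Ignore end tags that match nothing (including a stray </meta>).
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


if __name__ == "__main__":
    tracker = DepthTracker()
    tracker.feed(
        '<head><meta name="a" content="b"><meta name="c" content="d">'
        '<link rel="x" href="y"></head><body><p>hi</p></body>'
    )
    tracker.close()
    for depth, tag in tracker.seen:
        print("  " * depth + tag)
```

Without the special case, the second META would show up nested inside the first one, and BODY would end up inside HEAD.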

That's the problem with a lot of languages. They get thrown together in a quick-and-dirty fashion, and if they start to become popular, people would rather maintain compatibility with horribly broken code than make a proper standard. That results in a lot of bloat in the applications that have to parse these languages.
