HTML Basics
What is HTML?
HTML stands for Hyper Text Markup Language.
'Hypertext' is text that contains links to other resources. Following a link can trigger an action such as:
- Opening a web page
- Downloading a file to the user's computer
- Playing a video or audio clip
A 'markup language' is an artificial language that gives instructions for how text should be displayed. Using markup, plain text can be formatted with headings, paragraphs, tables, and much more.
All web pages are delivered to your web browser as HTML code (though there may be many other technologies involved, such as CSS, JavaScript, Flash and PHP, to name a few). You can view the HTML code of a web page by right-clicking in the page and choosing 'View Source' (the exact wording varies between browsers).
HTML code consists of plain, unformatted text containing markup tags such as <p> and <h1>. These tags define the structure of the page. The browser uses this structure to format the page on the screen in a way that the user can read and interact with.
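As a rough illustration of what that code looks like, here is a minimal sketch of a page. Only <p> and <h1> are mentioned above; the <html>, <head>, <title> and <body> tags are standard HTML elements added here so that the sketch forms a complete page, and the text content is invented. A real page would normally also begin with a doctype declaration, which is left out to keep the sketch short.

```html
<html>
  <head>
    <!-- The title appears in the browser's title bar or tab, not in the page itself -->
    <title>My first page</title>
  </head>
  <body>
    <!-- Everything inside the body is displayed in the browser window -->
    <h1>Welcome</h1>
    <p>This is a paragraph of plain text marked up with tags.</p>
  </body>
</html>
```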
An HTML file must be saved with a .htm or .html file extension to be viewed in a web browser.
Web authoring programs such as Dreamweaver can save a lot of work writing HTML. However, an HTML file can also be created using a simple text editor like Notepad. There are also lots of free HTML editors. HTML-Kit is a good one, but if you search for 'free html editors' you can find plenty more.
HTML, the internet and the World Wide Web
The internet is the worldwide network of computers that are able to exchange data. The World Wide Web - the collection of web pages that can be accessed through a web browser - is only one of the services that run on the internet. Others include email, instant messaging and peer-to-peer file sharing.
HTML and the World Wide Web were invented between 1989 and 1991 by Tim Berners-Lee while he was working at CERN, the European particle physics laboratory that recently brought us the Large Hadron Collider.
HTML and XHTML
Many, if not most, web pages nowadays are built using XHTML. XHTML stands for eXtensible HyperText Markup Language. It's essentially a stricter version of HTML, which aims to make pages compatible with a wider range of technologies that are being used to access the web.
All tutorials on this site will be using XHTML.
There are some very small differences in syntax between HTML and XHTML, but the main difference is that, in theory, pages written in XHTML will not tolerate sloppy coding. I say 'in theory' because most web browsers are extremely tolerant, and will do their level best to render the most horrendously error-ridden code. Don't let this lull you into thinking that you can get away with any number of mistakes; you will do much better at getting your pages working properly and looking consistent across different browsers if you write clean, valid code.
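To make 'sloppy' and 'clean' concrete, the sketch below shows the same invented fragment written both ways. The specific mistakes shown (uppercase tag names, an unquoted attribute value, missing closing tags and wrongly nested tags) are common examples chosen for illustration, not a complete list of what XHTML forbids.

```html
<!-- Sloppy: most browsers will still display this, but it is not valid XHTML -->
<P title=intro>Some <b><i>important</b></i> text
<P>Another paragraph

<!-- Clean: lowercase tags, quoted attribute value, every tag closed, correct nesting -->
<p title="intro">Some <b><i>important</i></b> text</p>
<p>Another paragraph</p>
```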
HTML tags
HTML tags are used to mark up the different elements of a web page, such as headings, paragraphs and links. All HTML tags are enclosed by the two characters < and >, called angle brackets.
Most tags are written in pairs: the first tag in the pair is called the start (or opening) tag, and the second is called the end (or closing) tag. The content of the element (which can be text and/or other elements) goes between the opening and closing tags. This, for example, defines a paragraph:
<p>A paragraph</p>
Some HTML tags do not have a closing tag because they don't enclose any content. A line break, for example, is created using a single tag, which is closed by adding a space and a / before the closing >, thus:
<br />
(NB. this applies only to XHTML – single tags in HTML 4 close without the space or slash, thus: <br>)
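The two kinds of tag can be combined. In this small sketch (the text and the <em> emphasis tag are illustrative additions, not taken from the examples above), a paired <em> element is nested inside a paragraph and a single <br /> tag splits the paragraph over two lines:

```html
<p>
  This paragraph contains <em>emphasis</em> on one word,<br />
  and then carries on after a line break.
</p>
```

Note that the inner <em> element is closed before the outer <p> element: nested elements are closed in the reverse of the order in which they were opened.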
HTML attributes
Attributes can be added to HTML tags to provide additional information for the element. For example, for the <img> tag, which defines an image, the src attribute tells the browser where the target image is located.
Attributes must be specified in the opening tag of an HTML element, and are written in name/value pairs, with the value enclosed in quotes, like so: name="value".
You can put as many attributes into an element as you like, provided that the attribute is allowed for that element. The attributes can be specified in any order.
For example:
<p align="center" title="My Paragraph">A paragraph</p>
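The <img> tag mentioned earlier works the same way. The sketch below assumes a hypothetical image file called logo.png in an images folder; the alt attribute, which supplies replacement text if the image cannot be displayed, is an addition not covered above. Because <img> encloses no content, it is written as a single tag.

```html
<!-- src says where the image file is; alt gives text to use if the image cannot be shown -->
<img src="images/logo.png" alt="Company logo" />

<!-- The same element with the attributes in the other order; the result is identical -->
<img alt="Company logo" src="images/logo.png" />
```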
Naming conventions
When writing HTML documents:
- all tags and attribute names must be written in lowercase, i.e.
<h1> rather than <H1>
Note: attribute values can be written in uppercase - see the example above. Also, this rule applies only to XHTML.
- all files should be named in lowercase
Windows operating systems are case-insensitive - i.e., files named index.html and InDEx.HTml are seen as identical. However, Unix operating systems - which are used to host most websites - are case-sensitive, so those two files would be seen as different, which can cause problems, such as broken links, when the files are transferred from your local machine to the remote website.
- file names should only contain letters, numbers, and hyphens or underscores – spaces are not allowed. This also applies to any other files included in the page, such as image files.
Again, this prevents problems that can result from inconsistencies in how different operating systems handle unusual characters.
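As a sketch of why this matters, suppose a page links to a file saved as about-us.html and an image saved as team_photo.jpg (both names are hypothetical). References that match the file names exactly work on any server, while the mismatched reference at the end may work on a Windows machine and then break after upload to a case-sensitive server:

```html
<!-- Safe: lowercase names, no spaces, and the references match the files exactly -->
<a href="about-us.html">About us</a>
<img src="images/team_photo.jpg" alt="Our team" />

<!-- Risky: if the file is really named about-us.html, this link breaks on a case-sensitive server -->
<a href="About-Us.html">About us</a>
```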
Useful resources
Before going any further, you should bookmark these pages (I assume you already have this site bookmarked):
- w3schools: A huge collection of free tutorials covering all areas of web authoring. Despite the similar name, the site is independent and is not affiliated with the World Wide Web Consortium (W3C), the international body responsible for web standards.
- HTML tag reference: On the same site, a complete reference to HTML: what each tag is for, where in the page it should go and what attributes it can have.