An Introduction to MIME

Introduction

So, you ponder, what is MIME? MIME is an abbreviation for "Multipurpose Internet Mail Extensions" and describes how message content is labeled and encoded for transmission on the Internet. Whether it's E-Mail or the World Wide Web, MIME is used to keep things in order! Like its human counterpart, MIME is silent, conveys information, and often provides nothing more than entertainment value.

A 7-bit World

Today's computer technology thinks in 8-bit bytes. When information is transmitted these days, it's usually done in an 8-bit fashion. However, there are instances when a transport medium will handle only 7 bits. Furthermore, when it comes to E-Mail, there must be some consideration for systems that are based upon IBM's EBCDIC (Extended Binary Coded Decimal Interchange Code), rather than the ASCII (American Standard Code for Information Interchange) code that we are most familiar with. MIME makes sure that messages meet these criteria!

MIME Headers

MIME is best thought of as nothing more than a simple message that describes the contents that follow. In the World Wide Web, the first thing a server does when answering a request is send out headers, including a MIME header. Using the WWW as an example, a typical MIME header looks like this: "Content-Type: text/html". The server is telling the client that what follows is a text message written in HTML. The browser then knows to display the message in accordance with HTML. The server might have sent a MIME header of "Content-Type: text/plain", in which case the browser would typically render a fixed-font display of the message or document. If the server sent "Content-Type: image/jpeg", the browser would expect to render a JPEG (Joint Photographic Experts Group) image.
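To make the idea concrete, here is a minimal Python sketch of a client reading the Content-Type header and deciding what to do with the body. The HANDLERS table and the describe_handling function are hypothetical names made up for this illustration; a real browser does far more. Only the standard library's email parser is used to read the header.

```python
from email import message_from_string

# Hypothetical mapping from media type to what a client might do with the body.
HANDLERS = {
    "text/html": "render as HTML",
    "text/plain": "display as fixed-font text",
    "image/jpeg": "decode and display a JPEG image",
}

def describe_handling(header_block: str) -> str:
    """Return a description of how a client might treat the body that follows."""
    headers = message_from_string(header_block)
    # get_content_type() returns the lower-cased "type/subtype" value.
    media_type = headers.get_content_type()
    # Assumption: unknown types are simply offered for download.
    return HANDLERS.get(media_type, "offer to save the file")

for header in ("Content-Type: text/html", "Content-Type: image/jpeg"):
    print(header, "->", describe_handling(header + "\n\n"))
```

Note that the parser falls back to text/plain when no Content-Type header is present at all, which matches the default that MIME defines for mail.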

In the case of E-Mail, the same MIME header as discussed above is sent, but usually another one describing the type of encoding used is also sent. This is the "Content-Transfer-Encoding" header. Often, you see headers of: "Content-Transfer-Encoding: 7BIT", "Content-Transfer-Encoding: 8BIT", "Content-Transfer-Encoding: quoted-printable", or "Content-Transfer-Encoding: base64". Another common MIME header that is seen in E-Mail (in fact, it's mandatory for a MIME message) is the "MIME-Version" header, normally: "MIME-Version: 1.0".
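As a rough illustration, the following Python sketch builds a small message with the standard library's email package and prints the three MIME headers discussed here. The addresses and subject are placeholders, and the quoted-printable encoding is requested explicitly for the demonstration; left alone, the library picks an encoding on its own.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"       # placeholder address
msg["To"] = "recipient@example.com"      # placeholder address
msg["Subject"] = "MIME header demo"

# The body contains an 8-bit character, so it cannot travel as plain 7BIT text.
# cte= asks for a specific Content-Transfer-Encoding.
msg.set_content("Caf\u00e9 au lait\n", cte="quoted-printable")

for name in ("MIME-Version", "Content-Type", "Content-Transfer-Encoding"):
    print(f"{name}: {msg[name]}")
```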

MIME Encoding Techniques

There are many ways to encode a document for transmission, but MIME standardizes two such mechanisms in RFC 2045. These are known as "Quoted-Printable" and "Base64". Other possible encoding types are UUENCODE, the Macintosh BinHex 4.0 (RFC 1741), and the Base85 encoding specified in Level 2 PostScript. However, these schemes may have compatibility problems with 7-bit gateways and EBCDIC systems, so the use of these encoding schemes is not recommended.

Quoted-Printable Encoding

As you are probably aware, an 8-bit byte yields 256 possible values (called characters). The first 128 of these make up the US-ASCII character set, and 95 of those (the space through the tilde) are "printable" characters. Quoted-Printable is intended for data that consists mostly of these printable characters, and it encodes the data in such a way that it is unlikely to be modified by the transport facility.

Lines are transmitted in lengths of no more than 76 characters. A longer line is split by ending each piece with a "soft line-break", an = as the last character on the line, which the decoder removes; genuine line breaks in text are sent as CR LF. Decimal characters 128-255 must be "Quoted"; that is to say that each is represented by an = followed by its two-digit hexadecimal value. For example: the code "=FF" would represent character number 255. The = sign itself must also be quoted (as "=3D"). Generally speaking, any character may be quoted.
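The rules above can be tried out with Python's standard quopri module. This is only a sketch: the sample text is made up, and it is deliberately one long line so that soft line-breaks appear in the output.

```python
import quopri

# Character 255 becomes "=FF", and the = sign itself becomes "=3D".
print(quopri.encodestring(b"\xff"))
print(quopri.encodestring(b"1+1=2"))

# One long line (over 76 characters) containing 8-bit characters.
raw = ("r\u00e9sum\u00e9=na\u00efve;" * 8).encode("latin-1")
encoded = quopri.encodestring(raw)
print(encoded.decode("ascii"))

# Decoding drops the soft line-breaks and restores the original bytes.
decoded = quopri.decodestring(encoded)
print(decoded == raw)
```

In the long example, each encoded line that ends in a bare = is a soft line-break and carries no data of its own.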

The Quoted-Printable scheme is efficient for mostly-text data and possesses a high degree of readability even if the encoded version is viewed. Unfortunately, this type of encoding may have trouble passing through EBCDIC-based E-Mail gateways. A high degree of EBCDIC compatibility can be achieved by also quoting the !"#$@[\]^`{|}~ characters.

Base64 Encoding

The most common encoding scheme used is known as Base64, named for its use of 64 printable characters in its alphabet. Actually, there are 65 characters because the = sign is used for "padding". Here's the character set used:

     Value Encoding  Value Encoding  Value Encoding  Value Encoding
         0 A            17 R            34 i            51 z
         1 B            18 S            35 j            52 0
         2 C            19 T            36 k            53 1
         3 D            20 U            37 l            54 2
         4 E            21 V            38 m            55 3
         5 F            22 W            39 n            56 4
         6 G            23 X            40 o            57 5
         7 H            24 Y            41 p            58 6
         8 I            25 Z            42 q            59 7
         9 J            26 a            43 r            60 8
        10 K            27 b            44 s            61 9
        11 L            28 c            45 t            62 +
        12 M            29 d            46 u            63 /
        13 N            30 e            47 v
        14 O            31 f            48 w         (pad) =
        15 P            32 g            49 x
        16 Q            33 h            50 y

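Basically, groups of three 8-bit bytes (24 bits) are encoded into four 6-bit groupings (24 bits), and each grouping is sent as one character from the table. Again, no more than 76 characters are allowed per line. When the data does not end on a three-byte boundary, the final group is filled out to four characters with the = character.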
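For the curious, the grouping can be written out by hand. The Python sketch below is an illustration only (the name b64_encode is made up, and it does not insert the 76-character line breaks); in practice you would simply call the standard base64 module, which is used here to check the result.

```python
import base64
import string

# The 64-character alphabet from the table above: A-Z, a-z, 0-9, +, /
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64_encode(data: bytes) -> str:
    """Encode bytes as unwrapped Base64 text (no line breaks inserted)."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Place the chunk in a 24-bit integer, zero-filling any missing bytes.
        bits = int.from_bytes(chunk.ljust(3, b"\0"), "big")
        # Slice the 24 bits into four 6-bit values, most significant first.
        chars = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A short final chunk keeps len(chunk) + 1 characters; the rest is padding.
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)

for sample in (b"Man", b"Ma", b"M"):
    print(sample, b64_encode(sample), base64.b64encode(sample).decode("ascii"))
```

The three samples show the padding cases: a full group needs no padding, a two-byte tail gets one =, and a one-byte tail gets two.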

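The Base64 alphabet deliberately leaves out the . and - characters, and CR and LF appear only as line breaks, never as data. This is particularly useful for SMTP Mail transport, where a period alone on a line marks the end of a message and hyphens introduce the boundaries between the parts of a multipart message.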

Base64 is fully compliant with EBCDIC systems, as well as 7-bit transport media. The downside is that Base64 encoded files consistently occupy about 33% more space than the original binary source, plus a little extra for the line breaks. For example, a source file that is 300 KBytes in size would be 400 KBytes after Base64 coding, before the line breaks are counted.
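The arithmetic behind that figure is easy to check. The sketch below uses a made-up helper, base64_size, and assumes that every 76-character line (including the last one) ends with a two-character CR LF; other software may count the final line break differently.

```python
import base64
import math

def base64_size(n_bytes: int, line_length: int = 76, newline_len: int = 2) -> tuple[int, int]:
    """Return (characters without line breaks, characters with line breaks)."""
    # Every 3 input bytes (or final partial group) become 4 output characters.
    chars = 4 * math.ceil(n_bytes / 3)
    # Assumption: each line, including the last, is terminated by a line break.
    lines = math.ceil(chars / line_length)
    return chars, chars + lines * newline_len

source = 300 * 1024                      # the 300 KByte file from the example
unwrapped, wrapped = base64_size(source)
print(unwrapped // 1024, "KBytes before line breaks")
print(round(100 * (wrapped - source) / source, 1), "% overhead with CR LF line breaks")

# Cross-check the unwrapped figure against the standard library.
print(len(base64.b64encode(bytes(source))) == unwrapped)
```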

More on MIME Types

The "Content-Type" header indicates a type/subtype of the data to follow. Early on, anything went. For example, the MIME type of "application/msword" was issued to define Microsoft Word documents. Now, formal vendor registrations have the prefix "vnd." prepended to the MIME subtype; hence MIME types of "application/vnd.ms-excel" or "application/vnd.lotus-1-2-3".

Some of the MIME subtypes may start with a prefix of "x-", like "audio/x-pn-realaudio" or "image/x-MS-bmp". The x means that the subtype is experimental or otherwise unregistered.
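The naming conventions are simple enough to test mechanically. Here is a small Python sketch; the function name classify_media_type and the three labels it returns are inventions for this example, not part of any standard.

```python
def classify_media_type(media_type: str) -> dict:
    """Split "type/subtype" and label the subtype by its prefix."""
    # Media type names are case-insensitive, so compare in lower case.
    top, _, sub = media_type.strip().lower().partition("/")
    if sub.startswith("vnd."):
        kind = "vendor"
    elif sub.startswith("x-"):
        kind = "experimental"
    else:
        kind = "unprefixed"
    return {"type": top, "subtype": sub, "kind": kind}

for name in ("application/msword", "application/vnd.ms-excel", "image/x-MS-bmp"):
    print(name, "->", classify_media_type(name))
```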

MIME Types and E-Mail Attachments

MIME types are indicated by the Originator of an E-Mail. Unfortunately, most PC packages have no facility for determining what the proper MIME type should be for any particular file attachment. For that reason, most PC E-Mail packages apply the MIME type of "application/octet-stream" to most E-Mail attachments.

The Recipient of the E-Mail receives the message and will decode the attachment per the "Content-Transfer-Encoding" instruction, usually Base64. Many E-Mail packages, such as Netscape, MS Internet Mail, or Eudora will then hyperlink the attachment. As such, it is up to the Recipient's Operating System to properly interpret the file type (usually by extension) and open the file with the appropriate application. In theory, the E-Mail program should know what file application to run based upon the MIME type, not filename extension.
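Both halves of that exchange can be sketched with Python's standard email package. The subject, filename, and the four bytes standing in for a real file are all made up for the illustration; the point is only to show the generic "application/octet-stream" label going out and the Base64 decoding on arrival.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Originator: attach raw bytes under the catch-all MIME type.
payload = bytes([0x00, 0x01, 0xFE, 0xFF])   # stand-in for a real file's contents
outgoing = EmailMessage()
outgoing["Subject"] = "Report"              # placeholder subject
outgoing.set_content("The report is attached.")
outgoing.add_attachment(
    payload,
    maintype="application",
    subtype="octet-stream",
    filename="report.doc",                  # hypothetical filename
)

# Recipient: parse the transmitted bytes and decode each attachment.
incoming = message_from_bytes(outgoing.as_bytes(), policy=policy.default)
attachments = list(incoming.iter_attachments())
for part in attachments:
    print(part.get_content_type(), part["Content-Transfer-Encoding"], part.get_filename())
    # decode=True undoes the Content-Transfer-Encoding (Base64 here).
    data = part.get_payload(decode=True)
    print(data == payload)
```

Notice that after decoding, the only clue left about what the file really is comes from the filename, which is exactly why the Recipient's Operating System ends up making the decision.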

In the UNIX world, there is a file in each user's home directory, called ".mailcap", that is used by UNIX multimedia E-Mail programs to determine the correct application to run for each MIME type.

MIME Types and the WWW

A WWW Server is remarkably dumb. For the most part, all it does is send out files all day long, and occasionally run a CGI program or some such. It doesn't have the "smarts" to look at a file and figure out what it is. Instead, it relies on a file called mime.types to determine a MIME type based upon the filename's extension.
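A lookup of this kind takes only a few lines. The Python sketch below assumes the common mime.types layout of one media type per line followed by its extensions, with # starting a comment; the sample entries, the function names, and the fallback to "application/octet-stream" are assumptions for illustration, not a copy of any particular server's file or behavior.

```python
import os

# A made-up excerpt in the usual mime.types layout.
SAMPLE_MIME_TYPES = """\
# media type        extensions
text/html           html htm
text/plain          txt
image/jpeg          jpeg jpg jpe
application/msword  doc
"""

def load_mime_types(text: str) -> dict:
    """Build an extension -> media type table from mime.types text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        media_type, *extensions = line.split()
        for ext in extensions:
            table[ext.lower()] = media_type
    return table

def content_type_for(filename: str, table: dict,
                     default: str = "application/octet-stream") -> str:
    """Pick a Content-Type purely from the filename's extension."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return table.get(ext, default)

table = load_mime_types(SAMPLE_MIME_TYPES)
for name in ("index.html", "mime.txt", "photo.JPG", "archive.zip"):
    print(name, "->", content_type_for(name, table))
```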

Clicking on this hyperlink will generate a listing of the mime.types file in use here at TBI (as of 2/7/98). Periodically, we check for newly registered MIME type additions and update our server mime.types file (not this one). Notice that the hyperlink has a .txt extension. This will ensure that the WWW server generates a Content-Type: text/plain header so that your browser will render it in text format.

Even More MIME Stuff!

Keeping track of MIME types can be a fascinating thing, often indicating future things to come. But if you're really into MIME, you'll want to start off reading these materials:

mime.htm, ©1998 All rights reserved
Tampa Bay Interactive, Inc.
Last Revised on: Monday, 25-Oct-2004 19:46:42 EDT