Chapter 1 Introduction to Error-Correcting Codes
1. Motivation
In today's technological society digital communication is present in every aspect of our lives. Satellite data transmissions, network transmissions, computer file transfers, radio communications, and cellular communications transfer data information through a channel that is prone to error. In each case information or data is transmitted using a finite source alphabet encoded with a finite code alphabet. For example, in a black-white satellite image transmission the source alphabet is the set of 256 possible gray scale values. The code alphabet is the radio signals that are transmitted from the satellite to the receiving station through space, the channel of transmission.
2. Brief Historical Introduction
The idea of Error Correcting Codes came with the onslaught of computer technology. In the late 1930s Bell Telephone Laboratories built one of the first mechanical relay computers. This computer is unlike anything currently in use. However, the mechanical relay computer while executing a program was prone to errors like todays computers. In fact, in 1947 Hamming(1) conducted calculations on the mechanical relay computer at Bell Telephone Laboratories and found himself constantly re-running his programs due to computer halts. In an interview Hamming commented:
"Two weekends in a row I came in and found that all my stuff had been dumped and nothing was done...And so I said, "Damn it, if the machine can detect an error, why can't it locate the position of the error and correct it?"[12]
The relay computers operated with self-checking codes only, called two-out-of-six-holes. The machine would detect the error then halt the current job and move to the next. The coding technique used is similar to a repetition code.
"...the input was entered in the machine via a punched paper tape, seven-eights of an inch wide, which had up to six holes per row. Each row was read as a unit. If the sensing relays expected a two-out-of-six code, they would prevent further computation if more or less than two holes appeared in a given row..."[12]
With the two-out-of-six-holes coding technique the relay computer halted two or three times a day. The need for a better code was apparent once the computer became 103 times faster--i.e. the computer would stop two or three hundred times a day. Hamming knew that if a repetition code was to be used the Relay Computer would grow in size as the computer became faster. Therefore a single error-correcting code with minimum redundancy was needed. This code was to be called the (7,4) Hamming Code.
Another key player in the birth of Error-Correcting Codes is Marcel J. E. Golay. Golay was an engineer at Signal Corps Engineering Laboratories at Fort Monmouth. He became interested in Error-Correcting Codes after reading a paper published by Shannon(2) on the (7,4) Hamming Code. This publication[11] sparked Golay's interest:
"I had been thinking about information theory for quite a while, when I was involved with radar. I said, 'What do we want out of radar? After all, we want information.' And I began to acquire the idea that a bit of information cost about one kTloge2. But I didn't come to--I didn't think of--the idea of Hamming, of coding or like that. But when I read the [Shannon] paper, it was the key because I was ripe for these progresses. I had already thought about it."[12]
Shannon's paper gave Golay an entirely new perspective: a geometric perspective of information and coding theory. Golay set out to find perfect single-error-correcting-codes. Codes that have every word associated to a codeword. Therefore these codes correct single errors without any "extra" redundancy. Golay started out to prove first that a perfect code exists. In 1958 Golay stated[4] a necessary condition for the existence of a perfect code:
n = ((pm)k - 1)/(pm - 1)
of length n using pm symbols for some integer k.
Chapter 2 The Original Construction: Golay Code
1. Historical Note
It was not till another five years later that Golay presented a construction of his code in the 1949 paper[2] titled "Notes on Digital Coding." In 1954 Golay published "Binary Coding."[3] In this paper he gave a geometric explanation of the construction of the Golay (23,12) code.
2. "Five Lines and Ten Points..."
"Consider five lines in the plane, no two of which are parallel and no three of which are concurrent."[3] We will label these lines as A, B, C, D, and E with intersections labeled according to which lines are intersecting. We end up with the following diagram:


To construct columns 7 through 12, take a look at the intersections formed by neighboring symbols in a 5-cycle: ABEDC. In column 7 we place a 0 for intersections AB, BE, ED, DE, and CA with 1s elsewhere. "There are 4! cyclical permutation of the five lines...,"[3] pick 11 permutations that differ from our original by an even number of interchanges of symbols. Within this group of 12 permutations only six pairs differ in their order. For example, the pair ABEDC and ACDEB are equivalent since they differ by order only. Hence we have 6 unique 5-cycles: (ABEDC), (ABCED), (ABDCE), (ACEBD), (ACBDE), and (ADCBE) where each permutation describes columns 7, 8, 9, 10, 11, and 12 respectively.


Chapter 3 The Main Coding Idea
1. Introduction by Example
Coding is needed when
data is transmitted and, more
specifically, when the
transmission is subject to error in
the channel. For instance, NASA
scientists needed to determine
how to reliably transmit data
through space, since anything
transmitted through a channel that is millions of miles long is bound to accumulate
errors along the way! One possible solution to this problem is to repeat the
transmission several times so as to increase the probability of receiving the correct
message. This is called a repetition code as described to the right. This type of code
is reliable, but has a disadvantage; it is time consuming.
In 1965 Mariner 4 was the first satellite to take pictures of Mars[6]. Each picture was a 200x200 pixel image with a total of 240,000 bits of information, since there are six bits per pixel. This meant 240,000 bits of information needed to be transmitted approximately 78 million miles! Technology in 1965 allowed data to be transmitted at a rate of only 8 1/3 bits per second. Thus, it took eight hours to send each picture. If any repetition was to be incorporated, each picture would have taken an implausible amount of time.
2. Main Coding Process
It is necessary to become familiar with several keywords and basic concepts to understand error-correcting codes. These include: coding, the Main Coding Idea, block codes, and instantaneous codes.
Coding is defined by two finite sets, A and B, where the coding scheme is an onto function from A onto B. The sets A and B are commonly referred to as the source alphabet and the code alphabet respectively.
The Main Coding
Idea, as illustrated to the
right, describes what
happens to data using
coding techniques in
preparation for data
transmission and decoding
after reception. In every
instance data is first
encoded(3) into a codeword(4). Then the encoded data or codeword is transmitted
through a channel where error is added to the codeword. This models transmission
through a noisy channel. The altered codeword is called a word. Finally, the word
is received, corrected to a codeword, which is in turn decoded for the recovery of
the original information.
There are two main types of coding schemes: block codes and instantaneous codes. Block codes are codings that have codewords of a fixed length, n. For example, the hexadecimal code and ASCII code are both block codes. Instantaneous codes are codings where no codeword is a prefix of any other codeword. For example, Morse code is an instantaneous code since a space is only used at the end of each codeword. We shall examine block codes.
The time problem we encountered with repetition codes gives us motivation to look for better coding methods. In 1947, Hamming noted that the two main requirements of devising a new coding method are to have an error detecting scheme and an error correcting scheme. In the repetition code we have error detection and correction capabilities but only at the expense of efficiency.
Another error detection scheme is called the parity check or check bit. A parity check is the process of counting the number of ones in a binary sequence. The parity check bit is a bit that is added to make the sequence even(5). For example, if we have x1,x2,x3,P , where x1, x2, and x3 are the information symbols and P is the parity check bit:
P is 0, if an even number of x's are ones.
P is 1, if an odd number of x's are ones.
For example:
So, P is 1 if x1 x2 x3=111
So, P is 0 if x1 x2 x3=101
Similarly, we can define a parity check with the following equation modulo 2:
x1 + x2 + x3 P
Solving for P,
x1 + x2 + x3 + P 0
With a parity check we can detect an error by determining whether the equation fails. Note that we are assuming that only one error is made. If more than one error occurs there is no guarantee that an accurate error detection can be made.
Chapter 4 The Construction: Hamming Code
Now let us consider a system of equations having more than one parity check to see if some type of error correction is possible. For example,
x1 + x2 + x3 = P1
x2 + x3 + x4 = P2
x1 + + x3 + x4 = P3
Again, xn's are the information symbols and Pn's are the system of parity checks which satisfy each equation. Arranging the equations so that each is equal to zero we get the following:
x1 + x2 + x3 + + P1 = 0
x2 + x3 + x4 + P2 = 0
x1 + + x3 + x4 + + P3 = 0
Consider the following representation of the above equations:

Each equation will belong to its respective row and each xn and Pn will belong to its respective position number indicated by the dots. Hence positions 1 through 4 are the information symbols and positions 5 through 7 are the parity checks. Looking at these equations as a system of parity checks, the following observations can be made:
1. The 1st row will detect one error at positions 1,2,3,5.
2. The 2nd row will detect one error at positions 2,3,4,6.
3. The 3rd row will detect one error at positions 1,3,4,7.
Due to the interrelationships of this system we will explore the error correction
possibilities. Let us look at two scenarios and determine whether error correction is
possible when a row fails the parity check.
In the first scenario, diagram 4.1, assume
the first row fails(6) the parity check, but the next
two consecutive rows do not. Since we are
assuming that only one error can occur at any one
time we conclude that there exists exactly one
error in the first row. Moreover, the error must
exist in either position 1,2,3, or 5. By looking at
the dots as a system we can make the following
observations. The error could not have occurred in positions 2,3, or 4, since the
second row does not fail. The error cannot be in position 1, since the third row does
not fail. Thus, the error must be in position five, as seen in diagram 4.2. Once we
have located the error we can easily correct it by adding one modulo two.
In the second scenario, diagram
4.3, the first and second rows each contain
an
error since they fail the parity check.
However, the third row does not fail the
check. So, the error must lie in position 2
or 3, since the error could not occur in
positions 5 and 6 or 1 and 4. Clearly if the
error was in position 3, then the third row
would have failed. Thus, in diagram 4.4, the
error must lie in position 2.
These two examples suggest that this method functions as a tool for the detection and correction of errors. To prove this consider the observation that in the above scenarios there is a one to one correspondence between the positions where the parity failed and the positions of the "dots." For example:

The Xs mark the positions in which an error occurred. In addition, the Xs mark the rows which failed the parity check. The possible error configurations--i.e., the possible combinations of the Xs boxes--match with the "error position column"(7) as a bijection. Since there are only seven error configurations, by inspection we prove the bijection. See diagram 4.5 .
This efficient system of parity checks is part of the (7,4) Hamming code.
Instead of only being able to detect an error, a
Hamming code checking scheme, as described
above, is an intricate structure of interrelationships
that allows the detection of the position of the error
and its correction. The diagram to the right
illustrates the interrelationships of our error
correcting scheme where we cover the bases.(8) The numbers represent positions and
the edges represent addition between the two elements. The three triangles
represent the three Pn equations known as the Hamming equations,
x1 + x2 + x3 + = P5
x2 + x3 + x4 = P6
x1 + + x3 + x4 = P7
where xn=Pn for n=5,6,7
An important feature of this code is that it can be represented as a matrix--we will
later discuss this possibility. By replacing the dots
with ones and the blanks with zeros we obtain the
matrix to the right. In diagram 4.6 extra check bits
are added to the information symbols, forming what
is called the parity check matrix or sometimes known
as the check matrix. Note that the dashed vertical
line has no meaning except for illustrating the
boundary of the information symbols and the check
bits.
So far we have demonstrated a natural approach in creating a parity check

The syndrome describes the rows that fail the parity check. Each word we receive must satisfy the set of equations,
x1 + x2 + x3 + = P5
x2 + x3 + x4 = P6
x1 + + x3 + x4= P7
where xn=Pn for n=5,6,7. If the syndrome vector is not the zero vector(10) then the rows in which a one is present are the rows in which the equation fails. The syndrome can be described by the following equations:
x1 + x2 + x3 + + P1 = s1
x2 + x3 + x4 + P2 = s2
x1 + + x3 + x4 + + P3 = s3
where sn is the syndrome vector. For example, let us say we received the word:




Chapter 5 A Closer Scrutiny: Hamming Code
1. Linear Algebra Perspective
From a linear algebra perspective, what is the structure of the (7.4) Hamming Code? We take the set (Z2)n of all ordered n-tuples over Z2, thus forming a vector space. This n-dimensional vector space has scalar values zero and one. Any subset C(12) of this vector space forms a code. But, the code is special. It has the unique property that the sum of any two codewords is another codeword. Thus we get a binary linear code C if and only if it satisfies the property that the sum of any two codewords is a codeword which is a k-dimensional subspace of the n-dimensional vector space. Note that not every code C is linear, and that we have many more non-linear codes than linear ones. There is nothing wrong with non-linear codes, in fact at times non-linear codes are much more efficient, however, we will soon see why we sacrifice efficiency to use linear codes. We have a linear vector subspace with a basis. To describe the subspace we need only k codewords, since linear combinations of these k codewords span the k-dimensional subspace. This is the motivation for working with linear codes instead of non-linear codes. With a linear code we need only list k codewords to determine the rest of the codewords whereas with a non-linear code we need to list all the codewords, that is 2k codewords. This hints at the possibility of a generator matrix existing which is exactly the (7,4) Hamming Code generator matrix: a subspace of (Z2)n over Z2. The subset C, a k-dimensional subspace of an n-dimensional vector space, is called a binary linear code or an (n,k) code. We have now seen the origins of the 7 and the 4 in the (7,4) Hamming Code.
2. Properties of the Hamming Code
The Hamming code we have described is only one of the many Hamming codes in existence. Hamming codes have the following properties,
Length of codewords: n = 2m - 1
Information symbols: k = 2m - m - 1
Minimum distance: d = 3
, where m is a natural number(13): These codes detect and correct a single error. The minimum distance is the smallest difference between any two codewords. For example, if we take any two codewords in the (7,4) Hamming Code, they differ in at least three positions.
In the example above, we randomly pick two codewords and find that the distance
is equal to 4. In the next example, again we randomly pick two codewords and find
that the distance is equal to 3. In fact, three is the minimum distance. Note that the
first codeword is of weight(14) 4 and the second of weight 3. There is no set "formula"
for determining the minimum distance of a code. Since there are only 24 codewords,
by inspection we can determine that the minimum distance is 3 by checking them all.
There is another set of Hamming Codes appropriately called the Extended
Hamming Codes. The Extended Hamming Codes have three Hamming parameters:
Length of codewords: n = 2m
Information symbols: k = 2m - m - 1
Minimum distance: d = 3
The only difference between the Hamming Codes and the Extended Hamming Codes is that the Extended Hamming Codes have an overall parity check. As before, the overall parity check is added to the end of each codeword.
Note that the Extended Hamming Codes are still linear codes, since adding a one or zero at the end of each codeword is a linear process. These codes not only correct one error but detect two errors. In the diagram below we compare The Hamming Code and the Extended Hamming Code.

Before we venture off to look for a code, let us pay more attention to the special properties of the (7,4) Hamming Code(15). The (7,4) Hamming Code is a perfect code because every possible word is associated with a unique codeword and no word is without a codeword. This property results in having a very efficient code. This is how we get the most error correcting capability. The following is a necessary condition for a (n,k) code with minimum distance d=2*t+1 to be perfect.[10]

Now, we could randomly pick some large vector space and look at its subsets that are approximately half the dimension of the original vector space. Although our efforts to find codes would not be in vain, this is nonetheless a very brute force method in approaching the problem, especially if we are in search of particular linear codes. In a more intelligent approach, using the property that the sum of two codewords is another codeword, one would suspect some symmetry involved with the binary linear codes. Let us examine the (7,4) Hamming code for symmetries.
The first obvious symmetry is realized once we pay close attention to the ones and zeros of the codewords. There is a paring of codewords. From every codeword we can get another codeword just by replacing the ones by zeros and the zeros by ones.

Another, yet not so obvious, symmetry is due to the cyclic property of the codewords. If we arrange two columns such that the first column is composed of all the codewords of weight three we begin to demonstrate the cyclic symmetry of the code.

We can form another (7,4) Hamming code, called K, by reversing the order of the bits in the codewords from H. The immediate structures of the codes are congruent to each other yet the codes are disjoint except for 0000000 and 1111111.

Proof: We want to show that no codeword of weight three lies in both H and K, since the other codewords are obtained from these by taking complements--except for 0000000 and 1111111. We shift the codeword cyclically so that it starts with two adjacent ones. If it lies in H the result is 1101000 and if it lies in K the result is 1100010. So it cannot lie in both H and K.[10]
We can also do this for the (8,4) Extended Hamming code:

Chapter 6 The Construction: Golay Code
Using all of the observations we made above we are now ready to make an educated guess on how to construct a code that corrects more than one error. From the discussion on perfect codes we can determine that the next perfect code, if it exists, would be the (23,12) code. By the Extended (8,4) Hamming code we will get the (24,12) code. Stripping the overall parity check on the (24,12) code would result in the (23,12) code. We define extension by augmenting n (8,4) Hamming code codewords together forming an (8*n,4*n) code. It is conceivable that this augmenting method will be successful in forming a binary linear code since we are dealing with linear codes.
It requires a bit of ingenuity to construct the correct sequence in adding extra codewords to form the (24,12) code. The (24,12) code consists of all the words of length 24 in the following form:
a + x, b + x, a + b + x,
where a and b are codewords of H' and x is a codeword of K'.[10] There are 212 possible combinations using the above form, thus there are 212 codewords. This is known as the (24,12) Golay code.
Now that we have this definition of the (24,12) Golay code we would like to form a generator matrix. Since H' and K' are both (8,4) Hamming codes with generator matrices that generate all its codewords, it is conceivable that we can use these generator matrices to determine the generator matrix for the (24,12) Golay code.

We then substitute the columns of GH' for a and b, and the columns of GK' for x in the form of
a + x, b + x, a + b + x,
for each a, b, and x. Thus we get the generator matrix for the (24,12) Golay code.
The goal now is to find the minimum distance to show indeed we have a code that
corrects three errors. This is not an easy task. If we were to find the minimum
distance by inspection, we would have to inspect 212 codewords! As a result, we will
only outline the proof. First we need to prove that the weights of the codewords are
divisible by 4. Then we show that there are no codewords of weight 4 and that there
are codewords of weight 8. Now all that is left is to prove that there exists no
codeword of weight 4. This is done by a proof-by-contradiction. First we assume
a codeword of weight 4. Then we show that every non-zero codeword in H' has
weight greater than or equal to 4 so that in the augmentation process there must be
exactly two codewords that are the zero vector. If this is so, we obtain weights of
0 or 8; this contradicts the first assumption of having codewords of weight 4. This
concludes that the (24,12) Golay code has minimum distance 8.[10]
Now we are left to obtain the (23,12) Golay code from the (24,12) Golay code. This is accomplished by stripping the overall check bits, that is removing any column of the generator matrix in the (24,12) Golay code. So the minimum distance is 7 since stripping a bit will make the codewords have weight 7 at best. Thus we get the (23,12) Golay code(16) with minimum distance 7 that corrects three errors!
Chapter 7 Polynomial Cyclic Codes
1. Historical Note
The constructions we have completed until now strongly suggest the existence of other constructions using finite fields. Golay himself did not know about finite fields: "If Golay knew about finite fields he could have made many more important contributions."[12] It was John Cocke who first introduced constructions of error-correcting codes using finite fields. In 1959 he published an article in the I.R.E.(17) called, "Lossless Symbol Coding with Non-primes."[1]
"There exists a straight-forward method for constructing the coding matrix for realizing all the one-error correction codes whose existence was shown by Zaremba(18). The ability to do this depends upon the fact that there exists finite fields of order pn, where p is a prime, namely the Galois fields. The method is completely constructive since in order to construct the Galois field GF(pn), it is only necessary to find a polynomial of degree n which is irreducible over the field of integers mod p."[1]
With this Cocke set the foundations of modern error-correcting codes. As noted by Cocke, we will be examining a ring of polynomials modulo an irreducible polynomial g(x) over the field Z2 . The polynomial g(x) will be a factor of (xn-1) that is a monic irreducible polynomial of least degree. We will examine a set of codes called Cyclic codes that exclusively use finite fields.
2. Introduction Cyclic Codes
Cyclic codes are an important class of codes. They posses many algebraic properties and they can be easily implemented in technology by means of shift registers(19). To understand Cyclic codes better, let us look at a few key concepts. The word cyclic inadvertently describes the general property of the code. Any codeword can be created from another codeword by shifting the code cyclically. For example, as in our construction of the Golay Code we looked at a cyclic property of the Hamming code: c1 = ( 1 0 0 0 1 0 0 1), with a right shift, c2 = ( 1 1 0 0 0 1 0 0). c2 is another codeword. We can verify this by multiplying c2 by the parity check matrix and examining the syndrome. In addition, the simple binary code {000, 101, 011, 110} is also a cyclic code since any cyclic shift results in the other codewords in the binary code.
To form this family of cyclic codes will we look at code polynomials that describe cyclic codes in a practical fashion. Code polynomials are codes that are formed by polynomials. Our notion of polynomials is the same here, except we will restrict the coefficients to be 0 or 1.
f(x) = a0 + a1x + a2x2 + ... + anxn
Due to this restriction multiplication and addition are carried out modulo 2.
f(x) = a0 + a1x + a2x2 + ... + anxn
k(x) = k0 + k1x + k2x2 + ... + knxn
f(x) + k(x) = (a0 + k0 ) + (a1 + k1 )x + (a2 + k2 )x2 + ... + (an + kn )xn
f(x) = a0 + a1x + a2x2 + ... + anxn
k(x) = k0 + k1x + k2x2 + ... + knxn
f(x)*k(x) = r0 + r1x + r2x2 + ... + rn+mxn+m
where,


From algebra, this adding of structure is nothing more than taking the ring of polynomials modulo some polynomial. We will denote this as F[x]/f(x) such that degree of k(x) F[x] is less than the degree of f(x). How do we mod out polynomials and get a field? First we define the division algorithm for polynomials. Given a(x), b(x) F[x] , b(x) 0, there exist unique polynomials q(x) and r(x) such that a(x) = q(x)*b(x) + r(x) and the degree of r(x) < b(x). For example, consider F[x]. We can divide x3 + x + 1 by x2 + x + 1 :

The following is an example where this structure forms a field: look at addition and multiplication tables of F[x]/(x2 + x +1):
+ | 0 | 1 | x | x+1 | * | 0 | 1 | x | x+1 |
------------------------------------------------------------------------------------------
0 | 0 | 1 | x | x+1 | 0 | 0 | 0 | 0 | 0 |
1 | 1 | 0 | x+1 | x | 1 | 0 | 1 | x | x+1 |
x | x | x+1 | 0 | 1 | x | 0 | x | x+1 | 1 |
x+1 | x+1 | x | 1 | 0 | x+1 | 0 | x+1 | 1 | x |
Now consider the addition and multiplication tables for F[x]/(x2 + 1) :
+ | 0 | 1 | x | x+1 | * | 0 | 1 | x | x+1 |
------------------------------------------------------------------------------------------
0 | 0 | 1 | x | x+1 | 0 | 0 | 0 | 0 | 0 |
1 | 1 | 0 | x+1 | x | 1 | 0 | 1 | x | x+1 |
x | x | x+1 | 0 | 1 | x | 0 | x | 1 | x+1 |
x+1 | x+1 | x | 1 | 0 | x+1 | 0 | x+1 | x+1 | 0 |
Notice that for the multiplication table we do not have an inverse for x + 1 . This implies that F[x]/(x2 + 1) is not a field. Then, what restrictions are needed on f(x) to form a field? The question of reducibility and irreducibility of polynomials is crucial here. Forming a field depends on the irreducibility of a polynomial when moding out with a ring of polynomials. A function f(x) is said to be reducible if f(x) = a(x)b(x), where a(x), b(x) F[x] and the degrees of a(x) and b(x) are less than the degree of f(x).(20) If f(x) is not reducible then it is irreducible. This is a similar analogy to composite and prime numbers. For example, f (x) = 1 + x4 is a reducible polynomial:
1 + x4 = (1 + x ) ( 1 + x + x2 + x3 )
and k (x) = 1 + x + x4 is an irreducible polynomial:
1 + x + x4 = ( 1 + x + x2 ) ( x + x2 ) + 1
3. Cyclic Codes
Now that we have gone through some basic Field Theory, we will demonstrate how the idea of cyclic codes ties into all this. Recall that cyclic codes are codes that can be cyclically constructed from a single codeword. In addition, there is a correspondence between coefficients of these polynomials and codewords: concatenating the coefficients of polynomials forms our codewords. For instance, the codeword 110 corresponds to the polynomial x2 + x. To form new codewords the codeword 110 is cyclically shifted: {101, 011} yielding three codewords. Once the zero codeword is considered, the total number of codewords is increased to 4: {000,101,011,110}, corresponding to polynomials {0,x2 + 1, x + 1 ,x2 + x }. Notice that multiplying a polynomial codeword by x corresponds to a cyclic shift in our field F[x]/(x + 1). This operation gives clues as to what type of fields will be great candidates in constructing cyclic codes. We should consider fields such that multiplying by xn produces a cyclic shift in n positions. Consider the ring F[x] of polynomials in x with coefficients from the field F. F[x]/f(x) is the same ring, except that the entries are taken modulo f(x), where f(x) is a fixed polynomial of F[x]. When f(x) is irreducible over F, this ring becomes a field. Thus if F is GF[p], where p is a prime, and the degree of f(x) is m, F[x]/f(x) is just GF[pm].
Restricting ourselves to f(x) = xn - 1 we have xn 1 (mod(xn - 1)). At the same time we can reduce any polynomial modulo f(x) by replacing xn+0 with x0 = 1, xn+1 with x1 , ..., xn+m with xm , thus simplifying divisions with xn - 1. More generally, take any polynomial g(x) in F[x]/(xn - 1),
g(x) = a0 + a1 x + a2 x2 + ... + an-1xn-1
multiplying g(x) by x,
x * g(x) = x * ( a0 + a1 x + a2 x2 + ... + an-1xn-1 )
= a0 x + a1 x x+ a2 x2 x+ ... + an-1xn-1 x
= an-1 + a0 x + a1 x2 + ... + an-2xn-1
yields a new polynomial that is a "cyclic shift" of the original. It should be clear that an isomorphic correspondence exists here. So a code C in F[x]/(xn - 1) is a cyclic code if and only if C satisfies the following conditions:
i) a(x) , b(x) C then a(x) + b(x) C
ii)a(x) C and g(x) F[x]/(xn - 1) then g(x)*a(x) C
As a result, we call g(x) our generator polynomial since it generates the code C.(21) In addition we can form a check polynomial, h(x)by simply looking at another factor of xn - 1:
xn - 1 = g(x) h(x)
h(x) is said to be a check polynomial since any code word c(x) in C multiplied by h(x) results in 0:
If c(x) is in C then c(x) = a(x) g(x). So, c(x)h(x) = a(x)g(x)h(x) and c(x)h(x) = a(x)*0 = 0.
For example, if c(x)h(x) does not equal to zero then the remaining polynomial s(x) is called our syndrome.
4. Algebraic Perspective
Algebraically speaking what we have done so far is to construct an algebra of polynomials modulo ( xn - 1), looking at factors of (xn -1) and taking an irreducible monic polynomial to form a field. The elements of this algebra are residue classes of polynomials; the ideals of this algebra form cyclic codes. And due to the generating property of the algebra, we have every ideal in the algebra is a principal ideal.[7][8]
5. Generator and Check Matrix
In our previous constructions of the Golay code we had a generator matrix and check matrix. It would be very convenient if similar matrices for polynomial codes could be constructed. Due to the structure of the cyclic polynomial codes a generator matrix and a check matrix can be formed from the generator polynomial and check polynomial respectively. Given a cyclic code C with a generator polynomial g(x) = g0 + g1x + g2x + ... + grxr of degree r the generator matrix for C is

For example consider the generator polynomial g(x) = x + 1 that generates the code {0,x2 + 1, x + 1 ,x2 + x } corresponding to {000,101,011,110}. From above, our generator matrix will look like:



Using f(x) = x7 - 1, we have x7 - 1 =(x - 1)(x3 + x + 1)(x3 + x2 + 1) over GF(2). From our work above the polynomial g(x) = x3 + x2 + 1 generates the cyclic (7,4) code, called the generator polynomial: {x3 g(x) = 1101000,x2 g(x) = 0110100, x g(x) = 0011010, g(x) = 0001101}. In addition, the nullspace of the ideal generated by h(x) = (x -1)(x3 + x + 1) = (x4 + x3 + x2 + 1), called the check polynomial : {x2 h(x) = 1110100, x h(x) = 0111010, h(x) = 0011101}. This produces the generator matrix and check matrix:

7. Construction: Golay Code
Using f(x) = x23 - 1, we have the factors x23 - 1 = (x-1)(x11 + x10 + x6 + x5 + x4 + x2 + 1)(x11 + x9 + x7 + x6 + x5 + x + 1). Our generator polynomial becomes g(x) = (x11 + x10 + x6 + x5 + x4 + x2 + 1). From section 7.5 we form the generator matrix:
Notice that this generator
matrix is identical to the one
we constructed in Chapter 6.
Thus we conclude that G is
the generator matrix for the
(23,12) Golay code by
inspection.
Chapter 8 Perfect Codes
1. Historical Note/Motivation
A property that has a variety of isomorphic relationships to other areas of mathematics is the perfect code, a code that has every word associated to a codeword. This is best illustrated graphically:
The center dots are
codewords where we
have 24 codewords. The
dots surrounding the
center dot are words.
There are 112 words.
The lines represent the
distance between a word
and codeword. Since
each word is associated with only one codeword, the codeword can be thought of
as having some sphere of radius represented by the circles. The idea of minimum
distance comes from this sphere of radius.
Historically, it was Golay who first noticed the existence of Perfect Codes. He saw that the Hamming (7,4) code is a perfect code. The idea of Perfect Codes was an attractive idea to Golay since he had recently gone through an article by Shannon on rate of information. We would like to use as few check symbols as possible when coding information yet have the best error-correction. Thus increasing the rate of information depends how efficiently we code the information under a correction scheme. Golay then set out to find perfect codes.
2. Necessary Conditions
Perfect codes are a very efficient packing of spheres. In 1958 Golay came up with a necessary conditions for the existence of a perfect code:
n = ((pm)k - 1)/(pm - 1) , p is prime and k
where the length of the codeword is n with pm symbols.[4] Then he stated necessary condition for the existence of a perfect binary code:

3. Advances in Perfect Codes
Perfect codes have led to many other advances in different fields of mathematics and vice-versa. Perfect codes have played an important part in the study of Sphere Packings, Steiner systems, and Latin Squares.[7][8][9] One of the more interesting problems that was solved and shown that a 6-ary (7,65,3)-code does not exist, is the 36 officers problem posed in 1782 by Euler. The problem states:
There are 36 officers, one from each of 6 ranks from each of 6 regiments. Can these officers be arranged in a 6x6 square so that every row and every column of the square contains one officer of each rank and one officer of each regiment?[6]
It was not till 1901 when it was proved by Tarry exhaustively that there does not exists such a ranking. But in 1964 Golomb and Posner proved that the 36 officers problem is isomorphic to the 6-ary (7,65,3)-code. Hence the 6-ary (7,65,3)-code does not exist.
Another example of Perfect Code isomorphisms takes us to Sphere Packings. The sphere packing problem is given by:
"In Euclidean n-space, En, how may disjoint, open, congruent n-spheres be located to maximize the fraction of the volume of En that the n-spheres cover."[12]
For example, consider the following diagram of a "good" packing of circles in E2:
Now consider the following diagram of a "better or denser(22)" packing of circles in
E2.
Upper bounds have been determined on packings in different dimensions. It was
John Leech who discovered a connection between Perfect Codes and Sphere
Packings. Leech used the (24,12) Golay code to find a denser sphere packing of E24
than ever known. It is a 79% better packing than previously done.
Chapter 9 Concluding Remarks: "It's all about finite fields."
The (23,12) Golay code is a particularly interesting result in mathematics. This one code has not only a variety of applications but has a variety of different constructions. Other constructions include using Steiner systems, the Game of NIM, Quadratic Residues Chinese Remainder Theorem, and Roots of Polynomials.[7][9][13] With these constructions there seems to be a building of connections between them. In all of these constructions and the constructions of this thesis, the one common feature is that they can all be traced back to finite fields. With the interrelationships of these constructions many questions in different fields of Mathematics may be seen in a different light. As a result, exciting new problems and solutions can be formed.
Bibliography
[1] Cocke, J., Lossless Symbol Coding with Non-primes, I.R.E. (I.E.E.E.) Transactions Information Theory, 5 (1959), 33-34.
[2] Golay, M.J.E., Notes on Digital Coding, Proceedings, I.R.E.(I.E.E.E.), 37 (1949), 657.
[3] Golay, M.J.E., Binary Coding, Transactions I.R.E. (I.E.E.E.), PGIT-4 (1954), 23-28
[4] Golay, M.J.E., Notes on the Penny-Weighing Problems, Lossless Symbol Coding with Non-primes, etc., I.R.E. (I.E.E.E.) Transactions Information Theory, IT-4 (1958), 572.
[5] Hartnett, William E., Foundations of Coding Theory, D. Reidel Publishing Company, (1974)
[6] Hill, Raymond, A First Course in Coding Theory, Clarendon Press (1986)
[7] MacWilliams F. J. and Sloane N. J. A., The Theory of Error-Correcting Codes, North-Holland Publishing Company, (1981).
[8] Peterson, Wesley W. and Weldon, E. J. Jr., Error-Correcting Codes, The Colonial Press, Inc. (1972)
[9] Pless, Vera, Introduction to The Theory of Error-Correcting Codes, John Wiley & Sons, Inc., (1982).
[10] Pretzel, Oliver, Error-Correcting Codes and Finite Fields, Clarendon Press (1992).
[11] Shannon, C. E., A Mathematical Theory of Communication, Bell System Tech. J., 27(1948),379-423,623-656.
[12] Thompson, Thomas M., From Error-Correcting Codes Through Sphere Packings to Simple Groups, The Mathematical Association of America (1983).
[13] van Lint, J. H., Introduction to Coding Theory, Springer-Verlag New York Inc., (1982).
[14] Wiggert, Djimitri, Error-Control Coding and Applications, Artech House, Inc., (1981).
1. Richard W. Hamming. Hamming was a pure Mathematician who had access to the computers at Bell Telephone Laboratories to conduct mathematical computations.
2. Shannon C. E. . Shannon made many important contributions to the Field of Coding Theory.
3. Encoding is the process of coding information.
4. Once information is encoded it is called a codeword.
5. A binary sequence is even if it contains an even number of ones.
6. Xs mark the rows that fails the parity check.
7. The error position column is the column where the error has been detected.
8. Dr. Donald Goldberg. Dean of Faculty and Mathematics Professor at Occidental College.
9. The column of the Xs and circles in the previous diagrams is called the syndrome.
10. The zero vector corresponds to the column of all circles.
11. The message vector is the information we want to encode.
12. Note that C is just the nullspace.
13. m is the redundancy where m = n - k
14. The weight of a word is defined as the number of ones in the word. The smallest nonzero weight is equal to the minimum distance.
15. One of the Hamming code's special properties is that it is a perfect code.
16. The (23,12) Golay code is perfect.
17. Institute of Radio Engineers
18. S. K. Zaremba.
19. Shift registers are mechanical devices that can mimic the cyclic shifting of words..
20. f(x) is reducible if there exists a non-trivial factor
21. Note that g(x) is the monic polynomial of least degree.
22. The density of this packing is the fraction of E2 covered by the circles.