Autosophy Information Theory provides lossless data and video compression based on the data content

Klaus Holtz, Eric Holtz, Diana Kalienky

Omni Dimensional Networks,
Tel. Fax 415 474 - 4860,
Email: holtzk@autosophy.com
631 O’Farrell, Suite 1208, San Francisco, CA 94109 - 7429, USA

ABSTRACT

A new Autosophy information theory provides an alternative to the classical Shannon information theory. Using the new theory in communication networks provides both a high degree of lossless compression and virtually unbreakable encryption codes for network security. The bandwidth in a conventional Shannon communication is determined only by the data volume and the hardware parameters, such as image size; resolution; or frame rates in television. The data content, or what is shown on the screen, is irrelevant. In contrast, the bandwidth in Autosophy communication is determined only by data content, such as novelty and movement in television images. It is the data volume and hardware parameters that become irrelevant.

Basically, the new communication methods use prior “knowledge” of the data, stored in a library, to encode subsequent transmissions. The more “knowledge” stored in the libraries, the higher the potential compression ratio. “Information” is redefined as that which is not already known by the receiver. Everything already known is redundant and need not be re-transmitted. In a perfect communication each transmission code, called a “Tip”, creates a new “engram” of knowledge in the library in which each tip transmission can represent any amount of data. Autosophy theories provide six separate learning modes, or Omni Dimensional Networks, all of which can be used for data compression. The new information theory reveals the theoretical flaws of other data compression methods, including: the Huffman; Ziv Lempel; LZW codes and commercial compression codes such as V.42bis and MPEG-2.

INTRODUCTION

Philosophers, rather than engineers, have long pondered elusive concepts like “knowledge”, “meaning” and “information”. Because these concepts are both subjective and intangible they were thought to be outside of normal scientific research. Until quite recently “knowledge” could not be defined in precise mathematical terms for use in electronic devices. There was to be no place for “meaning” in science, computing or communications.

In 1948 Claude Shannon conformed to the prevailing traditions in “A mathematical theory of communications” by declaring that “information” has no “meaning” or that “meaning” is irrelevant to the engineering problem. He argued that any type of data can be communicated by first converting it into binary digits called “bits”. The binary bits represent “information” equivalent to simple Yes-No answers to hypothetical questions. The only purpose of the communication is to restore the original data pattern in the receiver. Since information “removes uncertainty”, the more bits being transmitted, the more accurate the reconstructed data pattern in the receiver. Any attempt to remove bits in the transmission, through data compression, must “increase uncertainty” and therefore result in inevitable data distortion or loss of resolution. The transmission bandwidth is determined only by the data volume and hardware parameters. The meaning of the data is irrelevant. According to Shannon the “information” contained in this paper, for example, can be measured by counting the number of text characters and multiplying them by an 8 bit ASCII code. Whether this paper contains random number sequences, nothing but blank spaces, or written text is irrelevant.

Shannon’s mechanistic information theory now totally dominates modern data communications and computing.

Programmed data processing computers manipulate “meaningless” data patterns using arithmetic and programming, while communication networks only transmit “meaningless” binary bit streams. And yet the following questions remain:

1) If Shannon’s binary bit streams truly represent “communications”, then why is communication in nature or between humans so entirely different? Normal communications use symbols, sounds, or images which have “meaning”. Other than artificial computer communications, almost no natural communication is done with binary bits or yes-no answers.

2) Isn’t “Meaningless information” a contradiction in terms ?. If a message contains no “meaning” then it is of no value to the receiver and represents only gibberish. Information not only contains “meaning” but is “meaning”. Binary bit streams have no “meaning” to human beings and therefore contain no true information.

3) Why is Shannon’s theory so totally unable to explain the functioning and learning processes of our own brain? Our brain obviously does not function like a programmed data processing computer. It extracts “meaning” from environmental data and accumulates “knowledge”, without requiring any external programming or arithmetic.

4) Why is “lossless” data compression not only possible but already the basis of a huge market? The V.42bis compression standard, for example, is now implemented in virtually all new modems. Data compression utilities are being used in most computers to double or even triple storage capacities in disc drives. Data compression is rapidly invading all forms of communications so that uncompressed data transmissions may soon become uncompetitive.

Enter “Autosophy”, an emerging science initially founded to explain “self-assembling structures”, such as: crystals, living trees, or societies in mathematical terms. Autosophy is a combination of two Greek words “autos” (self) and “sophia” (knowledge or wisdom) which together can be translated as “self-learning” or the understanding of oneself. Since 1974 this research has produced a mathematical theory of “learning” and a new “information theory” which offers an alternative to the classical Shannon information theory. Autosophy re-admits “meaning” and “information entities” back into science, but only as mathematical abstracts and engineering tools. It may open the way towards an entirely different communications technology and eventually to the design of self-learning brain-like “Autosopher”.

Comparing conventional data processing with Autosophy learning is like comparing manufacturing and farming. In a manufacturing plant, products are built from input materials using human or machine labor according to engineering design and planning. Every step in the process is under human control and supervision. Data processing computers similarly produce useful output data by combining input data using arithmetic and programming. Every step in the process is defined by a human programmer. Farming, in contrast, is done by depositing seeds into the soil and waiting for the results. The seeds will attract selected materials from a random environment to assemble themselves into very complex structures, such as crystals or living trees. The process is fully autonomous and does not require any human design or supervision. Similarly, an “autosopher” will absorb selected data from a random environment to grow self-learning data networks in electronic memories, i.e., “data crystals” or “data trees”. There is no need for human supervision or programming.

Self-assembling structures violate important laws of physics, especially the laws of entropy. According to classical physics, complex structures cannot assemble themselves without outside guidance or control. This is because conventional science only allows statistical randomness to explain the behavior of objects. Objects or particles are not supposed to have choices, directions or a will. And yet living trees do assemble themselves from random environmental molecules and more advanced creatures do evolve from more primitive creatures against the laws of entropy. Physics equations usually contain only the three elements: matter, forces and dimensions. Autosophy tries to solve the enigma by introducing “information” as a fourth element in the cosmos. Without “information” (genetic codes), living creatures cannot exist and entropy could only increase in the universe. Adding “information” (not the Shannon type information) to physics, chemistry and biology may result in a revolution in most of these natural sciences. It turns out that there are indeed “information entities” in most physical structures and there really is “meaning” in communications.

SHANNON VS. AUTOSOPHY COMMUNICATIONS

Shannon and Autosophy communications are entirely different, following entirely different natural laws. But, both achieve the same final result, the transmission of data or video in communication networks.

Figure 1. A typical Shannon data and video communication

The choice of which method to use depends on the data type and the application. Shannon communications are normally superior for meaningless or totally random data sequences, while Autosophy communications are highly superior for meaningful data, such as written text, speech and television. Because most communications networks are built to serve human communications needs, most network communications can be greatly improved by using Autosophy methods.

As shown in Fig. 1, a Shannon television transmission depends only on the data volume and hardware parameters. The images are formed by pixels in ordered rows and columns on the screen where each pixel must be constantly re-scanned and described in the transmission. The bandwidth (bits per second) in a digital television system can be calculated as the product of rows, columns, brightness resolution (bits per color), frame rate (frames per second) and colors (red green blue) in color television. All of these values are hardware “quantities” depending on the physical parameter in the system. The image actually shown on the screen is irrelevant, such that a totally random noise pattern would require just as many bits per second as a blank screen. Any increase in image size, resolution or scanning rates would require a much larger transmission bandwidth. Any attempt to remove bits, through data compression, leads to inevitable image distortions or reduced resolution.

In the theoretical system the input data is always regarded as “quantities” even for ASCII text transmissions. The input data is converted using Shannon’s equations into binary digits or bits. The only purpose of the communication is to reproduce the input data pattern in the receiver. According to Shannon’s theory, information “removes uncertainty” so that the more bits transmitted, the more accurate is the data pattern reconstruction in the receiver. Once again, removing bits from the transmission through data compression must “increase uncertainty” and lead to data distortions or loss of resolution.

The only recourse open to the design engineer is to hide the image distortions, caused by image compression, from the human observer. In the MPEG-2 compression standard, for example, 8 by 8 pixel tiles are converted into an 8 by 8 pattern of spatial harmonic brightness sine waves, using the cosine transforms. The most dominant harmonics are then selected by quantization, while the lesser harmonics are discarded. This distorts the images with blurring, visible tile boundaries and introduced image artifacts, such as light or dark lines. The image distortions will increase with the compression ratio until the image quality becomes unacceptable to the user. Distorted images may be acceptable to some degree in television but not for scientific or medical images. The MPEG-2 standard also reduces the frame rate by sending only a single compressed frame per second. The missing frames are simulated by the receiver using motion vectors and interpolation. This produces jerky and blurry motion within the images. Huffman coding is used to further compress the transmission by using shorter codes for the most frequently transmitted codes. This causes error propagation in which any transmission error can cause an image to break up into random noise until a recovery code is detected.

Figure 2. An Autosophy communication

As shown in Fig. 2, true communication is a process of copying knowledge from one autosopher to another autosopher. The ultimate purpose is to “create new knowledge” in the receiver so that “information entities” can multiply and spread themselves through the environment. Transporting data or video through communication networks is a special case involving a transmitter autosopher and a receiver autosopher. Information is communicated with address tokens, called “Tips”, each of which may represent any amount of data. All input data is treated as “Addresses” which, in contrast to Shannon’s quantities, have “meaning”. Both the transmitter and the receiver require a “hyperspace knowledge library” or memory, the construction of which is explained in the next chapters. There are six known different hyperspace library models or “Omni Dimensional Networks”, all of which can be used for data compression or different learning modes.

A “story” in a book is an example of an “information entity” copying itself into a reader’s mind. The story is not dependent on other physical elements: matter, forces or dimensions. However, all information entities require a physical structure or vessel, such as paper and ink in a book. The story has no mass or weight and scrambling the characters in a book does neither increase nor decrease the weight of a book. Once a story is printed in the book it neither consumes nor generates energy. The story does not occupy time or space being both everywhere and nowhere in the book. The story consists of symbols in a progression, such as characters or sentences in written text. During the reading process the story will copy itself into the reader’s mind, in effect creating new knowledge in the reader. Once the story has copied itself into the reader’s mind it will combine with other stories already in the mind to influence his behavior towards a political ideology, a new religious believe or computer expertise. Information entities may reside in books, in physical structures or as genetic codes in living beings. They provide direction and order in the cosmos, ultimately restoring the order lost by entropy. Only information entities evolve towards more complex entities. Physical or living creatures only serve as temporary vessels.

The efficiency and compression ratio of an autosophy communication depend on the shared knowledge in the transmitter and the receiver. The more shared knowledge, the more efficient the communication and the higher the potential compression ratio. Only the amount of “shared” knowledge in the transmitter and receiver is relevant; not the total amount of knowledge in the libraries. Humans, for example, communicate in a different manner with children who have little knowledge, than they do when communicating with experts in a field where both have a lot of shared knowledge. Communications between experts in a field is much more efficient than communication between children. If no shared knowledge exist, for example when communicating in an unknown foreign language, then communication is very difficult. Communication will become progressively more efficient as a common set of symbols is learned.

A communication “Tip”, in order to contain true “information”, must contain something not already known by the receiver, which relates to something already known by the receiver. In a communication every tip transmission could generate a new “engram” of knowledge in the receiver. An “Engram” is a unit or quantum of knowledge which, according to autosophy, is equivalent to a location in omni dimensional hyperspace. For example, telling a person something he already knows does not contain “information”. Telling a person something completely unrelated to already established knowledge is gibberish and again contains no “information”. Gibberish may be caused by unknown symbols, such as words in a foreign language, or concepts too advanced for the receiver. In a perfect communication without repetitions or gibberish each word or sentence would create a new engram of knowledge in the receiver as a minimal extension to that which is already known by the receiver. The learning process may be imagined like the growing of “data crystals” or “data trees” in an electronic memory, one node or one branch at a time.

Figure 3. Comparing Shannon and Autosophy information theories

As shown in Fig. 3, Shannon and Autosophy information theories and communications are based on entirely different concepts and lead to entirely different technologies. In autosophy all input data (such as text character or pixel brightness values) are treated as “Addresses” which have “meaning” rather than meaningless “quantities” in Shannon’s theory. Shannon in effect translates quantities and symbols into binary numbers which are easier to transmit. Addresses are learned in a hyperspace knowledge store or library in which the more knowledge already stored, the higher the data compression. Unlike conventional data processing, arithmetic is seldom used in an autosopher. Transmission is with Tips or address tokens, each of which may represent any amount of data, for example, any number of characters in text or any size portion of the image in television. In a Shannon communication the bit rate depends only on the data volume or hardware, while the data content is irrelevant. In an Autosophy communication the bit rate only depends on the data content, while the data volume or hardware parameters become irrelevant. The purpose of a Shannon communication is to “Remove Uncertainty” in the receiver, while the purpose of an Autosophy communication is to “Create new knowledge” in the receiver.

Shannon data storage is linear and dependent on the bits and bytes in the data record. Autosophy data storage depends only on the meaning of the data, which is stored in Omni Dimensional Hyperspace. The more data already stored in the hyperspace record, the less additional storage is required to store additional data records. One cannot learn what one already knows. Shannon uses equations to express bandwidth and information, while autosophy only uses algorithms. An Autosopher database is educated very much like a human child, a process which requires no programming or data processing. Shannon’s theories will ultimately lead to the programmed data processing “computer”, while the new Autosophy information theory will ultimately lead to a next generation of brain-like self-learning Autosopher.

THE RULES OF LEARNING

“Learning” is defined as the accumulation of “knowledge”, Since communication depends on prior knowledge of the data, the learning process must be defined mathematically to be implemented in electronic autosopher. The learning process may be imagined like “data crystals” or “data trees” growing in an electronic memory, one engram or one node at a time.

1 All input data is treated as “addresses” which define “meaning”. Addresses cannot be successfully used in arithmetic or data processing, so conventional computing is unsuitable in an autosopher. Written text for example consist of characters, words and sentences which have meaning. Using arithmetic to understand written text will not be very successful. Our own brain is an autosopher which extracts meaning from environmental data while hardly ever using arithmetic.

2 Communication “Tips” may represent increasing amounts of data depending on the amount of shared knowledge in the transmitter and receiver. The total amount of knowledge is much less important then the amount of shared knowledge. For example, two very intelligent persons may not be able to communicate if they do not speak the same language. Communications between adults, which have more knowledge, are usually much more efficient then communication between children, which have less knowledge.

3 Communications symbols (Tips) may form progressions in which the same rules may apply on many levels. In written text for example:

serial bit sequences will form ASCII character codes in a fixed length 8 bit code;

serial sequences of character codes terminated by a space character will form words;

serial sequences of words terminated by a full stop will form sentences;

serial sequences of sentences terminated by a new line will form paragraphs;

while serial sequences of paragraphs terminated by a new heading will form sections.

Smaller or lower level symbols assemble in the same manner into higher and higher level symbols. Each string of lower level symbols is terminated by a specific “end of sequence” code or by a fixed string length limit, to form a higher level symbol code. The many levels in the progression may be encoded using the same learning algorithms. All symbol codes on all levels can be interleaved in the same storage device. Since symbol or tip codes are used in the communication, any tip code may represent any amount of data, from single ASCII character to whole sections. There are six different kinds of leaning modes or Omni Dimensional Networks each of which forms a different kind of progression.

4 All symbols (Tips) can be defined like points in omni dimensional hyperspace. “Learning” can be treated mathematically as the defining locations in a space with arbitrarily many dimensions. While this process is very complex it can be defined by simple algorithms. Following the steps in the algorithms will implement learning in electronic autosopher. Each of the six known learning modes defines locations, using a different algorithm, in a different hyperspace model.

5 An “engram” is the smallest unit or quantum of knowledge equal to a defined location in omni dimensional hyperspace. The amount of knowledge stored in an autosopher is equal to the number of nodes in its library. Each node is unique in the library and can be defined only once. This is because one cannot learn what one already knows. Each engram may represent any amount of data, from single ASCII character to whole sections in text, or any sized portion of an image in television. The “meaning” of a symbol is specified by locating its location in omni dimensional hyperspace.

6 A new engram of knowledge is created in an autosopher only if the input information is not already known and relates to already established knowledge so that it represents a minimal extension to what is already known. Communications which are already known do not contain information. Communications which do not relate to established knowledge are gibberish and again contain no information. The bandwidth of an autosophy communication is based on the novelty and movement in the data or television images. Everything which is not new or which does not move contains no information and need not be communicated.

7 Different learning modes are based on different numbering systems. This may be used to discover additional learning modes in the future. Each learning mode defines quantities or addresses in a different way. The six already known learning modes are: serial (text learning), parallel (images, television), associative (database associations), interrelational (language understanding), logical (logical reasoning) and primary (unstructured access). Only the most primitive, the “serial” network, will be explained in the next chapter.

SERIAL SELF-LEARNING AUTOSOPHY TREE NETWORKS

Self-learning Autosophy tree networks may grow from text, speech or images in an electronic memory, like “data trees” or “data crystals”. The process is guided by simple algorithms and does not require programming or outside supervision. The result is true mathematical “learning” that is a mathematical analog to the learning in our own brain.

Figure 4. A serial Autosophy tree network

Fig. 4 illustrates the serial network used in text learning. The same network may also be adapted with slight variations for speech or images. In written text the input data consist of ASCII characters and the “end of sequence” code is a “space” character. In speech the input data consists of analog samples of the voice and the “end of sequence” code is a zero crossing of the analog wave. In images the data consist of a spiral scanning sequences and the “end of sequence” is a fixed length string limit.

The tree network consist of identical nodes where each node represents an engram of knowledge. A GATE represents the new information which in text is an ASCII character. A backward POINTER points to the previous node in the tree which represents the reference to the already established knowledge. The ADDRESS represents an engram as a point in omni dimensional hyperspace. Each ADDRESS is unique and can be defined only once because one cannot learn what one already knows. A combination of GATE - POINTER is called a MATRIX. What is finally stored in the memory is true mathematical knowledge. The network starts growing from a pre-defined SEED node. Several network levels in a progression may be interleaved in the same memory device by reserving a separate SEED node for each level. The first 256 memory addresses may also be reserved as SEEDS for the first string character so that each ASCII character forms its own sub-tree.

SERIAL NETWORK GENERATION ROUTINE

MATRIX: [ GATE ] POINTER ]

Start: Set POINTER = SEED = 0.

Loop: Move the next input character into the GATE.

If the GATE is an “end of sequence” (an ASCII space) then use the POINTER as output code; Goto Start.

Else search the library memory for a matching MATRIX.

If a matching MATRIX is found then move the ADDRESS where it was found to the POINTER; Goto Loop.

Else, if a matching MATRIX is not found, then store the present MATRIX into a next empty memory ADDRESS;

Move the memory ADDRESS where it was stored to the POINTER; Goto Loop.

SERIAL NETWORK RETRIEVAL ROUTINE

MATRIX: [ GATE ] POINTER ]

Start: Move the input code to the POINTER.

Loop: Use the POINTER as a memory ADDRESS to fetch a new MATRIX from the library.

Store, push the new GATE into a First-In-Last-Out (FILO) stack.

If the new POINTER = SEED = 0 then pull the output data from the FILO stack; Goto Start.

Else Goto Loop.

Each operation is started by combining the first input GATE character with the SEED POINTER to obtain a first MATRIX. The memory is then searched to find a matching MATRIX already stored in the library. If a matching MATRIX is found then the memory ADDRESS in which it was found is used as the next POINTER in the MATRIX. If a matching MATRIX is not found, then the current MATRIX is stored into a next empty memory location and the ADDRESS into which it was stored is used as the new POINTER in the MATRIX. The same steps are repeated for each input character in the string or text word. Eventually an “end of sequence” terminates the operation. This “end of sequence” code may be an ASCII “space” character in text or a maximum string length limit. The last node ADDRESS at the “Tip” of the branch defines the entire input string which is used in compressed communications. The receiver, which has an identical tree library, retrieves the string in reverse order by following the POINTER trail back to the SEED node. In a progression of symbols the code outputs from a lower level network are fed as input codes to the next higher level network. All networks in the progression may use the same basic algorithm and share the same memory device using a separate SEED node. In data retrieval each “Tip” address is followed backward to the SEED node of that network level in the progression. The output codes retrieved from a higher level network are fed as input codes to the next lower level network until the final data is retrieved as output from the lowest level network. The “end of sequence” codes for each level must be added to the output codes from each level during retrieval.

THE EVOLUTION OF LOSSLESS DATA COMPRESSION TECHNIQUES

The history of data compression is not a linear evolution but a twisted path followed by several independent researchers. David Huffman in 1952 invented a first lossless data compression method (Patent 3,694,813) based on the relative character frequencies in a communication. Klaus Holtz in 1974 revealed the Huffman code as only a primitive example of an autosophy communication, opening the way to more advanced codes. Tree network data compression based on the new autosophy information theory was first described in a patent application in 1975 by Klaus Holtz (Patent 4,366,551). A similar trail of research was started by Jacob Ziv and Abraham Lempel in 1977 and published just one month after the first Holtz publication. The Ziv-Lempel paper, “A Universal Algorithm for Sequential Data Compression”, describes an algorithm which is now known as the LZ-1 (Lempel Ziv #1) or LZ-77 (published in 1977) code. While Ziv and Lempel sought no patent protection, Douglas Whiting later received important improvement patents (Patent 5,003,307 and 5,016,009). In 1978 Ziv and Lempel published an improved algorithm “Compression of Individual Sequences via Variable Rate Coding”, which is now known as the LZ-2 (Lempel Ziv #2) or the LZ-78 (published in 1978) code. The LZ-78 code builds a self-growing library by adding, for each transmission, a new string to the library that consists of a previous string appended by a new character. This code is rarely implemented because of its ever increasing string length in the library. In 1981 Willard Eastman together with Ziv and Lempel (Patent 4,464,650) built a tree network using address multiplication. In 1984 Terry Welch (Patent 4,558,302) improved on the Eastman code in a paper “A Technique for high-performance Data Compression”. The LZW (Lempel Ziv Welch) code in effect re-discovered a primitive version of the Holtz serial tree network ten years later. Instead of terminating a string by an “end of sequence” code the LZW code terminates a string by a “first not found node”. In 1986 Victor Miller (Patent 4,814,746) improved the code by adding “delayed innovation” and a recycling library. In 1990 British Telecom established the international V.42bis standard for modems. The standard is a combination of many inventions and improvements. Data compression is based on the serial tree networks discovered by Holtz (Patent 4,366,551) and modified by Welch (Patent 4,558,303). Delayed innovation and a limited recycling library is used according to Miller (Patent 4,814,746). A fast tree searching method, originally described by Edward Sussenguth in 1963, is used to accelerate library searching. More advanced data compression methods, including image and video compression, are now being developed by many researchers.

The following will explain the logical evolution of lossless data compression applications, from the most primitive to advanced future implementations, but not necessarily in chronological order. It will point out logical flaws in the older codes that may be used to improve the compression efficiency.

Figure 5. The Huffman code

The Huffman code (Fig. 5) first creates a table showing the relative frequency of each character in the transmission. The least often used character frequencies are then combined to form a new table entry, which is again combined with the lowest character frequencies until each character has an assigned code. The result is a table in which the most often used characters have a shorter code than less often used characters. In a transmission system the ASCII input characters are used as an address to fetch variable bit length codes from the table. The receiver has a much larger table in which most locations are empty while some locations contain an output ASCII character. The table is addressed in a tree fashion, using the transmitted bits as an address. The first input bit is used as the most significant address bit, with the remaining address all at zero. If an ASCII character is stored in the addressed table location then the character is used as output data and the routine is restarted with the next transmitted bit. If the addressed table location is empty, then the next transmitted bit is used as the next lower significant address bit and the table is accessed again to see whether an ASCII character is stored there. Each input bit sequence will eventually lead to a stored ASCII character in the list.

Huffman and Shannon argued that rare or surprising communications contain more “information” than common or expected communications. The Huffman code is a typical but primitive autosophy communication. It uses the input ASCII characters and the transmitted bits as “addresses” rather than binary “quantities” as in a Shannon communication. Prior “knowledge” of the data is required in the form of character frequency tables. If the frequency tables are wrong, because of changing data types, then “data expansion” may occur producing larger output files then the original input files. A single bit error in the transmission will scramble all future transmissions, an effect called “error propagation”. Because the relative character frequency statistics represent only very scant “knowledge” of the data, the compression ratios are modest.

Figure 6. The Ziv Lempel 77 (LZ-77) code

The LZ-77 code (Fig. 6) requires a library, implemented as a character shift register, in both the transmitter and in the receiver. The input data to be transmitted is shifted into the transmitter’s library, while the retrieved output data is shifted into the receiver’s library. The two libraries must remain identical during the entire transmission. In a compressed communication the input strings are compared with the library to find the “longest matching string” in the library. This longest matching string is identified with a START code pointing to the first matching character in the string. A string LENGTH code is added to count the number of matching characters in the string. For example, the word ROSE may still be in the library from a previous transmission. Transmitting the word ROSE would require a string START code pointing at the first character R and a string LENGTH code 4 counting the number of matching characters. The input word ROSE is then shifted into the transmitters library. The receiver uses the transmitted START - LENGTH code to retrieve the word ROSE from its own library and subsequently shift the output word ROSE into its own library so that both libraries remain identical. A shift register library of 4k characters requires a 12 bit string START code. Limiting the string length to 16 characters would require a 4 bit LENGTH code. A 16 bit START - LENGTH code transmission could represent up to 16 text characters or a maximum of 8:1 compression compared to normal 8 bit per ASCII character transmissions. The normal compression ratio is less than 2:1.

Ziv and Lempel based data compression on “complexity” and previously transmitted “data”, instead of previous “knowledge” as in the autosophy theories. This theoretical flaw may explain the inferior performance of the algorithm. While previous “data” may contain “knowledge”, the two concepts are not identical. A shift register library is very inefficient; often storing identical strings in several copies. If each string is stored only once in the library, then many more strings can be stored in a smaller library. This would increase the chances of finding matching strings and lead to higher data compression. Storing “knowledge” in a tree networks also eliminates the need for a string length code. Short data files cannot be compressed because few matching strings are likely to be found in a mostly empty library. Pre-loading the library with a known data pattern before transmission would solve this problem. Pre-loading the library with a secret pattern known only to authorized users results in a virtually unbreakable “codebook” encryption method. If no matching strings are found, then each character must be transmitted as a 9 bit code leading to a slight possible data expansion. A single error in the transmission makes the two libraries different, scrambling all subsequent transmissions through error propagation. In spite of its inferior performance compared with newer algorithms, the LZ-77 code is still the most often used compression algorithm.

Figure 7. The Ziv Lempel Welch (LZW) code

The LZW code, shown in Fig. 7, starts from an empty library where the first 256 nodes are reserved to act as SEED nodes for each possible 8 bit character combination. This in effect creates a separate sub tree for each possible input character. The first input character in the string is used as a POINTER code pointing to the first 256 memory addresses. The next input character GATE is combined with the POINTER forming a MATRIX. The library memory is searched to find a matching MATRIX. If the matching MATRIX is found then the ADDRESS in which it was found is used as POINTER in the next MATRIX. If a matching MATRIX is not found, then the string is terminated and the present MATRIX is stored in a next empty memory ADDRESS. The next input character is used as the first character of the next string. Both the GATE and the POINTER in the present MATRIX are transmitted in the output code. The receiver stores the received code in a next empty memory location in its own library and then retrieve the output data string by following the POINTER trail back to the first 256 nodes.

LZW ENCODING ROUTINE

MATRIX: [ GATE ] POINTER ]

Start: Reserve the first 256 memory locations ( 0 to 255) for the first character SEED.

Loop: Clear the POINTER in the MATRIX.

Loop1: Move the next input character to the GATE.

If the POINTER = clear then move the GATE to the POINTER; Goto Loop1

Else search the library memory to locate a matching MATRIX.

If a matching MATRIX is found then move the memory ADDRESS to the POINTER; Goto Loop1.

Else store the MATRIX into a next empty memory ADDRESS;

Use the present MATRIX as the output transmission code; Goto Loop.

LZW RETRIEVAL ROUTINE

MATRIX: [ GATE ] POINTER ]

Start: Reserve the first 256 memory locations ( 0 to 255) as final SEED nodes.

Loop: Move the received input code into the MATRIX.

Store the present MATRIX into a next empty memory ADDRESS.

Push the GATE into a First-In-Last-Out (FILO) stack.

Loop1: Use the POINTER as a memory ADDRESS to fetch a new MATRIX from the library.

Push the new GATE into the FILO stack.

If the new POINTER is less than 256 then push the new POINTER into the FILO stack;

Retrieve, pull the output data from the FILO stack; Goto Loop.

Else Goto Loop1.

The LZW code is a serial autosophy tree network in which each string is terminated by the first “not found node” instead of an “end of sequence” code. Strings or text words are chopped into random fragments leading to a disorganized and inefficient “string chop suey” library. Each transmission creates a new node in both libraries. The libraries will keep on growing during a transmission until they becomes unmanageable. A solution is to clear the library at a preset limit and grow a new library. Short or random bit files cannot be compressed because few matching strings are found in a mostly empty or random library. A pre-loaded library may be used for data encryption. A single transmission error can scramble the remaining transmissions through error propagation. Overall the LZW code performs only slightly better than the LZ-77 code. The original code had to be improved before it became commercially viable.

Figure 8. The V.42bis compression standard using delayed innovation

The international V.42bis compression standard (Fig. 8) improves the LZW code through “delayed innovation” and a limited recycling library memory. Only the POINTER code is transmitted, while the next or un-matched character is used to start the next string. The transmitter stores the not found node in a next empty memory ADDRESS and then transmit the POINTER only. The receiver retrieves the string from the POINTER code and remembers the POINTER in a buffer. The next POINTER transmission contains as its first character the missing GATE, which is combined with the POINTER in the buffer to create a new node in the receiver library. A rare problem exists because the transmitter adds a library node one transmission before the receiver can create the same node. If the transmitter uses the new node in encoding the next string, then the algorithm can crash because the receiver does not yet has that node. The V.42bis standard has precautionary steps to avoid that problem. Instead of an endlessly growing library, the V.42bis standard has a fixed length library in which the “least recently used” node address is cleared and recycled to store the next node. A few addresses above the first 256 nodes are used as special command messages rather than data codes. The ETM (256) code switches the compression engine off. The compression efficiency is continuously monitored by an embedded microprocessor. If the compression ratio becomes too low, then the compression engine is switched off and replaced by clear 8 bit character transmissions. The compression engine may be switched back on later when the data becomes compressible again. This avoids any possible data expansion. The FLUSH (257) code is used to clear the entire library to allow a new library to grow from the following transmissions. The STEPUP (258) command increases the POINTER bit length by one bit. Starting with a 9 bit POINTER code (node addresses 259 to 511), the bit length is increased whenever the library doubles at 512, 1k, 2k and 4k. The V.42bis standard stuffs either POINTER codes or 8 bit character codes into packets for transmissions. The packets contain error checking codes and defective packets are re-transmitted until they are received without errors. This avoids the problem of error propagation.

Figure 9. Sussenguth tree searching in V.42bis

The V.42bis standard uses a forward pointing tree searching algorithm (shown in Fig.9) to increase encoding speed. Instead of having to search the entire library memory, this algorithm limits the search to a maximum of 256 steps. Both the Sussenguth and the LZW trees are combined in the same node. While the Sussenguth tree accelerates encoding speed the LZW tree accelerates retrieval speed. The GATE field contains the character which is shared by both trees. A DEPENDENT forward pointer points to the next node ADDRESS if the input character matches the GATE code. If the input character does not match the GATE code, then the SIBLING pointer is followed to the next node ADDRESS. The algorithm would follow the SIBLING pointer trail until a matching GATE is found. If no matching GATE is found at the end of the trail, indicated by an empty SIBLING pointer, then a new node is created at a next empty or recycled memory ADDRESS. The previous node SIBLING pointer is loaded with the new node ADDRESS to cut and relink the trail.

The V.42bis standard is a state of the art algorithm, implemented in virtually all new modems, and which can compress most data to some extend. It avoids most data expansion and eliminates the danger of error propagation by re-transmitting defective packets. The algorithm is usually implemented by embedded microprocessors. The library size, packet size and baud rate are negotiated by the modems before each transmission. The Sussenguth search tree allows data transmission speeds sufficient for telephone modems and ISDN though not much beyond that. Truly high speed transmissions via local or remote area networks (including fiber optic transmissions and real time file compression in a computer) are not possible with today’s microprocessors. Encrypted data cannot be compressed and no provisions for data encryption are provided in the standard. Because of the inefficient “string chop suey” library in the LZW code the compression performance remains mediocre.

Figure 10. Fixed tree library compression

A fixed library system (shown in Fig. 10) uses a fixed tree network to store a dictionary of words in a known language. The library may be generated in a laboratory for use in generic open communications. This library then becomes a part of a communications software package, such as an Internet communications package, or the library can be frozen in a Read Only version in a hardware compression chip set. A secret library may also be grown by the user for encrypted communications. A library copy is then provided to authorized users only in the form of floppy discs, PCMCIA cards or as a library download via the Internet. An encryption library provides virtually absolute communications security, via public networks such as the Internet or cellular telephones. It also provides an impenetrable barrier against unauthorized access to compressed computer files.

A library is generated by the algorithm shown below, from old text files. A different library may be grown for each foreign language. The library is usually grown by a software package in a PC and requires about 30 Minutes for completion. Old text files are fed to the algorithm to generate a first pass tree network. Only text words are learned where all characters are converted to lower case ASCII. The output word codes (tips) from the tree network are fed to a bubble sorting list. If the word code is already stored in the list then it is swapped with the code one location higher in the list, unless it is already at the top of the list. New word codes are added to the bottom of the list. In this way each text word is stored only once and the most often used words tend to migrate towards the top of the list. The word codes are finally retrieved as clear text words through the tree network starting from the top of the list and stored in a computer file. The text words can then be edited manually to remove word endings (such as “-ing” from verbs or “-ation” from nouns) and to split up compound words. This edited list is then used in a second pass to grow the final tree network library. The library may be topped off to 8k nodes by a word list provided in the software package.

LIBRARY GENERATION ROUTINE

MATRIX [ GATE ] POINTER ]

Start: Pre-load locations 0 to 255 with GATE = ASCII code, POINTER = 0.

Loop: Set POINTER = SEED = 0; Set string length = 0.

Loop1: Move the next input character into the GATE; Increment the string length count

If the GATE is not alphabetic (A to Z) then output the POINTER to the BUBBLE LIST; Goto Loop.

Else convert the GATE to lower case ASCII (set ASCII bit 6 = 1).

Search the memory for a matching MATRIX.

If a matching MATRIX is found then move the memory ADDRESS to the POINTER; Goto Loop1.

Else store the present MATRIX in a next empty memory ADDRESS; Move ADDRESS to POINTER.

If string length = 16 then output the POINTER to the BUBBLE LIST; Goto Loop.

If POINTER = 8191 (8k memory full) then end of learning; Goto Exit.

Else Goto Loop1.

FIXED LIBRARY ENCODING ROUTINE

MATRIX [ GATE (8 bit) ] POINTER (13 bit) ] SPACE (1 bit) ]

Start: Clear the POINTER and the SPACE bit.

Loop: Move the next input character into the GATE.

If the GATE is not alphabetic (A to Z) or an ASCII “space” then:

If POINTER = clear then output the GATE code (9 bit); Goto Start.

Else output the POINTER, SPACE = 0; Then output the GATE (9 bit); Goto Start.

Else if GATE = ASCII “space” then output the POINTER, SPACE = 1; Goto Start.

Loop1: Search the memory for a matching MATRIX (use a CAM for fast search).

If a matching MATRIX is found then move the memory ADDRESS to the POINTER; Goto Loop.

Else output POINTER, SPACE = 0; Clear POINTER and SPACE; Goto Loop1.

FIXED LIBRARY RETRIEVAL ROUTINE

MATRIX [ GATE (8 bit) ] POINTER (13 bit) ] SPACE (1 bit) ]

Start: Move the input code to the POINTER, SPACE.

If the POINTER is a 9 bit single character code then output the data character; Goto Start.

If SPACE = 1 then push an ASCII “space” into a First-In-Last-Out (FILO) stack.

Loop: Use the POINTER as a memory ADDRESS to fetch a new MATRIX from the library.

Push the new GATE into the FILO stack.

If the new POINTER = 0 then retrieve, pull the output data from the FILO stack; Goto Start.

Else Goto Loop.

Test results show that an 8k node library is equivalent to about 130 Thousand characters of random text (as used in the LZ-77 code) and that it contains about 6000 of the most common words used in English communications. About 97% of the words used in a normal Internet communication are contained in the list. Words later not found in the library are automatically chopped into known fragments for transmission. This may require several transmission codes for an unknown word, but such word fragmentation will not significantly reduce the compression because of the relative rarity of such words.

A text compression and encryption system can be implemented on a single silicon chip to improve communications in modems or to compress disc files in a PC. A universal chip set can be used for open communications while a downloadable library can be used for encrypted communications. A Content Addressable Memory (CAM) can be used to increase encoding and retrieval speed to more than 50 Million characters per second, fast enough for virtually any application. A fixed library system does not suffer from error propagation and transmission errors will only cause local data errors. Data compression is active immediately, providing high data compression for any length file transmission.

A communication protocol may use a 9 bit code for incompressible single characters and an 18 bit Tip code which usually represents a whole text word including the space character. A CONTINUE bit separates single character codes from Tip codes. This bit may be a modified parity bit in storage devices or a modified stop bit in RS-232 protocols. For totally random bit files, the maximum data expansion is about 12% while the average compression ratio for English text is between 2.8 : 1 and 3.3 : 1. A “space” (S) bit is used to indicate whole word codes or word fragments. The system can be improved through the use of a dynamic fragment learning library, which learns word fragments during a transmission for later encoding. For normal text transmissions such additional hardware has been shown to only marginally improve the compression ratio. Two extra bits, the A and the F bit, are used to identify capital or upper case ASCII codes. This improves the library efficiency because only lower case letters are encoded in the tree library. If both bits are set then the Tip code represents a special message to the receiver. This is used to insert error detection codes into the transmission for automatic error checking in the receiver. It can also identify strings of identical characters or special messages to the receiver program.

Figure 11. Autosophy image and video compression

Lossless autosophy image and video compression, shown in Fig. 11, is being developed for use in the Internet, video-conferencing and High Definition Television (HDTV). The hyperspace library contains a serial network which is adapted for image compression. The same basic scheme may be adapted for compressing static images or moving television.

A first step is to generate a fixed tree library adapted for image compression. An imaginary phantom library contains the first two pixel brightness values or the SEED for each string.. This library contains all the possible brightness combinations of two 8 bit pixel. Since the library only contains its own address no actual memory is required. A phantom library node is identified by the most significant address bit equal to zero. The second 64k library section contains the actual fixed tree library where each node is identified by the most significant address bit equal to one. This library contains many thousands of pixel brightness strings which are pre-learned from normal input images, from a TV camera or from pictures on a CDROM. The 64k by 25 bit output library is finally made a part of a communication software package or made available only to selected users for encrypted communications. A single chip hardware library, containing a Content Addressable Read Only Memory (CAROM), is used for real time television. The library may contain 8 bit per pixel absolute brightness values. The same library is shared for all colors (red green blue). A more efficient library may contain brightness differences, either the difference between the pixel and the absolute center pixel brightness, or the difference between the pixel and its preceding pixel. This highly compact library can be shared for encoding any pattern regardless of the overall brightness level in the images.

A superpixel or tip code describes a small tile on the screen which is scanned in a fixed spiral pattern. The scanning sequence starts from a center pixel screen address. For static images the center pixel screen address for each tile is computed or taken from look up tables depending on the screen format. For moving television the center pixel screen address is a pixel whose brightness value has changed since the previous input scan. To avoid any hardware dependency, i.e. the number of rows and columns on the screen, relative scanning is used. To identify a pixel on the right the screen address is incremented while for a pixel on the left the screen address is decremented. To identify a pixel below a fixed column constant is added to the screen address while for a pixel above a fixed column constant is subtracted. If the scanning sequence hits a physical screen limit, either left-right or above-below, then the scanning sequence is terminated. This allows each television camera or monitor to have a different number of rows and columns while communicating in a common format. The column constant is known by each camera or monitor and depending on screen size and resolution. Relative scanning allows television cameras and monitors to evolve into larger and larger images while maintaining a universal and backward compatible communications protocol.

An image buffer memory holds the absolute brightness values of the input image to be encoded. Each of the three colors (red green blue) requires its own image buffer. The input images from the television camera are scanned into the image buffer for encoding in the transmitter. In the receiver the output images are scanned from the image buffer to a monitor at regular scanning rates. Since only the moving portions of the images are selectively transmitted the scanning rates in the transmitter and in the receiver need not be identical.

In television a change buffer memory holds the screen addresses of the pixels that have changed since the last input scan. While a new image is scanned from the camera each pixel brightness is compared with the pixel brightness in the image buffer. If the brightness has not changed, within a threshold limit, then the input pixel is ignored. If the pixel brightness has changed then the new pixel brightness is entered into the image buffer. The screen address of that changed pixel is then stored in the change buffer. After the input scan the change buffer would contain all the screen addresses of the changed pixels.

For static image transmissions or storage an input image is first scanned into the image buffer. The image is divided into tiles by computing the center address for each tile. Each tile is then encoded into strings, by spiral scanning around the center address. The strings are then encoded into hyperspace INDEX codes using a fixed tree library. Each string is terminated either by a fixed string length limit or by a not found node. Not found strings are encoded by string fragments until the string limit is reached. The transmitted 17 bit INDEX codes could each represent between 2 and 24 pixels. The compression ratio would range from a 6% data expansion for totally random patterns to 11 : 1 compression for blank images. All three colors would use the same fixed library.

For moving television the input images from the camera are compared pixel by pixel with the previous image in the image buffer to identify the locations of changed pixels in the change buffer. Each address in the change buffer is then encoded into INDEX codes, by relative spiral scanning, using a fixed tree library. Since change in moving television is caused by large moving objects in the images, the changed pixels usually form clusters around the leading and trailing edges. Many changed pixels in a cluster may be encoded into a single INDEX code by erasing the pixel addresses in the change buffer which are identified in the spiral scanning sequence. Each tip transmission contains a (17 bit) INDEX code and a (20 bit) LOCATION on the screen code which identifies the center pixel (the pixel that has changed since the last scan). A more efficient library may contain brightness differences which are shared for all light or dark image sections. An absolute pixel BRIGHTNESS (8 bit) is then added to the transmission code. Each transmission tip then represents at least 3 pixels.

Comparing the MPEG-2 compression method with Autosophy compression reveals the following differences:

1) MPEG-2 and other Shannon type transmissions require a fixed hardware standard for screen size and resolution. Autosophy transmissions, in contrast, can be made totally independent of the physical hardware. This allows television technology to freely evolve into larger images, with better resolution, while maintaining a common backward compatible transmission standard.

2) In MPEG-2 the image quality is determined by the required compression ratio. The higher the compression the worse the image quality until the images become unacceptable to the user. Autosophy in contrast offers entirely lossless compression which will not distort the images. Because random background noise is interpreted as movement any increase in the noise level will increase the transmission. Better image quality with less noise will actually reduce the transmissions.

3) Shannon or MPEG-2 transmission require a fixed bandwidth channel which is determined by the systems hardware. High speed fixed bandwidth channels are expensive in packet switching networks. In Autosophy television, in contrast, transmissions depend only on novelty and movement in the images. Slow moving or familiar image will produce few transmissions while rapidly moving complex images will produce more transmission codes. Autosophy transmissions are ideally suitable for the new packet switching networks, such as the Internet or ATM.

4) Any transmission error in MPEG-2 will cause the images to break up into random noise producing very disturbing visual effects. In Autosophy television, in contrast, transmission errors will cause only a temporary freezing of motion in small image portions. Re-transmitting defective packets will correct any visual effects.

5) MPEG-2 can be encrypted but only with additional hardware. Autosophy provides a virtually unbreakable encryption, without extra cost, by making libraries available only to authorized users.

6) MPEG-2 compression requires very high speed computing for the cosine transforms and motion compensation. Autosophy compression does not require any computing. It is expected to be much less expensive to implement.

SUMMARY AND CONCLUSIONS

High speed digital communication networks, such as the Internet or ATM, are rapidly evolving into a world wide Information Superhighway. This technological revolution is driven mostly by the hardware which provides more and more bandwidth at lower and lower cost. A similar revolution is occurring in the communications software which provides better and better protocols and higher and higher data compression. Lossless data compression techniques are based on a new Autosophy information theory which may eventually replace the classical Shannon information theory. Autosophy based data compression is already implemented in V.42bis modems and in computer disc file compression. It is rapidly spreading into virtually all communications networks. Autosophy data and video communications are especially suitable for the new packet communications networks. In addition to high data and video compression, Autosophy transmissions may also provide a virtually unbreakable encryption code for secure communications via public networks.

The new Autosophy theories of “learning” are now evolving into a new generation of self-learning brain-like and no-programming autosopher which may eventually replace the now dominant programmed data processing computer. Such futuristic brain-like autosophy databases already exist in laboratory models that strikingly emulate learning in the human brain without programming or conventional data processing.

Autosophy theories may also explain the functioning of self-assembling structures, such as crystals, living trees or societies. It adds “meaning” and “information” as a new element in the cosmos. This research may have a very profound impact on other to natural sciences, such as physics, chemistry or biology.

REFERENCES

1 Klaus Holtz, “Data Compression in Network Design”. Tutorial at Design SuperCon’96, Santa Clara CA, Jan 1996

2 Klaus Holtz, “Digital Image and Video Compression for Packet Networks”. Tutorial at SuperCon’96, Santa Clara CA.

3 Klaus Holtz, “Autosophy Data Compression Accelerates and Encrypts Network Communications”. WESCON/95, Session C3, Paper 4. Nov. 7, 1995

4 Klaus Holtz, “Autosophy Image Compression for Packet Network Television” WESCON/95, Session C2, 1995

5 Klaus Holtz, “”Packet Video Transmission on the Information Superhighway Using Image Content Dependent Autosophy Video Compression”. IS&T’s 48th Annual Conference, Washington DC, May 1995

6 Klaus Holtz, “Fast Data Compression Speeds Data Communications”. Design SuperCon ‘95, Santa Clara CA, 1995

7 Klaus Holtz, “Lossless data-compression techniques advance beyond Shannon’s limits”. Personal Engineering Magazine, Dec. 1994

8 WESCON/94, Session W23, Advanced Information Management, Sept. 29, 1994, 5 papers, Anaheim CA.

9 WESCON/94, Session W9, Digital Video Compression. Sept. 28, 1994, Anaheim CA.

10 Colin Johnson, “Data know thyself” OEM Magazine, May 1994, Page 94

11 Klaus Holtz, “Hyperspace storage compression for Multimedia systems”, IS&T / SPIE Electronic Imaging Science and Technology, Paper 2188-40, Feb. 8, 1994, San Jose CA

12 WESCON/93, Session 7, Applications for lossless data compression. Sept. 28, 1993, San Francisco, CA. 4 papers.

13 Klaus Holtz, “Autosophy Networks yield Self-learning Robot Vision”. WESCON/93, Session S2, Paper 5, San Francisco, CA, Sept. 28, 1993

14 Klaus Holtz, “Self-aligning and compressed autosophy video databases”. SPIE Vol. 1908, 1993 San Jose, CA, Storage and Retrieval for Image and Video Databases.

15 Klaus Holtz, “Lossless Image Compression with Autosophy Networks”. SPIE Vol. 1903, 1993 San Jose, CA, Image and Video Processing

16 Klaus Holtz, “HDTV and Multimedia Image Compression with Autosophy Networks” WESCON/92, Nov. 1992, Anaheim, CA, page 414

17 Klaus Holtz, “Autosophy image compression and vision for aerospace sensing”. SPIE-92, Vol. 1700-39, Orlando, FL, April. 24, 1992

18 Colin Johnson, “Storage technique can learn”. Electronic Engineering Times, Jan.6,1992

19 NORTHCON-91, Session D4, Learning Networks: An Alternative to Data Processing. Portland OR, Oct.1991

20 Klaus Holtz, E, Holtz, “True Information Television (TITV) breaks Shannon Bandwidth Barrier”. IEEE Transaction on Consumer Electronics, May 1990, Volume 36, Number 2

21 Klaus Holtz, “Text Compression and Encryption with self-learning Networks”. IEEE GLOBECOM-85 New Orleans

22 Klaus Holtz. “Build a self-learning no programming Computer with your Microprocessor”. Dr. Dobbs Journal, Number 33, March 1979, Vol.4

23 Klaus Holtz, E. Langheld, “Der selbstlernende und programmier-freie Assoziationscomputer”. ELEKTRONIK magazin, Dec. 1978, Volume 14 and 15,Franzis Verlag Abt. Zeitschriften Vertrieb, Postfach 37 01 20, 8000 Munich 37, Germany

24 Klaus Holtz, “Here comes the brain-like self-learning no-programming computer of the future”. The First West Coast Computer Faire 1977, Faire, Box 1597, Palo Alto, CA 94302

25 D. A. Huffman, “A Method for the Construction of Minimum Redundancy Codes” Proceedings of the I.R.E. pp 1098-1101, Sept. 1952

26 C.E. Shannon, “A mathematical Theory of Communications”, Bell Telephone B-1598, Vol.27, July and Oct. 1948

27 J. Ziv and A. Lempel, “A Universal Algorithm for Sequential Data Compression”, IEEE Information Theory, IT-23, May 1977

28 J, Ziv and A. Lempel, “Compression of Individual Sequences via Variable-Rate Coding” IEEE Information Theory, IT-24, Sept. 1978

29 T. A. Welch, “A Technique for High Performance Data Compression” IEEE Computer, June 1984 (Patent 4,558,303)

30 Y. S. Miller, M. N. Wegman. “Variation on a Theme by Ziv and Lempel”IBM Papers, Combinatorial Algorithms on Words, 1985

31 Edward H. Sussenguth, “Use of Tree Structures for Processing Files”Communications of the ACM, Volume 6, Number 5, May 1963

Patents: 3,694,813. 3,914,747. 3,976,844. 4,021,782. 4,038,652. 4,054,951.
4,152,582. 4,366,551. 4,412,306. 4,436,422. 4,464,650. 4,558,302. 4,814,746.
4,870,415. 4,992,868. 5,003,307. 5,016,009. 5,051,745. 5,113,505.