diff --git a/arboles_plural_en.txt b/arboles_plural_en.txt
new file mode 100644
index 0000000..3a47be9
--- /dev/null
+++ b/arboles_plural_en.txt
@@ -0,0 +1,52 @@
+birches
+firs
+acacias
+hollies
+avocados
+pears
+poplars
+apricots
+camphors
+corks
+carobs
+alders
+almonds
+maples
+baobabs
+mahoganies
+chestnuts
+cedars
+cherries
+cypresses
+plums
+coconuts
+ebonies
+oaks
+eucalyptus
+ashes
+pomegranates
+beeches
+figs
+lychees
+lemons
+mangos
+apples
+peaches
+mulberries
+oranges
+walnuts
+olives
+elms
+ombus
+palmtrees
+pines
+planes
+junipers
+willows
+sequoias
+sycamores
+teaks
+yews
+limes
+wellingtonias
+hornbeams
diff --git a/arboles_simple_en.txt b/arboles_simple_en.txt
new file mode 100644
index 0000000..3dad3b8
--- /dev/null
+++ b/arboles_simple_en.txt
@@ -0,0 +1,52 @@
+birch
+fir
+acacia
+holly
+avocado
+pear
+poplar
+apricot
+camphor
+cork
+carob
+alder
+almond
+maple
+baobab
+mahogany
+chestnut
+cedar
+cherry
+cypress
+plum
+coconut
+ebony
+oak
+eucalyptus
+ash
+pomegranate
+beech
+fig
+lychee
+lemon
+mango
+apple
+peach
+mulberry
+orange
+walnut
+olive
+elm
+ombu
+palmtree
+pine
+plane
+juniper
+willow
+sequoia
+sycamore
+teak
+yew
+lime
+wellingtonia
+hornbeam
diff --git a/templates/index_en.html b/templates/index_en.html
new file mode 100644
index 0000000..b4f883d
--- /dev/null
+++ b/templates/index_en.html
@@ -0,0 +1,359 @@
+ + + + + + + + Levenshtein Distance reads Cortázar + + +
+

Fama and eucalyptus

+
+

+ A fama is walking through a forest, and although he needs no wood he gazes greedily at the trees. The trees are terribly afraid because they are acquainted with the customs of the famas and anticipate the worst. Dead center of the wood there stands a handsome eucalyptus and the fama on seeing it gives a cry of happiness and dances respite and dances Catalan around the disturbed eucalyptus, talking like this: +

+

+ — Antiseptic leaves, winter with health, great sanitation! +

+

+ He fetches an axe and whacks the eucalyptus in the stomach. It doesn’t bother the fama at all. The eucalyptus screams, wounded to death, and the other trees hear him say between sighs: +

+

+ — To think that all this imbecile had to do was buy some Valda tablets. +

+ +
+
+
+

To you

+

+ For Levenshtein Distance to read the fragment and produce a unique book, you need to change one word. +

+

+ You can replace the eucalyptus, the main tree in the fragment, with another tree species. Levenshtein Distance will then calculate which species is in its vicinity, and this new species will replace the more generic word 'trees' in the fragment. +

+
+ + {% for tree in trees %} + + {% endfor %} + +
Levenshtein Distance is busy writing, please wait a moment.
+
An error occurred, please try again.
+ +
+
+
+ +
+

Levenshtein Distance reads Cortázar

+

'Levenshtein Distance reads Cortázar' is the fifth chapter of ÁGORA / CEMENTO / CÓDIGO, an online exhibition curated by Lekutan as part of the Komisario Berriak programme, a project supported by Tabakalera in Donostia / San Sebastián. Anaïs Berck presents here a first version of a book from the publishing house 'Algoliterary Publishing: making kin with trees'. In this publishing house the authors are algorithms, and the books present the narrative results written from their point of view.

+

The author of this book is the algorithm Levenshtein Distance; the subject is the eucalyptus in "Fama y eucalipto", a fragment of Historias de Cronopios y de Famas by Julio Cortázar.

+

The print run of the book is by definition infinite and each copy will be unique.

+

Levenshtein distance, also called edit distance or word distance, is an algorithm at work in spell checkers. It is the minimum number of operations required to transform one word into another. An operation can be the insertion, deletion or substitution of a character. The algorithm was invented by the Russian scientist Vladimir Levenshtein in 1965.

+
+
+ +

Levenshtein Distance reads Cortázar

+

'Levenshtein Distance reads Cortázar' is the fifth chapter of ÁGORA / CEMENTO / CÓDIGO, an online exhibition curated by Lekutan within the Komisario Berriak programme, supported by Tabakalera. Anaïs Berck presents here a first version of the first book of the publishing house 'Algoliterary Publishing: making kin with trees'. In this publishing house the authors are algorithms, presented with their contexts and codes, and the books present the narrative point of view of the algorithm.

+

The author of this book is the algorithm Levenshtein Distance; the subject is the eucalyptus tree in "Fama y eucalipto", an excerpt from Historias de Cronopios y de Famas by Julio Cortázar.

+

The print run of the book is by definition infinite and each copy is unique.

+

Levenshtein distance, also known as edit distance or word distance, is an algorithm used in spell checkers. It is the minimum number of operations required to transform one word into another. An operation can be an insertion, deletion or substitution of a character. The algorithm was invented by the Russian scientist Vladimir Levenshtein in 1965.

+
+
+ + + \ No newline at end of file diff --git a/templates/print_en.html b/templates/print_en.html new file mode 100644 index 0000000..278c233 --- /dev/null +++ b/templates/print_en.html @@ -0,0 +1,402 @@ + + + + + + Levenshtein Distance reads Cortázar {{ edition_count }} + + + + + +
+{{ fragment_cover_map }}
+    
+    
+    
+    
+    
+    
+

Levenshtein Distance
reads Cortázar

+

+ Generated on {{ date }} at {{ time }}, No. {{ edition_count }} +

+ +

Index

+
    +
  1. Introduction
  2. Reading Cortázar
    2.1. Original fragment
    2.2. Adapted fragment
    2.3. Map of the woods
    2.4. Table of intermediary species
    2.5. Repetitive poetry
  3. General description of the Levenshtein Distance
  4. Technical description of the Levenshtein Distance
  5. Code
  6. Credits
+ + +

1. Introduction

+

Levenshtein Distance reads Cortázar is the first version of the first book in the 'Algoliterary Publishing House: making kin with trees'.

+

The author of this book is the algorithm Levenshtein Distance; the subject is the eucalyptus in "Fama and eucalyptus", a fragment of Cronopios and Famas by Julio Cortázar.

+

The versions of the book are infinite by definition and each copy is unique.

+

Anaïs Berck is a pseudonym and represents a collaboration between humans, algorithms and trees. Anaïs Berck explores the specificities of human intelligence in the company of artificial and plant intelligences. In June 2021, during a residency at Medialab Prado in Madrid, Anaïs Berck will develop a prototype of an Algoliterary Publishing House, in which algorithms are the authors of unusual books. The residency was granted by the "Residency Digital Culture" programme initiated by the Flemish Government.

+

In this work Anaïs Berck is represented by:

+ + + +

2. Reading Cortázar

+ +

2.1. Original fragment

+
+

+ A fama is walking through a forest, and although he needs no wood he gazes greedily at the trees. The trees are terribly afraid because they are acquainted with the customs of the famas and anticipate the worst. Dead center of the wood there stands a handsome eucalyptus and the fama on seeing it gives a cry of happiness and dances respite and dances Catalan around the disturbed eucalyptus, talking like this: +

+

+ — Antiseptic leaves, winter with health, great sanitation! +

+

+ He fetches an axe and whacks the eucalyptus in the stomach. It doesn’t bother the fama at all. The eucalyptus screams, wounded to death, and the other trees hear him say between sighs: +

+

+ — To think that all this imbecile had to do was buy some Valda tablets. +

+ +
+ + +

2.2. Adapted fragment

+ +
{{ new_fragment }}
+ + +

2.3. Map of the woods

+

The distances between the main tree you have chosen for your fragment and the areas of other tree species in the woods, according to Levenshtein Distance:
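The distances behind such a map can be sketched as follows. This is a minimal illustration, not the code that draws the map in this book: the helper `rank_species` and the short list `woods` are assumptions made for the example, with names taken from arboles_simple_en.txt.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance, keeping only the previous row of the table in memory."""
    previous = list(range(len(b) + 1))
    for i, letter_a in enumerate(a, start=1):
        current = [i]
        for j, letter_b in enumerate(b, start=1):
            cost = 0 if letter_a == letter_b else 1
            current.append(min(previous[j] + 1,          # delete letter_a
                               current[j - 1] + 1,       # insert letter_b
                               previous[j - 1] + cost))  # keep or substitute
        previous = current
    return previous[-1]


def rank_species(chosen: str, species: list[str]) -> list[tuple[int, str]]:
    """Hypothetical helper: the other species, sorted from nearest to farthest."""
    return sorted((levenshtein(chosen, name), name)
                  for name in species if name != chosen)


# A few names from arboles_simple_en.txt (a subset chosen for this example).
woods = ["birch", "fir", "beech", "peach", "pear", "teak", "elm", "yew"]
for distance, name in rank_species("beech", woods):
    print(distance, name)
```

The species printed first are the closest neighbours of the chosen tree; ties are ordered alphabetically because the pairs are sorted as (distance, name).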

+ +
+
+
+
+{{ forest_map }}
+ + +

2.4. Table of intermediary species

+

Levenshtein Distance builds a table from the names of the two species. For each cell of this table it calculates the distance between an element of one name and an element of the other.

+

Normally the table is filled with numbers that represent the operations necessary to change one element into another. The possible operations are inserting, deleting or substituting a letter. Here, instead of numbers, the table is filled with the various intermediary species that the algorithm creates by inserting, deleting or substituting letters.

+ +
+
+
+{{ table_of_intermediary_species }}
+ + +

2.5. Repetitive poetry

+ +
+
{{ repetitive_poetry }}
+
+ + +

3. General description of the Levenshtein Distance

+

Levenshtein Distance is an algorithm that measures the difference between two words or two groups of letters. It is also called the 'edit distance'. The Levenshtein Distance between two words is the minimum number of actions needed to change one word to another. The different possible actions are insertion, deletion or substitution of a single letter. For example, the Levenshtein Distance between 'end' and 'and' is 1, as 'e' is replaced by 'a'.
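The definition above can be written down almost literally. The following is a minimal sketch in Python, not the code used to produce this book; the function name `levenshtein` and the inner helper `dist` are chosen for this example.

```python
from functools import lru_cache


def levenshtein(a: str, b: str) -> int:
    """Fewest single-letter insertions, deletions or substitutions turning a into b."""

    @lru_cache(maxsize=None)
    def dist(i: int, j: int) -> int:
        # Distance between the first i letters of a and the first j letters of b.
        if i == 0:
            return j  # insert the j letters of b
        if j == 0:
            return i  # delete the i letters of a
        substitution = 0 if a[i - 1] == b[j - 1] else 1
        return min(
            dist(i - 1, j) + 1,                 # delete a letter of a
            dist(i, j - 1) + 1,                 # insert a letter of b
            dist(i - 1, j - 1) + substitution,  # keep or replace a letter
        )

    return dist(len(a), len(b))


print(levenshtein("end", "and"))  # prints 1: 'e' is replaced by 'a'
```

The recursion keeps the code close to the wording of the definition; it is suited to short words such as tree names, not to long texts.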

+

The algorithm is named after its creator, Vladimir Levenshtein, a Russian mathematician and scientist of Jewish origin whose main areas of research were information theory and error-correcting codes. He worked at the Keldysh Institute of Applied Mathematics in Moscow and introduced the algorithm in 1965 'to consider the problem of constructing optimal codes capable of correcting deletions, insertions and inversions'. He passed away in 2017 at the age of 82.

+

Levenshtein Distance operates in software such as spell checkers and computer-assisted translation programs. It can also be found in search engines, where it detects the words most similar to a mistyped word.

+

Its activity extends to less obvious fields such as plagiarism detection, DNA analysis, automatic voice recognition, optical character recognition in scanned text analysis (OCR), handwriting recognition, hoax email detection or stock market sales and purchase assistance.

+

Sometimes Levenshtein Distance leads to surprising discoveries. In 1995, for example, Kessler applied the algorithm to the comparison of Irish dialects. He showed that it was a successful method for measuring phonetic distances between dialects. Dialect areas can then be derived from the linguistic distances between dialectal varieties. More innovative was the possibility of drawing dialect maps which reflect the fact that dialect areas should be considered as continuous, not as areas separated by sharp boundaries.

+

Sources:

+ + + + +

4. Technical description of the Levenshtein Distance

+

Humans can rewrite a word and easily count the number of changes that are necessary to transform one word into another. We invite you to write the word 'machine' on a sheet of paper, followed by the word 'human' on the next line. Knowing that you can only insert, delete or replace one letter, how many operations would you need to do to rewrite the word 'machine' into 'human'?

+

To give you an idea of how the Levenshtein Distance algorithm works, we describe here the different steps the algorithm takes to transform the word machine into human.

+

For the word 'machine', the algorithm first lists its elements, the successive beginnings of the word. They are m, ma, mac, mach, machi, machin and machine, for a total of seven elements. For the word 'human', the elements are h, hu, hum, huma and human, for a total of five elements. This creates a matrix with 7 rows and 5 columns. In this distance matrix it will calculate for each cell the distance between the elements of the two words.
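As a small illustration of this first step, the elements of both words and the empty matrix can be produced as below. The function name `elements` is an assumption made for this sketch.

```python
def elements(word: str) -> list[str]:
    """All beginnings of a word, from its first letter to the whole word."""
    return [word[:length] for length in range(1, len(word) + 1)]


rows = elements("machine")
columns = elements("human")
print(rows)     # the seven elements of 'machine'
print(columns)  # the five elements of 'human'

# One cell, still empty, for every pair of elements.
matrix = [[None for _ in columns] for _ in rows]
print(len(matrix), "rows,", len(matrix[0]), "columns")
```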

+

It starts with the first element of the word machine, which is m, and compares it with the five elements of the word human. The first one is h. What is the Levenshtein distance between m and h? It has to replace the character m by h, thus the distance is 1.

+

Then it moves on to the next element of the word human, which is hu. What is the Levenshtein distance between m and hu? As m contains only one character and hu contains more than one character, you can be 100% sure that you have to insert a new character. To transform m into hu, first the character m is replaced by h, and then u is added. The distance between m and hu is 2.

+

Now it moves on to the third element. What is the distance between m and hum? Here the two elements share the letter m, so nothing has to be replaced: it keeps m and inserts h and u in front of it. The distance is 2.

+

It continues until it has calculated the distance between the first element of the word machine, m, and all 5 elements of the second word human. The distances are 1, 2, 2, 3 and 4: each extra letter adds 1, except at hum, where the shared letter m costs nothing.

+

After calculating the distances between the first element of the first word and all the elements of the second word, the process continues by calculating the distances between the remaining elements of the first word and the elements of the second word.

+

The process continues with ma. It compares ma to the five elements of the word human. The first one will be h. What is the Levenshtein distance between ma and h? What it has to do is replace the character m with h and delete the character a. Thus the distance is 2. It moves on to the next element of the word human which is hu. What is the Levenshtein distance between ma and hu? It replaces the character m with h and the letter a with u. The distance is 2.

+

That is how the table is filled up.
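The whole table can be filled in a few lines. The sketch below assumes one detail that the description leaves out: an extra row and column for the empty beginning of each word, which is added while calculating and removed again before printing.

```python
def distance_table(first: str, second: str) -> list[list[int]]:
    """Distances between every element of first (rows) and of second (columns)."""
    # Row 0 and column 0 stand for the empty beginning of each word.
    table = [[0] * (len(second) + 1) for _ in range(len(first) + 1)]
    for i in range(len(first) + 1):
        table[i][0] = i  # delete i letters
    for j in range(len(second) + 1):
        table[0][j] = j  # insert j letters
    for i in range(1, len(first) + 1):
        for j in range(1, len(second) + 1):
            cost = 0 if first[i - 1] == second[j - 1] else 1
            table[i][j] = min(table[i - 1][j] + 1,         # deletion
                              table[i][j - 1] + 1,         # insertion
                              table[i - 1][j - 1] + cost)  # keep or substitute
    # Drop the helper row and column: one row per element of first remains.
    return [row[1:] for row in table[1:]]


table = distance_table("machine", "human")
for i, row in enumerate(table, start=1):
    print("machine"[:i].ljust(8), row)
```

Each printed line is one row of the table: the element of machine on the left, followed by its distances to h, hu, hum, huma and human.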

+

In code terms, the table is an optimisation: each cell reuses values that have already been calculated instead of starting again from scratch.

+

The value of a cell is calculated from its three nearest neighbours in the table: the cell to the left (horizontal), the cell above (vertical) and the cell diagonally above to the left (diagonal).

+

If the letters being compared are the same, the diagonal value is copied unchanged.

+

If the letters are different, the lowest value of the three is chosen and 1 is added.
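In its standard formulation, the rule for a single cell fits in one small function. The name `cell_value` and its parameters are assumptions made for this sketch; the values in the usage lines are distances between elements of machine and human.

```python
def cell_value(left: int, above: int, diagonal: int, same_letter: bool) -> int:
    """Value of one cell, given its three already calculated neighbours."""
    return min(
        left + 1,                              # one more insertion
        above + 1,                             # one more deletion
        diagonal + (0 if same_letter else 1),  # letter kept, or one substitution
    )


# Cell for 'ma' and 'huma': both elements end in the same letter, a.
print(cell_value(left=3, above=3, diagonal=2, same_letter=True))   # prints 2
# Cell for 'ma' and 'hu': the last letters, a and u, differ.
print(cell_value(left=2, above=2, diagonal=1, same_letter=False))  # prints 2
```

Written this way, one expression covers both cases: equal letters add nothing to the diagonal value, different letters add 1 to whichever neighbour is lowest.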

+

The last value in the table of counts is the minimum distance between the 2 words.

+

In the table it is the value situated in the lower right corner.

+

Ultimately it is a matter of tracing the shortest path in the transformations from one word to the other:

+ +

Source: Blog Paperspace
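Tracing such a path can be sketched as follows. This is an illustration under assumptions, not the code of this book: the function name `shortest_path` is hypothetical, the table includes an extra row and column for the empty beginning of each word, and when several shortest paths exist only one of them is returned.

```python
def shortest_path(first: str, second: str) -> list[str]:
    """One shortest series of words from first to second, one edit per step."""
    rows, cols = len(first) + 1, len(second) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i
    for j in range(cols):
        table[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if first[i - 1] == second[j - 1] else 1
            table[i][j] = min(table[i - 1][j] + 1,
                              table[i][j - 1] + 1,
                              table[i - 1][j - 1] + cost)

    # Walk back from the lower right corner, editing the word from its end.
    # At every step, word equals first[:i] followed by second[j:].
    word, path = first, [first]
    i, j = len(first), len(second)
    while i > 0 or j > 0:
        cost = 1 if i > 0 and j > 0 and first[i - 1] != second[j - 1] else 0
        if i > 0 and j > 0 and table[i][j] == table[i - 1][j - 1] + cost:
            if cost:  # substitution; equal letters need no edit
                word = word[:i - 1] + second[j - 1] + word[i:]
                path.append(word)
            i, j = i - 1, j - 1
        elif i > 0 and table[i][j] == table[i - 1][j] + 1:  # deletion
            word = word[:i - 1] + word[i:]
            path.append(word)
            i -= 1
        else:  # insertion
            word = word[:i] + second[j - 1] + word[i:]
            path.append(word)
            j -= 1
    return path


for step in shortest_path("machine", "human"):
    print(step)
```

The list starts with the first word, ends with the second, and contains one more word than the distance between them, since every step is a single insertion, deletion or substitution.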

+ + +

5. Code

+ + {% for path, source in sources %} +

{{ path }}

+
{{ source }}
+ {% endfor %} + + +

6. Credits

+

This book is a creation by Anaïs Berck for ÁGORA / CEMENTO / CÓDIGO, a project of Asociación Cultural LEKUTAN in the International Centre for Contemporary Culture Tabakalera, Donostia / San Sebastián.

+

Each copy of this book is unique and the print run is infinite by definition.

+ + +

This copy is number {{ edition_count }} of all downloaded copies.

+

Collective conditions of (re)use (CC4r), 2021

+

Copyleft with a difference: You are invited to copy, distribute, and modify this work under the terms of the CC4r: https://gitlab.constantvzw.org/unbound/cc4r

+ + + \ No newline at end of file