English translation static texts

master
ana mertens 3 years ago
parent bfd71c36c6
commit d3e1e16961

@ -0,0 +1,52 @@
birches
firs
acacias
hollies
avocados
pears
poplars
apricots
camphors
corks
carobs
alders
almonds
maples
baobabs
mahoganies
chestnuts
cedars
cherries
cypresses
plums
coconuts
ebonies
oaks
eucalyptus
ashes
pomegranates
beeches
figs
lychees
lemons
mangos
apples
peaches
mulberries
oranges
walnuts
olives
elms
ombus
palmtrees
pines
planes
junipers
willows
sequoias
sycamores
teaks
yews
limes
wellingtonias
hornbeams

@ -0,0 +1,52 @@
birch
fir
acacia
holly
avocado
pear
poplar
apricot
camphor
cork
carob
alder
almond
maple
baobab
mahogany
chestnut
cedar
cherry
cypress
plum
coconut
ebony
oak
eucalyptus
ash
pomegranate
beech
fig
lychee
lemon
mango
apple
peach
mulberry
orange
walnut
olive
elm
ombu
palmtree
pine
plane
juniper
willow
sequoia
sycamore
teak
yew
lime
wellingtonia
hornbeam

@ -0,0 +1,359 @@
<!DOCTYPE html>
<html lang="en">
<head>
<style>
/* http://meyerweb.com/eric/tools/css/reset/
v2.0 | 20110126
License: none (public domain)
*/
html, body, div, span, applet, object, iframe,
h1, h2, h3, h4, h5, h6, p, blockquote, pre,
a, abbr, acronym, address, big, cite, code,
del, dfn, em, img, ins, kbd, q, s, samp,
small, strike, strong, sub, sup, tt, var,
b, u, i, center,
dl, dt, dd, ol, ul, li,
fieldset, form, label, legend,
table, caption, tbody, tfoot, thead, tr, th, td,
article, aside, canvas, details, embed,
figure, figcaption, footer, header, hgroup,
menu, nav, output, ruby, section, summary,
time, mark, audio, video {
margin: 0;
padding: 0;
border: 0;
font-size: 100%;
font: inherit;
vertical-align: baseline;
}
/* HTML5 display-role reset for older browsers */
article, aside, details, figcaption, figure,
footer, header, hgroup, menu, nav, section {
display: block;
}
body {
line-height: 1;
}
ol, ul {
list-style: none;
}
blockquote, q {
quotes: none;
}
blockquote:before, blockquote:after,
q:before, q:after {
content: '';
content: none;
}
table {
border-collapse: collapse;
border-spacing: 0;
}
</style>
<style>
@font-face {
font-family: XanhMono;
src: url(static/fonts/XanhMono-Regular.woff2) format('woff2'),
url(static/fonts/XanhMono-Regular.woff) format('woff'),
url(static/fonts/XanhMono-Regular.ttf) format('truetype');
font-weight: 400;
font-style: normal;
}
@font-face {
font-family: XanhMono;
src: url(static/fonts/XanhMono-Italic.woff2) format('woff2'),
url(static/fonts/XanhMono-Italic.woff) format('woff'),
url(static/fonts/XanhMono-Italic.ttf) format('truetype');
font-weight: 400;
font-style: italic;
}
:root {
--font-size: 13pt;
--line-height: 17pt;
--font-size-smaller: 11pt;
--line-height--smaller: 15pt;
}
body {
font-family: XanhMono;
margin: 0;
font-size: var(--font-size);
line-height: var(--line-height);
display: flex;
flex-direction: row;
overflow: hidden;
max-height: 100vh;
}
h1 {
font-size: 24pt;
line-height: 29pt;
}
h2 {
font-size: 20pt;
line-height: 29pt;
}
a {
color: currentColor;
}
a:hover {
text-decoration: none;
}
blockquote {
font-style: italic;
}
p, form {
margin-top: var(--line-height);
}
footer {
font-style: normal;
margin-top: var(--line-height);
font-size: var(--font-size-smaller);
line-height: var(--line-height--smaller);
}
input, select, option, button {
font: inherit;
}
.panel {
padding: calc(2 * var(--line-height)) calc(2 * var(--line-height)) calc(2 * var(--line-height)) var(--line-height);
overflow: auto;
max-height: 100%;
}
.panel--original-fragment,
.panel--new-fragment {
flex: 1.5;
}
.panel--about {
flex: 1;
font-size: var(--font-size-smaller);
line-height: var(--line-height--smaller);
background: rgb(220, 236, 220);
min-height: calc(100vh - 4 * var(--line-height));
}
label {
margin-top: calc(0.25 * var(--line-height));
vertical-align: middle;
display: inline-block;
margin-right: 0.25em;
}
input[type="radio"] {
margin: 0 .25em 0 0;
vertical-align: middle;
}
button {
display: block;
margin-top: var(--line-height);
margin-bottom: var(--line-height);
}
#language_switcher {
position: fixed;
top: var(--line-height);
right: var(--line-height);
}
.panel--about[lang="es"] [lang="en"],
.panel--about[lang="en"] [lang="es"] {
display: none;
}
.message {
display: none;
font-size: var(--font-size-smaller);
line-height: var(--line-height--smaller);
font-style: italic;
margin: var(--line-height--smaller) 0;
}
.message[data-active] {
display: block;
}
@media screen and (max-width: 900px) {
:root {
--font-size: 11.5pt;
--line-height: 14.5pt;
--font-size-smaller: 10pt;
--line-height--smaller: 13pt;
}
h1 {
font-size: 20pt;
line-height: 25pt;
}
h2 {
font-size: 16pt;
line-height: 20pt;
}
body {
flex-direction: column;
overflow: initial;
}
.panel {
max-height: initial;
min-height: initial;
height: initial;
flex: 0 0 auto;
overflow: initial;
}
.panel--about {
order: -1;
}
}
</style>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Levenshtein Distance reads Cortázar</title>
</head>
<body>
<section class="panel panel--original-fragment">
<h2>Fama and eucalyptus</h2>
<blockquote>
<p>
A fama is walking through a forest, and although he needs no wood he gazes greedily at the trees. The trees are terribly afraid because they are acquainted with the customs of the famas and anticipate the worst. Dead center of the wood there stands a handsome eucalyptus and the fama on seeing it gives a cry of happiness and dances respite and dances Catalan around the disturbed eucalyptus, talking like this:
</p>
<p>
— Antiseptic leaves, winter with health, great sanitation!
</p>
<p>
He fetches an axe and whacks the eucalyptus in the stomach. It doesn't bother the fama at all. The eucalyptus screams, wounded to death, and the other trees hear him say between sighs:
</p>
<p>
— To think that all this imbecile had to do was buy some Valda tablets.
</p>
<footer>Cronopios and Famas by Julio&nbsp;Cortázar, published in 1962, English edition, 1999, New Directions Classic</footer>
</blockquote>
</section>
<section class="panel panel--new-fragment">
<h2>To you</h2>
<p>
For Levenshtein Distance to read the fragment and produce a unique book, you need to change one word.
</p>
<p>
You can choose another tree species for the eucalyptus, the main tree in the fragment. The Levenshtein Distance will then calculate which species is in its vicinity and the more generic word 'trees' will be replaced in the fragment by this new species.
</p>
<form action="{{ BASEURL }}/generate" method="POST">
<!-- <select name="selected_tree"> -->
{% for tree in trees %}
<label><input type="radio" name="selected_tree" value="{{ tree }}" />{{ tree }}</label>
{% endfor %}
<!-- </select> -->
<div class="message" data-message="working">Levenshtein Distance is busy writing, please wait a moment.</div>
<div class="message" data-message="error">An error occurred, please try again.</div>
<button type="submit">Generate</button>
</form>
</section>
<section class="panel panel--about" lang="es">
<nav id="language_switcher">
<a href="#" data-target-lang="es">ES</a>
<a href="#" data-target-lang="en">EN</a>
</nav>
<section lang="es">
<h1>La Distancia de Levenshtein lee a Cortázar</h1>
<p>'La Distancia de Levenshtein lee a Cortázar' es el quinto capítulo de <a href = "https://www.tabakalera.eus/es/agora-cemento-codigo">ÁGORA / CEMENTO / CÓDIGO</a>, una exposición online comisariada por <a href ="https://www.tabakalera.eus/es/lekutan">Lekutan</a>, dentro del programa Komisario Berriak, proyecto apoyado por <a href = "https://www.tabakalera.eus">Tabakalera</a> en Donostia / San Sebastián. <a href = "https://www.anaisberck.be">Anaïs Berck</a> presenta aquí una primera versión de un libro en la editorial 'Editorial Algoliteraria: crear alianzas con los árboles'. En esta editorial los autores son algoritmos y los libros presentan los resultados narrativos escritos desde su punto de vista.</p>
<p>El autor de este libro es el algoritmo <a href ="https://es.wikipedia.org/wiki/Distancia_de_Levenshtein">La Distancia de Levenshtein</a>, el tema es el eucalipto en "Fama y eucalipto", un fragmento de <a href ="https://es.wikipedia.org/wiki/Historias_de_cronopios_y_de_famas">Historias de Cronopios y de Famas</a> de <a href ="https://es.wikipedia.org/wiki/Julio_Cort%C3%A1zar">Julio Cortázar</a>.</p>
<p>El tiraje del libro es por definición infinito y cada copia será única.</p>
<p>La distancia de Levenshtein, distancia de edición o distancia entre palabras es un algoritmo que opera en los correctores ortográficos. Es el número mínimo de operaciones requeridas para transformar una palabra en otra. Una operación puede ser una inserción, eliminación o la sustitución de un carácter. El algoritmo fue una invención del científico ruso Vladimir Levenshtein en 1965.</p>
</section>
<section lang="en">
<!-- English -->
<h1>Levenshtein Distance reads Cortázar</h1>
<p>'Levenshtein Distance reads Cortázar' is the fifth chapter of <a href = "https://www.tabakalera.eus/es/agora-cemento-codigo">ÁGORA / CEMENTO / CÓDIGO</a>, an online exhibition curated by <a href ="https://www.tabakalera.eus/es/lekutan">Lekutan</a>, within the programme of Komisario Berriak supported by <a href = "https://www.tabakalera.eus">Tabakalera</a>. <a href = "https://www.anaisberck.be">Anaïs Berck</a> presents herewith a first version of a first book of the publishing house 'Algoliterary Publishing: making kin with trees'. In this publishing house the authors are algorithms, presented with their contexts and codes; and the books present the narrative point of view of the algorithm.</p>
<p>The author of this book is the algorithm <a href ="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein Distance</a>; the subject is the eucalyptus tree in "Fama y eucalipto", an excerpt from Historias de Cronopios y de Famas by <a href ="https://en.wikipedia.org/wiki/Julio_Cort%C3%A1zar">Julio Cortázar</a>.</p>
<p>The print run of the book is by definition infinite and each copy is unique.</p>
<p>Levenshtein distance, edit distance or word distance is an algorithm that operates in spell checkers. It is the minimum number of operations required to transform one word into another. An operation can be an insertion, deletion or substitution of a character. The algorithm was an invention of Russian scientist Vladimir Levenshtein in 1965.</p>
</section>
</section>
<script>
(function () {
var toggles = document.querySelectorAll('[data-target-lang]');
var panel = document.querySelector('.panel--about');
for (let i = 0; i < toggles.length; i++) {
toggles[i].addEventListener('click', function () {
panel.setAttribute('lang', this.dataset.targetLang)
});
}
})();
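function getFilename (headers, fallback) {
if (headers.has('Content-Disposition')) {
// Declare the locals so they do not leak into the global scope.
var header = headers.get('Content-Disposition');
var matches = header.match(/filename="(.+)"/);
// match() returns null when the header carries no quoted filename.
if (matches && matches.length == 2) {
return matches[1];
}
}
return fallback;
}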
(function () {
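// Test the property on window: a bare `fetch` throws a ReferenceError where it is not defined.
if (window.fetch) {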
var form = document.querySelector('form'),
button = form.querySelector('button'),
messageWorking = document.querySelector('[data-message="working"]'),
messageError = document.querySelector('[data-message="error"]');
form.addEventListener('submit', function (e) {
e.preventDefault();
button.disabled = true;
delete messageError.dataset.active;
messageWorking.dataset.active = true;
const data = new FormData(form);
fetch(form.action, {
method: "POST",
body: data
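}).then(function (r) {
// Treat HTTP error statuses as failures so the error message is shown.
if (!r.ok) {
throw new Error('Unexpected response status: ' + r.status);
}
// Return the promise so failures while reading the body also reach catch().
return r.blob().then(function (blob) {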
var filename = getFilename(r.headers, 'La distancia de Levenshtein.pdf');
const a = document.createElement('a');
a.setAttribute('href', URL.createObjectURL(blob));
a.setAttribute('download', filename);
if (document.createEvent) {
const event = document.createEvent('MouseEvents');
event.initEvent('click', true, true);
a.dispatchEvent(event);
}
else {
a.click();
}
delete messageWorking.dataset.active;
button.disabled = false;
});
}).catch(function() {
delete messageWorking.dataset.active;
messageError.dataset.active = true;
button.disabled = false;
});
});
}
})();
</script>
</body>
</html>

@ -0,0 +1,402 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Levenshtein Distance reads Cortázar {{ edition_count }}</title>
<style>
/* http://meyerweb.com/eric/tools/css/reset/
v2.0 | 20110126
License: none (public domain)
*/
html, body, div, span, applet, object, iframe,
h1, h2, h3, h4, h5, h6, p, blockquote, pre,
a, abbr, acronym, address, big, cite, code,
del, dfn, em, img, ins, kbd, q, s, samp,
small, strike, strong, sub, sup, tt, var,
b, u, i, center,
dl, dt, dd, ol, ul, li,
fieldset, form, label, legend,
table, caption, tbody, tfoot, thead, tr, th, td,
article, aside, canvas, details, embed,
figure, figcaption, footer, header, hgroup,
menu, nav, output, ruby, section, summary,
time, mark, audio, video {
margin: 0;
padding: 0;
border: 0;
font-size: 100%;
font: inherit;
vertical-align: baseline;
}
/* HTML5 display-role reset for older browsers */
article, aside, details, figcaption, figure,
footer, header, hgroup, menu, nav, section {
display: block;
}
body {
line-height: 1;
}
ol, ul {
list-style: none;
}
blockquote, q {
quotes: none;
}
blockquote:before, blockquote:after,
q:before, q:after {
content: '';
content: none;
}
table {
border-collapse: collapse;
border-spacing: 0;
}
</style>
<style>
@font-face {
font-family: XanhMono;
src: url(file://{{ BASEDIR }}/static/fonts/XanhMono-Regular.woff2) format('woff2'),
url(file://{{ BASEDIR }}/static/fonts/XanhMono-Regular.woff) format('woff'),
url(file://{{ BASEDIR }}/static/fonts/XanhMono-Regular.ttf) format('truetype');
font-weight: 400;
font-style: normal;
}
@font-face {
font-family: XanhMono;
src: url(file://{{ BASEDIR }}/static/fonts/XanhMono-Italic.woff2) format('woff2'),
url(file://{{ BASEDIR }}/static/fonts/XanhMono-Italic.woff) format('woff'),
url(file://{{ BASEDIR }}/static/fonts/XanhMono-Italic.ttf) format('truetype');
font-weight: 400;
font-style: italic;
}
html, body {
font-family: XanhMono;
font-size: 8.15pt;
line-height: 12pt;
}
body {
margin-left: 9rem;
}
input, select, option, button {
font: inherit;
}
h1 {
font-size: 18pt;
line-height: 26pt;
margin-bottom: 12pt;
}
h2 {
font-size: 14pt;
line-height: 18pt;
break-before: page;
margin-bottom: 12pt;
}
h3 {
font-size: 10pt;
line-height: 12.5pt;
break-before: page;
margin-top: 6pt;
margin-bottom: 6pt;
}
h3.extra-space {
margin-top: 25pt;
}
h1 + h2,
h2 + h3,
.avoid-break {
break-before: avoid;
}
a {
color: currentColor;
}
p, ul, ol {
max-width: 40rem;
margin-bottom: 12pt;
}
h1, h2, h3 {
max-width: 35rem;
}
pre {
font-style: italic;
margin-left: -9rem;
margin-top: 12pt;
}
blockquote {
font-style: italic;
}
footer {
font-style: normal;
font-size: 7pt
}
pre.normal-flow {
margin-left: 0;
}
.two-col {
columns: 2;
margin-left: -8rem;
column-fill: auto;
}
.two-col pre {
margin-left: 0;
margin-top: 0;
orphans: 3;
widows: 3;
}
ul, ol {
margin-top: 12pt;
}
li {
position: relative;
}
ol {
counter-reset: list-counter 0;
}
ol li {
counter-increment: list-counter 1;
}
ol > li {
margin-bottom: 12pt;
}
ol li:before {
position: absolute;
left: -1.15em;
content: counter(list-counter) '.';
}
ol ol {
margin-left: 2em;
counter-reset: sublist-counter;
}
ol ol li {
counter-increment: sublist-counter 1;
}
ol ol li:before {
left: -2.3em;
content: counter(list-counter) '.' counter(sublist-counter);
}
ul li:before {
content: '';
position: absolute;
left: -1em;
}
@page {
margin: 15mm 15mm;
}
@page:left {
@bottom-left {
content: counter(page);
}
}
@page:right {
@bottom-right {
content: counter(page);
}
}
@page:first {
@bottom-right {
content: '';
}
@bottom-left {
content: '';
}
}
</style>
</head>
<body>
<!--title -->
<pre>
{{ fragment_cover_map }}
</pre>
<h1>Levenshtein Distance<br>reads Cortázar</h1>
<p>
Generated on {{ date }} at {{ time }}, N⁰ {{ edition_count}}
</p>
<!--index -->
<h2>Index</h2>
<ol>
<li>Introduction</li>
<li>Reading Cortázar
<ol>
<li>Original fragment</li>
<li>Adapted fragment</li>
<li>Map of the woods</li>
<li>Table of intermediary species</li>
<li>Repetitive poetry</li>
</ol>
</li>
<li>General description of the Levenshtein Distance</li>
<li>Technical description of the Levenshtein Distance</li>
<li>Code</li>
<li>Credits</li>
</ol>
<!--introduction -->
<h2>1. Introduction</h2>
<p>Levenshtein Distance reads Cortázar is the first version of the first book in the 'Algoliterary Publishing House: making kin with trees'.</p>
<p>The author of this book is the algorithm <a href ="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein Distance</a>, the subject is the eucalyptus in "Fama and eucalyptus", a fragment of <a href ="https://es.wikipedia.org/wiki/Historias_de_cronopios_y_de_famas">Cronopios and Famas</a> by <a href ="https://en.wikipedia.org/wiki/Julio_Cort%C3%A1zar">Julio Cortázar</a>.</p>
<p>The versions of the book are infinite by definition and each copy is unique.</p>
<p><a href = "https://www.anaisberck.be">Anaïs Berck</a> is a pseudonym and represents a collaboration between humans, algorithms and trees. <a href = "https://www.anaisberck.be">Anaïs Berck</a> explores the specificities of human intelligence in the company of artificial and plant intelligences. In June 2021, during a residency at Medialab Prado in Madrid, Anaïs Berck will develop a prototype of an Algoliterary Publishing House, in which algorithms are the authors of unusual books. The residency was granted by the "Residency Digital Culture" programme initiated by the Flemish Government.</p>
<p>In this work Anaïs Berck is represented by:</p>
<ul>
<li>the algorithm <a href ="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein Distance</a> of which you find a description in this book,</li>
<li>the eucalyptus in <a href ="https://es.wikipedia.org/wiki/Historias_de_cronopios_y_de_famas">Cronopios and Famas</a> by <a href ="https://en.wikipedia.org/wiki/Julio_Cort%C3%A1zar">Julio Cortázar</a>, published in 1962 by Editorial Minotauro, English edition, 1999, New Directions Classic,</li>
<li>the human beings An Mertens and Gijs de Heij. An has published several books, as a fiction writer and as an artist and researcher at <a href="https://constantvzw.org/">Constant</a>, an organisation for experimental art and media in Brussels of which she has been a member since 2008. Gijs is a programmer and designer, part of <a href="http://osp.kitchen/">Open Source Publishing</a>, a collective of designers in Brussels. Both are members of <a href="https://algolit.net/">Algolit</a>, an artistic experimentation group in Brussels around algorithms and free texts.</li>
</ul>
<!--Reading Cortazar -->
<h2>2. Reading Cortázar</h2>
<!--Reading Cortazar -original text -->
<h3>2.1. Original fragment</h3>
<blockquote>
<p>
A fama is walking through a forest, and although he needs no wood he gazes greedily at the trees. The trees are terribly afraid because they are acquainted with the customs of the famas and anticipate the worst. Dead center of the wood there stands a handsome eucalyptus and the fama on seeing it gives a cry of happiness and dances respite and dances Catalan around the disturbed eucalyptus, talking like this:
</p>
<p>
— Antiseptic leaves, winter with health, great sanitation!
</p>
<p>
He fetches an axe and whacks the eucalyptus in the stomach. It doesn't bother the fama at all. The eucalyptus screams, wounded to death, and the other trees hear him say between sighs:
</p>
<p>
— To think that all this imbecile had to do was buy some Valda tablets.
</p>
<footer>Cronopios and Famas by <a href ="https://en.wikipedia.org/wiki/Julio_Cort%C3%A1zar">Julio Cortázar</a>, published in 1962 by Editorial Minotauro, English edition, 1999, New Directions Classic.</footer>
</blockquote>
<!--Reading Cortazar - rewritten text -->
<h3 class="avoid-break extra-space">2.2. Adapted fragment</h3>
<!--OUTPUT SCRIPT -->
<pre class="normal-flow">{{ new_fragment }}</pre>
<!--Reading Cortazar - map of the woods -->
<h3>2.3. Map of the woods</h3>
<p>The distances between the main tree you have chosen for your fragment and the areas of other tree species in the woods, according to Levenshtein Distance:</p>
<!--OUTPUT SCRIPT -->
<pre>
{{ forest_map }}</pre>
<!--Reading Cortazar - table of intermediary species-->
<h3>2.4. Table of intermediary species</h3>
<p>The Levenshtein Distance creates a table with the two species. In this table it calculates for each cell the distance between the distinct elements of the two words.</p>
<p>The table is filled with numbers that represent the operations necessary to change one element to another. The possible operations are inserting, deleting or substituting a letter. Instead of numbers, this table is filled with the various intermediary species that the algorithm creates by inserting, deleting or substituting letters.</p>
<!--OUTPUT SCRIPT -->
<pre>
{{ table_of_intermediary_species }}</pre>
<!--Reading Cortazar - repetitive poetry -->
<h3>2.5. Repetitive poetry</h3>
<!--OUTPUT SCRIPT -->
<div class="two-col">
<pre>{{ repetitive_poetry }}</pre>
</div>
<!--General description algorithm-->
<h2>3. General description of the Levenshtein Distance</h2>
<p>Levenshtein Distance is an algorithm that measures the difference between two words or two groups of letters. It is also called the 'edit distance'. The Levenshtein Distance between two words is the minimum number of actions needed to change one word to another. The different possible actions are insertion, deletion or substitution of a single letter. For example, the Levenshtein Distance between 'end' and 'and' is 1, as 'e' is replaced by 'a'.</p>
<p>The algorithm was named after its creator, Vladimir Levenshtein, a Russian mathematician and scientist of Jewish origin whose main area of research was information theory and error-correcting codes. He worked at the Keldysh Institute of Applied Mathematics in Moscow. He passed away in 2017 at the age of 82. He introduced the algorithm in 1965 'to consider the problem of constructing optimal codes capable of correcting deletions, insertions and inversions'.</p>
<p>The Levenshtein Distance operates in software such as spell checkers and consequently in computer-assisted translation programs. The Levenshtein Distance can also be found in search engines where it detects the words most similar to the wrongly entered word.</p>
<p>Its activity extends to less obvious fields such as plagiarism detection, DNA analysis, automatic voice recognition, optical character recognition in scanned text analysis (OCR), handwriting recognition, hoax email detection or stock market sales and purchase assistance.</p>
<p>Sometimes the Levenshtein Distance leads to surprising discoveries. In 1995, for example, Kessler applied the algorithm to the comparison of Irish dialects. He showed that it was a successful method for measuring phonetic distances between dialects. From the linguistic distances between dialectal varieties, dialectal areas can be found. More innovative was the possibility to draw dialect maps reflecting the fact that dialect areas should be considered as continuous and not as areas separated by sharp boundaries.</p>
<p>Sources:</p>
<ul>
<li>Vladimir Levenshtein, <a href="https://nymity.ch/sybilhunting/pdf/Levenshtein1966a.pdf">Binary codes capable of correcting deletions, insertions, and reversals</a>; Cybernetics and Control Theory, vol. 10 nr. 8, February 1966.</li>
<li><a href = "https://en.wikipedia.org/wiki/Vladimir_Levenshtein">Vladimir Levenshtein</a> + <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein Distance</a> on Wikipedia.</li>
<li>RYBN, ADMXI, <a href = "http://www.rybn.org/ANTI/ADMXI/documentation/ALGORITHM_DOCUMENTATION/HARMONY_OF_THE_SPEARS/LEVENSHTEIN_EDIT_DISTANCE/ABOUT/Wikipedia_Levenshtein_Edit_Distance.pdf">Levenshtein Edit Distance</a>.</li>
<li>Abhi Dattasharma, Praveen Kumar Tripathi and Sridhar G, <a href="http://www.rybn.org/ANTI/ADMXI/documentation/ALGORITHM_DOCUMENTATION/HARMONY_OF_THE_SPEARS/LEVENSHTEIN_EDIT_DISTANCE/FINANCIAL_USES/2008_Identifying_Stock_Similarity_Based_on_Multi-event_Episodes.pdf">Identifying Stock Similarity Based on Multi-event Episodes</a>, in Seventh Australasian Data Mining Conference, 2008, Glenelg, Australia.</li>
<li>S. Dutta Chowdhury; U. Bhattacharya; S.K. Parui, Online Handwriting Recognition Using Levenshtein Distance Metric, in 12th International Conference on Document Analysis and Recognition, 2013, USA.</li>
<li>Yoke Yie Chen, Suet-Peng Yong, and Adzlan Ishak, Email Hoax Detection System Using Levenshtein Distance Method, in Journal of Computers, vol. 9, nr 2, February 2014.</li>
<li>Charlotte Gooskens and Wilbert Heeringa, <a href ="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.414.9927&amp;rep=rep1&amp;type=pdf">Perceptive evaluation of Levenshtein dialect distance measurements using Norwegian dialect data</a>, in Language Variation and Change, nr. 16, 2004, 189-207. Cambridge University Press.</li>
</ul>
<!--Technical description algorithm-->
<h2>4. Technical description of the Levenshtein Distance</h2>
<p>Humans can rewrite a word and easily count the number of changes that are necessary to transform one word into another. We invite you to write the word 'machine' on a sheet of paper, followed by the word 'human' on the next line. Knowing that you can only insert, delete or replace one letter, how many operations would you need to do to rewrite the word 'machine' into 'human'?</p>
<p>To give you an idea of how the Levenshtein Distance algorithm works, we describe here the different steps the algorithm takes to transform the word machine into human.</p>
<p>For the word 'machine', the algorithm first analyses the possible elements. They are m, ma, mac, mach, machi, machin and machine for a total of seven elements. For the word human, the elements are h, hu, hum, huma and human for a total of five elements. This creates a matrix with 7 rows and 5 columns. In this distance matrix it will calculate for each cell the distance between the elements of the two words.</p>
<p>It starts with the first element of the word machine which is m, compares it with the five elements of the word human. The first one will be h. What is the Levenshtein distance between m and h? What it has to do is to replace the character m by h, thus the distance is 1.</p>
<p>Then it moves on to the next element of the word human which is hu. What is the Levenshtein distance between m and hu? As m contains only one character and hu contains more than one character, you can be 100% sure that you have to insert a new character. To transform m into hu, first the character m is replaced by h, and then u is added. To transform m into hu, the distance is 2.</p>
<p>Now it moves on to the third element. What is the distance between m and hum? This time the character m is already present at the end of hum, so nothing has to be replaced: it only inserts h and u in front of m. The distance is 2.</p>
<p>It continues until it has calculated the distance between the first element of the word machine, m, and the 5 elements of the second word human. The distances are 1, 2, 2, 3 and 4: the value stays at 2 for hum because that element already contains the letter m.</p>
<p>After calculating the distances between the first element of the first word and all the elements of the second word, the process continues by calculating the distances between the remaining elements of the first word and the elements of the second word.</p>
<p>The process continues with ma. It compares ma to the five elements of the word human. The first one will be h. What is the Levenshtein distance between ma and h? What it has to do is replace the character m with h and delete the character a. Thus the distance is 2. It moves on to the next element of the word human which is hu. What is the Levenshtein distance between ma and hu? It replaces the character m with h and the letter a with u. The distance is 2.</p>
<p>That is how the table is filled up.</p>
<p>In code terms one could speak of an optimising effect on the table.</p>
<p>The value of each cell is calculated from its three nearest neighbours in the table (the cell to its left, the cell above it and the cell diagonally above and to the left) together with the two characters being compared.</p>
<p>If the letters are the same, the value of the diagonal neighbour is copied unchanged.</p>
<p>If the letters are different, the lowest value of the three is chosen and 1 is added.</p>
<p>The last value in the table of counts is the minimum distance between the 2 words.</p>
<p>In the table it is the value situated in the lower right corner.</p>
<p>Ultimately it is a matter of tracing the shortest path in the transformations from one word to the other:</p>
<ul>
<li>the diagonal path dealing with different characters represents substitution</li>
<li>the diagonal path dealing with similar characters does not represent any change</li>
<li>the path towards the left represents an insertion</li>
<li>the path upwards represents a deletion</li>
</ul>
<p>Source: <a href = "https://blog.paperspace.com/measuring-text-similarity-using-levenshtein-distance/">Blog Paperspace</a></p>
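<p>As an illustration of the table-filling steps described in this chapter, here is a minimal JavaScript sketch. It is not the code that generated this book, which is listed in chapter 5; the function names levenshteinTable and levenshtein are hypothetical and chosen only for this example. Rows stand for the elements of the first word and columns for the elements of the second word, and the sketch assumes one extra row and one extra column for the empty word.</p>
<pre class="normal-flow">
```javascript
// Illustrative sketch only; names and layout are assumptions, not the book's own code.
function levenshteinTable(source, target) {
  // table[i][j] holds the distance between the first i letters of source
  // and the first j letters of target.
  const table = [];
  for (let i = 0; i <= source.length; i++) {
    table.push([i]); // deleting i letters turns an element into the empty word
  }
  for (let j = 1; j <= target.length; j++) {
    table[0].push(j); // inserting j letters builds an element from the empty word
  }
  for (let i = 1; i <= source.length; i++) {
    for (let j = 1; j <= target.length; j++) {
      const same = source[i - 1] === target[j - 1];
      table[i][j] = Math.min(
        table[i - 1][j] + 1,                 // deletion: cell above
        table[i][j - 1] + 1,                 // insertion: cell to the left
        table[i - 1][j - 1] + (same ? 0 : 1) // diagonal: no change or substitution
      );
    }
  }
  return table;
}

function levenshtein(source, target) {
  const table = levenshteinTable(source, target);
  // The distance between the two full words sits in the lower right corner.
  return table[source.length][target.length];
}
```
</pre>
<p>Under these assumptions, levenshtein('end', 'and') returns 1 and levenshtein('machine', 'human') returns 6.</p>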
<!--Code-->
<h2>5. Code</h2>
<!--OUTPUT SCRIPT -->
{% for path, source in sources %}
<h3>{{ path }}</h3>
<pre>{{ source }}</pre>
{% endfor %}
<!--Credits-->
<h2>6. Credits</h2>
<p>This book is a creation by <a href = "https://www.anaisberck.be/">Anaïs Berck</a> for <a href = "https://www.tabakalera.eus/es/agora-cemento-codigo">ÁGORA / CEMENTO / CÓDIGO</a>, a project of <a href ="https://www.tabakalera.eus/es/lekutan">Asociación Cultural LEKUTAN</a> in the <a href = "https://www.tabakalera.eus">International Centre for Contemporary Culture Tabakalera</a>, Donostia / San Sebastián.</p>
<p>Each copy of this book is unique and the print run is infinite by definition.</p>
<!--OUTPUT SCRIPT -->
<p>This copy is number {{ edition_count }} of all downloaded copies.</p>
<p>Collective conditions of (re)use (CC4r), 2021</p>
<p>Copyleft with a difference: You are invited to copy, distribute, and modify this work under the terms of the CC4r: <a href = "https://gitlab.constantvzw.org/unbound/cc4r">https://gitlab.constantvzw.org/unbound/cc4r</a></p>
</body>
</html>