Louis van Beurden's blog

Waste management in the smart city

With the proliferation of connected objects, progress in automatic categorization, better metadata systems and the emerging hegemony of platform states, we can glimpse a city that operates with greater collective intelligence. In this blog post, we study the case of waste treatment.

The control and transparency of supply chains are a major issue today. Their counterpart, waste treatment chains, should also benefit from greater transparency, for example to increase our capacity to innovate in recycling techniques.
The real cost of waste is hard for citizens to estimate. Is my waste put to use in another value chain? Or is it transported to a landfill several thousand kilometres away? What is its ecological impact? The answers to these questions are indispensable signals if citizens are to forge a collective ecological awareness.

IoT in the service of waste management

The connected city is equipping itself with a communication infrastructure unprecedented in the history of urban planning. A network of small computers fitted with sensors and actuators communicate, exchange data and act in coordination.

Everyday objects involved in waste treatment can join this new dance:

Connected bins report their fill level, the types of waste they hold, and spatiotemporal access information. Waste collection companies use this information to coordinate their rounds.
Surveillance cameras can be used to recognize litter on the ground, in order to produce a litter density map.
The self-referential ecosystem created by the crowd of machines can make it possible to locate waste in the photos that citizens post on social networks.
Automated waste sorting warehouses should be set up according to the needs of businesses. Some waste still has a lot of value for certain actors, but the cost of the treasure hunt in landfills is often too high.

For new machines to be integrated into this ecosystem as efficiently as possible, every element, from the household bin and the dumpster, through the collection truck and the dump, to the treatment plant, must expose a common interface that can be queried in real time.

The information produced by these objects makes it possible to optimize the routes of maintenance workers. Priority collection zones can be computed, and statistical models can be trained on the data from these sensors. These systems will be integrated with the other services of the smart city; for example, collection around the stadium will be prioritized after a sporting event.
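As a minimal sketch of how priority collection zones could be computed from bin readings, the following ranks zones by their mean fill level and lets another city service (here, an event calendar) boost a zone. The names BinReading and prioritize_zones, the zone labels and the boost value are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinReading:
    """One reading from a connected bin (hypothetical structure)."""
    bin_id: str
    zone: str
    fill_level: float  # 0.0 = empty, 1.0 = full


def prioritize_zones(readings, boosted_zones=(), boost=0.2):
    """Rank zones by mean fill level, highest first.

    Zones listed in boosted_zones (e.g. around a stadium after a match)
    get an additive bonus; the bonus value is an arbitrary assumption.
    """
    totals = {}
    for r in readings:
        total, count = totals.get(r.zone, (0.0, 0))
        totals[r.zone] = (total + r.fill_level, count + 1)

    scores = {}
    for zone, (total, count) in totals.items():
        score = total / count
        if zone in boosted_zones:
            score += boost
        scores[zone] = score
    return sorted(scores, key=lambda z: scores[z], reverse=True)


readings = [
    BinReading("b1", "stadium", 0.6),
    BinReading("b2", "stadium", 0.7),
    BinReading("b3", "old_town", 0.8),
    BinReading("b4", "harbour", 0.3),
]
print(prioritize_zones(readings))
# ['old_town', 'stadium', 'harbour']
print(prioritize_zones(readings, boosted_zones={"stadium"}))
# ['stadium', 'old_town', 'harbour']
```

A real deployment would replace the mean with a trained statistical model, but the ranking interface could stay the same.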

However, a major risk threatens this ecosystem, the same one that stunts the semantic web: semantic incompatibility between machine interfaces. Moreover, waste evolves with our way of life, as do our sorting and recycling techniques, and our descriptions must follow these changes. This is why it is essential to adopt a description (metadata) system flexible enough to adapt to change.

A common machine-to-machine and machine-to-human interface

Defining waste is not an easy task: it evolves with our practices and our consumption patterns. The automated perception of the smart city, that is, the full set of sensor activations together with their categories, must be able to change according to our needs.

It is therefore essential to adopt a semantic technology that is:

Powerful enough to represent the entire waste treatment chain.
Collaboratively evolvable, in order to keep up with changes in the world.
Tolerant of ambiguity: new ideas are not always well understood.
Understandable by everyone without effort.
Able to present the facets relevant to each consumer of the metadata.

Here is an example of the types of metadata relevant to a smart bin, from the point of view of the collection service:

fill level
waste type
owner
collection history
spatiotemporal information on access to the object
technical specification of the bin
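The list above can be sketched as a small record type, with one possible way of presenting a different facet of the same metadata to each consumer. This is only an illustration: the class SmartBinMetadata, the FACETS table, the function facet_for and every sample value are hypothetical.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SmartBinMetadata:
    """Metadata of a smart bin, following the list above (hypothetical)."""
    fill_level: float                  # 0.0 = empty, 1.0 = full
    waste_types: list
    owner: str
    collection_history: list = field(default_factory=list)  # ISO timestamps
    access_info: dict = field(default_factory=dict)         # where and when
    technical_spec: dict = field(default_factory=dict)


# Which fields each kind of consumer sees (an assumed policy).
FACETS = {
    "collection_service": (
        "fill_level", "waste_types", "owner",
        "collection_history", "access_info", "technical_spec",
    ),
    "citizen": ("fill_level", "waste_types"),
}


def facet_for(metadata, consumer):
    """Return only the fields relevant to the given consumer."""
    data = asdict(metadata)
    return {name: data[name] for name in FACETS[consumer]}


bin_meta = SmartBinMetadata(
    fill_level=0.75,
    waste_types=["glass"],
    owner="example owner",
    collection_history=["2020-01-06T07:30:00"],
    access_info={"hours": "06:00-22:00"},
    technical_spec={"volume_litres": 240},
)
print(facet_for(bin_meta, "citizen"))
# {'fill_level': 0.75, 'waste_types': ['glass']}
```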

These metadata must be easy to consult for those who have the right to do so, and must not be locked inside a single platform. Reading and writing them must nevertheless happen over the communication channels most natural to each actor (e.g. social networks for the citizen-consumer, a REST API for the collection coordination service, etc.).

A precise and evolvable categorization will make it possible to describe the entire waste treatment chain. Defining objects and processes in such a language, IEML for example, also opens new markets: these different services can be made explicit in smart contracts. Indeed, collection, sorting, storage, cleaning of public spaces, recycling, incineration and export must be transparent operations and, where possible, should be “liberalized” through the deployment of a registry of smart contracts, which should ultimately lead to a waste market.

With this infrastructure, the real cost of waste can then be estimated. This cost must be as transparent as possible for citizens, so that they can know the impact of their actions.

Social networks as the preferred exchange platform

Social networks are the new public square. Citizens must be able to interact with the entire waste treatment chain through social interactions on these networks. In particular, they must be able to:

Contact and interact with all the actors of the treatment chain. These actors may be human officials or machines.
Request specific personal or public information (automatic report), trigger administrative procedures (non-standard collection, equipment problem, change of address, etc.) and make specific exceptional requests.

Finally, our means of communication condition the size of our societies, and the algorithmic medium makes it possible, through stigmergic conversation, to propose, compare and collectively adopt durable representations at a scale never reached before. In a world of frequent and abrupt change, it is essential to build the right tools to meet our collective challenges.

The functions of the IEML database

The IEML database records USLs (IEML expressions) together with their metadata.

In this blog post I will describe the different roles of this database.

Centralize the language’s knowledge

The main function of the IEML database is to collect past interpretations of IEML expressions, for documentation purposes, to ensure consistency of interpretations and to build statistically significant training sets.

An example of a USL: “to fly off”, or “to go up in the air freely”

Record the described USLs

The database entries are the described USLs, i.e. a USL (an IEML expression) with a set of metadata, called descriptors. These descriptors can exist in all languages (in French and English for the moment, identified by ISO 639), and can take several distinct types of values:

A set of translations: a verbatim example of a use of the concept of the USL in a particular language. This field can be shared by several USLs if there are homonyms. In this case, we try to disambiguate the translation with indications of context (in brackets). Ex: “to fly off (movement)”
A definition/comment: a defining sentence that helps to find the meaning of the USL. This field must be unique in the whole database: there cannot be two IEML expressions with the same meaning. This field is used to document the USLs, and to justify their particular construction. Ex: “to fly off” : “to go up in the air freely”
A set of tags: This field is used to organize IEML expressions in the database editor.

All the entries in the database are described USLs. However, the list of descriptors is open to new metadata types, such as wikidata (or wiktionary) URIs, links to resources (e.g. definition by example, photo), or other resources that help ground the meaning of the USLs.
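A described USL as outlined above could be modelled roughly as follows. This is a sketch, not the actual schema of the database: the class DescribedUSL, its field names and the helper label are hypothetical, and the usl value is a placeholder rather than a real IEML expression.

```python
from dataclasses import dataclass, field


@dataclass
class DescribedUSL:
    """A USL with its descriptors, keyed by ISO 639 language code."""
    usl: str                                          # the IEML expression
    translations: dict = field(default_factory=dict)  # lang -> list of str
    comments: dict = field(default_factory=dict)      # lang -> definition
    tags: dict = field(default_factory=dict)          # lang -> list of str


def label(entry, lang):
    """Return the first translation in lang, or the raw USL if none exists."""
    candidates = entry.translations.get(lang, [])
    return candidates[0] if candidates else entry.usl


entry = DescribedUSL(
    usl="<placeholder for the IEML expression>",
    translations={"en": ["to fly off (movement)"]},
    comments={"en": "to go up in the air freely"},
)
print(label(entry, "en"))  # to fly off (movement)
print(label(entry, "fr"))  # falls back to the placeholder USL
```

Keeping each descriptor type in its own per-language mapping leaves room for the additional metadata types mentioned above without changing existing entries.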

Maintaining language coherence

The meaning of a text, its interpretation, can vary from one person to another, depending on many factors, such as culture and native language. The same is true for a USL, which is interpreted as a small text, and is therefore subject to the interpretative variation of different individuals. This problem brings us to the second advantage of gathering all these USLs in the same database with their descriptors: having a unified repository where consistency tests can be performed.

Indeed, in IEML, one can easily compute semantic relations between concepts from their scripts, and thus represent the database as a graph of semantic relationships. It then becomes possible to perform computations on the set of concepts as a system (a language from a synchronic point of view).

The database coherence comes from the alignment between the network of semantic relationships of the USLs and the observed network of semantic relationships of their descriptors. Since this property is not always automatically computable, we have designed a set of tests to verify the alignment properties. These properties, if checked together, will best guarantee the consistency of the database. Briefly, these properties relate to:

The match between the semantic proximity (in natural language) of the translations and the proximity between the USLs. The syntactically close USLs must be semantically close.
The correct use of the IEML grammar (respecting the composition, transitivity and commutativity of operations…) as well as of the morphemes’ meanings. The same IEML constructions must have the same meaning in different contexts.
The guarantee that none of the USLs have the same meaning, as there are no synonyms or homonyms in IEML.
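The last property in the list, that no two USLs share a meaning, can be approximated mechanically by checking that no definition is attached to more than one USL. The sketch below is one possible such check, not the test suite of the actual database; the function name, the input format and the placeholder USL strings are hypothetical.

```python
from collections import defaultdict


def find_duplicate_definitions(entries):
    """Find definitions shared by more than one USL.

    entries is an iterable of (usl, language, definition) triples.
    Definitions are compared per language, ignoring case and extra
    whitespace. Returns {(language, definition): [usl, ...]}.
    """
    usls_by_definition = defaultdict(set)
    for usl, language, definition in entries:
        normalized = " ".join(definition.lower().split())
        usls_by_definition[(language, normalized)].add(usl)
    return {
        key: sorted(usls)
        for key, usls in usls_by_definition.items()
        if len(usls) > 1
    }


entries = [
    ("<usl-1>", "en", "to go up in the air freely"),
    ("<usl-2>", "en", "To go up in the  air freely"),
    ("<usl-3>", "en", "to come down"),
]
print(find_duplicate_definitions(entries))
# {('en', 'to go up in the air freely'): ['<usl-1>', '<usl-2>']}
```

An exact-match check like this only catches literal duplicates; two differently worded definitions of the same meaning still require human review.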

These consistency checks are an integral part of the database editing process. I will detail this editing process, as well as the properties and the consistency tests, in a future blog post.

Enabling the evolution of the language

The uses of a particular language change with people and times. To give IEML a chance to become part of the collective life, mechanisms for flexible editing of the database must be put in place.

This is why we chose, to support the database, a versioning tool that promotes collaboration: git. This tool saves the different versions of the language and maintains editable copies of the database (branches), so that the conceptual structure can be modified without impacting all the users.

Recording successive language versions

As mentioned earlier, git records all the intermediate states of the database and thus gives a diachronic view of the language. Each new state in the database is named by a “hash” (e.g., 9af91324f0adf…) and can therefore be referenced in applications as a particular state in the history of the database’s conceptual graph.
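To illustrate why a hash can name a state, here is a small sketch of how git derives the identifier of a file’s content (a “blob”), assuming git’s default SHA-1 object format: the hash covers a short header followed by the content itself. Commit hashes are computed in the same way over a commit object, which in turn references the full content tree and the parent commits. The function name git_blob_id is hypothetical.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object id git assigns to a file's content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The same content always yields the same identifier...
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
# ...and any change to the content yields a different one.
print(git_blob_id(b"a") != git_blob_id(b"b"))  # True
```

Because the identifier depends only on content, an application that stores a hash refers to exactly one state of the database.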

Example of a database modification graph. Each circle represents a modification and also a state of the database; each modification is identified by a commit hash, a message and an author. The branch mechanism allows users to modify the database without impacting the other users.

Enable collaborative editing

The database is hosted on the GitHub platform, allowing users to take advantage of its ecosystem of collaborative tools. For example, errors can be reported through “issues” (a forum to describe, discuss and find solutions to bugs), and the database can be “forked” by a GitHub user (to make a branch) and the changes then proposed as a pull request (to make a merge).

The editorial process integrates all the steps of database editing via GitHub.

Disseminate the language

A language lives only when it is used. To encourage IEML adoption, we organized the database as a tool supporting the collaborative learning and dissemination of the language.

Produce a learning and research resource

The database, designed to contain hundreds of thousands of IEML concepts, is open-source and easily navigable with the Intlekt editor. Novices and autodidacts alike, when learning the IEML grammar, can draw on a large number of examples here.

Encouraging the use of IEML in a production context

The USL database is intended to serve as an open semantic repository for diverse applications. The format is very simple (scv: space separated values), and its content is easily downloadable and readable with any programming language. Moreover, a Python API is available to manipulate the database.
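As a rough illustration of how little is needed to read such a file, the sketch below parses space-separated values with Python’s standard csv module. It does not use the IEML Python API, and the column names, the quoting convention and the sample rows are assumptions made for the example, not the actual layout of the database files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: one descriptor per row, values with spaces quoted.
SAMPLE = '''usl language descriptor value
"<usl-1>" en translation "to fly off (movement)"
"<usl-1>" en comment "to go up in the air freely"
'''


def load_descriptors(stream):
    """Group descriptor rows by USL from a space-separated stream."""
    by_usl = defaultdict(list)
    for row in csv.DictReader(stream, delimiter=" "):
        by_usl[row["usl"]].append(
            (row["language"], row["descriptor"], row["value"])
        )
    return dict(by_usl)


descriptors = load_descriptors(io.StringIO(SAMPLE))
for language, kind, value in descriptors["<usl-1>"]:
    print(language, kind, value)
```

With a real file, io.StringIO(SAMPLE) would be replaced by an open file handle.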

I have just described the main functions of the database. In a future post I will talk about its editing, through the process of modification by a user, as well as consistency testing. In this regard, a future version of the IEML library will integrate an error correction mechanism that will transparently correct the end user’s consistency errors.

Conclusion of the IEML seminar

This fall, Pierre Lévy, the inventor of the IEML language, gave the first workshop devoted to it at the University of Montreal. This event could be organized thanks to the investment of the philosopher and holder of a research chair in digital textualities, Marcello Vitali Rosati.

This workshop was the opportunity for Pierre Lévy to present his work on the IEML grammar and dictionary to researchers from different fields of expertise who are aware of the challenges of the semantic web. IEML is part of the field of digital humanities research, which seeks to transcend the processes of writing and publishing articles in the human sciences by making the best use of computer tools. This event gathered different profiles, from philosophy and literacy to deep learning NLP researchers. Some people from industry (NLP and HR) also attended the workshop.

This allowed Pierre and me to confront our work with their perspectives, and to reflect on new possibilities. I have summarized the main achievements of the workshop in the following list:

The seminar began with a detailed explanation of the IEML grammar. All the sessions were recorded in order to start building an online learning resource database (only in French for the moment).
The participants were convinced of IEML’s expressive capability. Pierre demonstrated the writing of words in IEML and the power of subtle nuance that this language can offer. Participants also had the opportunity to write words to convince themselves of the reproducibility of these results.
The Intlekt tool for editing IEML expressions, which I designed, has been tested by a larger audience. Users were enthusiastic about using it, and their feedback allowed me to improve the UI. I have had positive feedback on the quality of the tool.

“Constitutional monarchy” written in IEML by Pierre Lévy with dev.intlekt.io

This workshop was a success: the community of IEML enthusiasts has grown, and we have started new collaborations on exciting research topics:

Marcello Vitali Rosati wants to be involved in disseminating writing skills in IEML to researchers in the humanities. A first version of the tag lexicon (~200 tags) of the journal Sens Public, which he directs, was translated into IEML, and a project to translate the tag lexicon of the article search engine Isidore is planned. Marcello also wants to integrate IEML into his innovative article editing application Stylo.
Vincent Letard, a researcher in natural language processing, wishes to explore the use of formal analogies on IEML expressions. The semantics and syntax of IEML expressions being the same, this approach may open the way to computing semantic analogies in a deterministic manner.
David Alfonso Hermelo, an NLP researcher and linguist, wishes to explore methods of interfacing the IEML database with wiktionary.
Nicolas Chausseau, an NLP researcher, wants to explore the use of statistically learned deep language models to augment the IEML database.
We also discussed with Emmanuel Chateau Dutier the definition of a standard that gives a unique IRI to each IEML expression, in order to be able to use IEML with Semantic Web technologies.

Vincent Letard, David Alfonso Hermelo and Louis van Beurden working on formal analogies.

This workshop was a great success and we are likely to repeat the initiative next year.