Patrick Paroubek, LISN-Laboratoire Interdisciplinaire des Sciences du Numérique, CNRS-U.
Information extraction and aspect-based sentiment analysis (ABSA)
Patrick Paroubek of the LISN laboratory (CNRS-U.Paris-Saclay) will share lessons learned from ongoing research at the intersection of Natural Language Processing (NLP), economics, and finance. He will focus on questions related to information extraction and aspect-based sentiment analysis (ABSA), in the context of two doctoral theses currently in progress. The talk will be an opportunity to review which resources exist for the French language and what progress can be expected from recent advances in deep neural learning.
Wissal El Achouri, Algoan
Credit decisioning: enrich Open Banking data using NLP
Since January 2018, an EU directive called PSD2 has required banks to provide access to their data in a secure and standardised way, so that customers can share it more easily. This directive is what enables Open Banking in Europe.
Open Banking data is a revolution for many financial services, and especially for lending. Algoan leverages Open Banking to offer financial institutions a credit decisioning API that helps them make sound credit decisions on consumer loans.
In this talk, we focus on one of the enrichments that Algoan has been able to bring to Open Banking data: the categorisation of transactions. By categorisation, we mean the process that associates a bank transaction with a category. A category describes the reason why a transaction was executed.
Categorising transactions is a necessary step for making automatic and accurate credit decisions.
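To make the task concrete, here is a minimal Python sketch of what associating a transaction with a category can look like. It is an illustrative keyword baseline only, not Algoan's engine: the `RULES` table, the category names, the sample transaction labels and the `normalise`/`categorise` helpers are all hypothetical.

```python
import re

# Hypothetical keyword rules: category -> keyword prefixes.
# Both the categories and the keywords are illustrative assumptions.
RULES = {
    "rent": ("rent", "landlord"),
    "salary": ("salary", "payroll"),
    "groceries": ("supermarket", "grocery"),
}


def normalise(label: str) -> str:
    """Lowercase the label, drop everything except a-z, collapse whitespace."""
    label = label.lower()
    # Digits, dates and punctuation carry little category signal in this sketch.
    # Note: accented letters are dropped too; a real system would need more care.
    label = re.sub(r"[^a-z\s]", " ", label)
    return " ".join(label.split())


def categorise(label: str) -> str:
    """Return the first category with a keyword that prefixes some token."""
    tokens = normalise(label).split()
    for category, keywords in RULES.items():
        if any(tok.startswith(kw) for tok in tokens for kw in keywords):
            return category
    return "unknown"


if __name__ == "__main__":
    for raw in ("CARD 1234 SUPERMARKET XYZ 03/05", "ATM WITHDRAWAL 0042"):
        print(raw, "->", categorise(raw))
```

A rule table like this is easy to read but covers few transactions and does not transfer across languages, which is the kind of limitation a learned categorisation engine is meant to address.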
Transaction categorisation is an NLP task, but it differs from most other NLP tasks in that the text attached to a transaction is not structured like natural human language. It also raises several challenges:
– the selection and labelling of a high volume of data,
– the design of a high-performing categorisation engine that covers as many transactions as possible,
– the development of an efficient maintenance system to preserve a high level of precision in production,
– and ensuring that the entire pipeline (labelling/training/deploying/monitoring) scales internationally to other languages.
During this talk, we will explain the process we adopted to overcome these challenges and arrive at a high-performing, well-monitored and scalable categorisation engine.