Commit
0 parents
commit 4c1f4bf
Showing 1 changed file with 376 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,376 @@ | ||
<div class="tabu"> | ||
<h1 class="unnumbered" id="language-skills">Maxime Fily, Linguist and Engineer</h1> | ||
<img alt="Mugshot" src="/home/mfily/Documents/ancien_espace/personal_files/perso/maxime_neige_Datsha_crop.jpg" width="360" /> | ||
<h1 class="unnumbered" id="language-skills">Language skills</h1> | ||
<table> | ||
<tbody> | ||
<tr class="odd"> | ||
<td style="text-align: left;">French:</td> | ||
<td style="text-align: left;">Native speaker</td> | ||
</tr> | ||
<tr class="even"> | ||
<td style="text-align: left;">English:</td> | ||
<td style="text-align: left;">C1</td> | ||
</tr> | ||
<tr class="odd"> | ||
<td style="text-align: left;"></td> | ||
<td style="text-align: left;"><span>↪</span> Bright test: 4.5/5 | ||
(2007)</td> | ||
</tr> | ||
<tr class="even"> | ||
<td style="text-align: left;"></td> | ||
<td style="text-align: left;"><span>↪</span> TOEFL test: 627/677 | ||
(2005)</td> | ||
</tr> | ||
<tr class="odd"> | ||
<td style="text-align: left;">Mandarin:</td> | ||
<td style="text-align: left;">B2/C1</td> | ||
</tr> | ||
<tr class="even"> | ||
<td style="text-align: left;"></td> | ||
<td style="text-align: left;"><span>↪</span> DCL, B2 level, (2017)</td> | ||
</tr> | ||
<tr class="odd"> | ||
<td style="text-align: left;"></td> | ||
<td style="text-align: left;"><span>↪</span> HSK level 5: 214/300 | ||
(2017)</td> | ||
</tr> | ||
<tr class="even"> | ||
<td style="text-align: left;"></td> | ||
<td style="text-align: left;"><span>↪</span> HSK level 3: 299/300 | ||
(2015)</td> | ||
</tr> | ||
<tr class="odd"> | ||
<td style="text-align: left;">Spanish:</td> | ||
<td style="text-align: left;">Intermediate</td> | ||
</tr> | ||
<tr class="even"> | ||
<td style="text-align: left;">Na (Narua):</td> | ||
<td style="text-align: left;">Working knowledge</td> | ||
</tr> | ||
</tbody> | ||
</table> | ||
<p style="width: 500px;"><br /> | ||
</p> | ||
<h1 class="unnumbered" id="computer-skills">Computer skills</h1> | ||
<table> | ||
<tbody> | ||
<tr class="odd"> | ||
<td style="text-align: left;">General IT:</td> | ||
<td style="text-align: left;">GNU/Linux systems administration (Ubuntu, | ||
Mint)</td> | ||
</tr> | ||
<tr class="even"> | ||
<td style="text-align: left;"></td> | ||
<td style="text-align: left;">LaTeX, Libre Office, MS Office</td> | ||
</tr> | ||
<tr class="odd"> | ||
<td style="text-align: left;">Programming:</td> | ||
<td style="text-align: left;">Linux/Unix: Bash, csh, ksh</td> | ||
</tr> | ||
<tr class="even"> | ||
<td style="text-align: left;"></td> | ||
<td style="text-align: left;">General: Python, C, XML, Jupyter</td> | ||
</tr> | ||
<tr class="odd"> | ||
<td style="text-align: left;"></td> | ||
<td style="text-align: left;">Speech processing: Praat</td> | ||
</tr> | ||
<tr class="even"> | ||
<td style="text-align: left;"></td> | ||
<td style="text-align: left;">NLP: basic knowledge of Transformer-based | ||
neural networks</td> | ||
</tr> | ||
<tr class="odd"> | ||
<td style="text-align: left;"></td> | ||
<td style="text-align: left;">Statistics: R</td> | ||
</tr> | ||
</tbody> | ||
</table> | ||
<p style="width: 500px;"><br /> | ||
</p> | ||
<h1 class="unnumbered" id="soft-skills">Soft skills</h1> | ||
<ul> | ||
<li><p style="width: 500px;">Autonomy, adaptability</p></li> | ||
<li><p style="width: 500px;">Reliability, solicitude</p></li> | ||
<li><p style="width: 500px;">Teamwork, collaborative work</p></li> | ||
<li><p style="width: 500px;">Oral and written communication</p></li> | ||
</ul> | ||
<h1 class="unnumbered" id="education">Education</h1> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>2018 - 2022:</span></p> | ||
<p style="width: 500px;">PhD in Phonetics, Phonology and Speech Sciences, Université Paris III | ||
Sorbonne Nouvelle (defended on Dec. 2, 2022).<br /> | ||
</p> | ||
</div> | ||
<p style="width: 500px;"><span>↪</span> <u>PhD Dissertation</u>:</p> | ||
<div class="refsection"> | ||
|
||
</div> | ||
<p style="width: 500px;"><u>Abstract</u>: This thesis constitutes a description and analysis | ||
of the dialect of the Na (Mosuo) language (Sino-Tibetan/Trans-Himalayan | ||
family) spoken in the village of Shekua. After a brief presentation of | ||
the segmental phonology (consonants and vowels), the main part of the | ||
analysis focuses on the tonal system. Based on previous work and | ||
first-hand immersion fieldwork, it is confirmed that the system is based | ||
on two tonal levels (High and Low). The system is explored in a | ||
systematic manner, starting out from the tones of nouns and progressing | ||
to those of compound nouns, verbs, and various morphosyntactic | ||
constructions. The analysis of the tonal system brings out categories | ||
that can only be revealed by combining information from several | ||
contexts. Seven categories are found for monosyllabic nouns, and no fewer | ||
than twelve for (monosyllabic) verbs. A contrastive use of the HL | ||
contour, not found among previously documented Na dialects, makes | ||
functional sense inside a two-level system, as it puts the available | ||
phonological units to maximum use. Three-level systems (contrasting H, | ||
M, and L) such as that of the village of Alawa (Yongning) allow for a | ||
wider combinatorial range, and hence are under less pressure to exploit | ||
every nook and cranny of this combinatorial range. This monograph on a | ||
Na tonal system, combined with a fully open access corpus, adds to the | ||
literature on a language whose traditional chain of transmission is | ||
undoubtedly threatened.<br /> | ||
</p> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>2018 - 2022:</span></p> | ||
<p style="width: 500px;">Master’s degree in Linguistics, with a specialization in the field of | ||
experimental phonetics and phonology, Université Grenoble Alpes (<span | ||
class="smallcaps">uga</span>), “mention Très Bien”.<br /> | ||
</p> | ||
</div> | ||
<p style="width: 500px;"><span>↪</span> <u>Master’s thesis</u>:</p> | ||
<div class="refsection"> | ||
|
||
</div> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>2018 - 2022:</span></p> | ||
<p style="width: 500px;">Level 3 University degree (“DU niveau 3”) in Chinese language and | ||
culture, Université Lyon 3, “mention Très Bien”.<br /> | ||
</p> | ||
<p style="width: 500px;">Master of Engineering, <span class="smallcaps">enspg</span> (“École | ||
Nationale Supérieure de Physique de Grenoble”, renamed PHELMA in 2008), | ||
member of Grenoble INP group (Grenoble Institute of Engineering, a | ||
public sector technology university), “mention Assez Bien”.<br /> | ||
</p> | ||
</div> | ||
<h1 class="unnumbered" id="professional-experiences">Professional | ||
experiences</h1> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>2018 - 2022:</span></p> | ||
<p style="width: 500px;">Post-doc at <span class="smallcaps">cnrs</span>, <span | ||
class="smallcaps">llf</span>, Laboratoire de Linguistique Formelle and | ||
<span class="smallcaps">Lacito</span>, Langues et Civilisations à | ||
Tradition Orale.<br /> | ||
</p> | ||
<p style="width: 500px;">Supervisor for the internship of Université Sorbonne Nouvelle BA | ||
student Berthilde Biard, on data extraction and analysis | ||
for building verbal and nominal paradigms in Na languages | ||
(3 months).<br /> | ||
</p> | ||
<p style="width: 500px;">Doctoral student at <span class="smallcaps">lacito</span> (Villejuif) | ||
and <span class="smallcaps">Gipsa-lab</span> (Grenoble).<br /> | ||
</p> | ||
<p style="width: 500px;">Mandarin teacher, private classes, 2h/week.<br /> | ||
</p> | ||
<p style="width: 500px;">Temporary lecturer for the class “Introduction to linguistics and | ||
language families”, <span class="smallcaps">uga</span> (12 hours).<br /> | ||
</p> | ||
<p style="width: 500px;">Research internship at <span class="smallcaps">Gipsa-lab</span> as | ||
part of my Master 2 in Linguistics. Design of a plate for recording oral | ||
and nasal signals separately, for the acoustic study of the oral and nasal | ||
cues of nasalised sounds in French and Taiwan Mandarin.<br /> | ||
</p> | ||
<p style="width: 500px;">Experienced Engineer at <span class="smallcaps">Areva NP</span>. | ||
Neutronic design of nuclear fuel assemblies, Lyon, France.<br /> | ||
</p> | ||
<p style="width: 500px;">Experienced Engineer at <span class="smallcaps">Wecan</span>, an <span | ||
class="smallcaps">Areva-CGN</span> (<em>China General Nuclear</em>) | ||
joint-venture specializing in nuclear design and safety, Shenzhen, | ||
China.<br /> | ||
</p> | ||
<p style="width: 500px;">Engineer (“Ingénieur d’étude”) at <span class="smallcaps">Areva | ||
NP</span>, in nuclear safety analyses, Paris, France.<br /> | ||
</p> | ||
</div> | ||
<h1 class="unnumbered" id="linguistic-fieldwork">Linguistic | ||
fieldwork</h1> | ||
<p style="width: 500px;">Since the beginning of my work as a field linguist, I have focused on | ||
the Shekua variety of the Na language, also known as Lataddi Narua. | ||
Shekua is a small village situated in proximity to the Grass Sea, and | ||
the speakers in this village use a variety that is closely related to | ||
Yongning Narua, whose tone system has been described in Michaud (2017). | ||
My fieldwork experience is detailed below:</p> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>2018 - 2022:</span></p> | ||
<p style="width: 500px;">Interviews with Shekua Na speakers: narratives, phonological | ||
confirmation paradigms, dialogues (Yunnan, 2 months).</p> | ||
<p style="width: 500px;">Interviews with Shekua Na speakers: phonetics and phonology, | ||
narratives (Sichuan and Yunnan, 3 months).</p> | ||
</div> | ||
<h1 class="unnumbered" id="foreign-collaborations">Foreign | ||
collaborations</h1> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>2018:</span></p> | ||
<p style="width: 500px;">I undertook a research trip to Yunnan Minzu University (Kunming), | ||
organized at the invitation of M. He Likun and M. Liu Jinrong (School of | ||
Ethnic Cultures). During this visit I attended seminars that covered a | ||
range of descriptive works on Yunnan languages. Additionally, I | ||
gave a presentation on the phonological system of Shekua Na.</p> | ||
</div> | ||
<h1 class="unnumbered" id="technical-achievements">Technical | ||
achievements</h1> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>2018:</span></p> | ||
<p style="width: 500px;">Creation of a fully customizable keyboard for Linux users interested | ||
in writing with the International Phonetic Alphabet. See <a | ||
href="https://lacito.hypotheses.org/3086" | ||
class="uri">https://lacito.hypotheses.org/3086</a></p> | ||
<p style="width: 500px;">Development of a tool that converts Praat TextGrids | ||
to <span class="smallcaps">xml</span> format to speed up Pangloss | ||
deposits. See <a href="https://github.com/maxime-fily/TXTGRD2XML" | ||
class="uri">https://github.com/maxime-fily/TXTGRD2XML</a></p> | ||
<p style="width: 500px;">Design of the acquisition module for a separate recording of oral and | ||
nasal tract output, by modifying the Glottal Enterprises nasalance | ||
plate.</p> | ||
</div> | ||
<h1 class="unnumbered" id="grants-and-projects">Grants and projects</h1> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>Since 2018:</span></p> | ||
<p style="width: 500px;">I obtained a grant to perform part of my research abroad, via the | ||
<span class="smallcaps">uga idex</span> International Mobility Program. | ||
The funding allowed a field trip and a research visit to Yunnan and | ||
Sichuan.</p> | ||
<p style="width: 500px;">Participant in the Labex-EFL program, within the “Typology and | ||
dynamics of linguistic systems” strand. Since 2023, I have also participated | ||
in the “Computational Linguistics” strand. The goal is to address | ||
computational linguistics through a cross-disciplinary approach, | ||
enabling the deployment of NLP methods across the various linguistic | ||
strands of the Labex.</p> | ||
</div> | ||
<h1 class="unnumbered" | ||
id="publications-in-conferences-with-proceedings">Publications</h1> | ||
|
||
<div class="refsection"> | ||
<p style="width: 500px;">See my Google Scholar profile:</p> | ||
<a href="https://scholar.google.fr/citations?user=XYvTKbIAAAAJ&hl=fr&oi=ao">Maxime Fily</a> | ||
</div> | ||
<h1 class="unnumbered" id="ongoing-research-activity">Ongoing research | ||
activity</h1> | ||
<h2 class="unnumbered" id="comparative-work-on-na-dialects">Comparative | ||
work on Na dialects</h2> | ||
<p style="width: 500px;">I am currently in the field, collecting data to finalize a study on | ||
tonal correspondences between Yongning Na and Shekua/Lataddi Na. The | ||
study focuses on noun and verb categories in the two dialects. The main | ||
difference between these two dialects lies in their tone systems: | ||
Yongning has a three-level tone system, while Shekua/Lataddi has a | ||
two-level tone system. Lexically, the two dialects are very close, and | ||
moreover, they are mutually intelligible. This motivates a tonal | ||
comparison: our approach aims to identify precise tonal correspondences | ||
between a two-level system and a three-level system, to potentially shed | ||
light on rearrangement mechanisms. A preliminary study covered 52 verbs | ||
and 51 nouns, and the objective is to expand this dataset to 100 verbs | ||
and nouns for a comprehensive study. The challenge in this task arises | ||
from the <em>morpho-tonological complexity</em> of tonal categories in | ||
Na, which can only be accessed in combination with other morphemes. For | ||
instance, verbs are examined in combination with TAME particles, nouns | ||
within Object-Verb structures and compound nouns.</p> | ||
<p style="width: 500px;">For this study, we use carefully curated data and rely on | ||
visualization and statistical tools such as Sankey diagrams and agglomerative hierarchical | ||
clustering. These tools yield evidence-based, interpretable graphs | ||
that help open discussion on categories. Additionally, the | ||
study draws on first-hand field data, all of which is accessible online | ||
via the Pangloss website.</p> | ||
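<p style="width: 500px;">As an illustration of the clustering step, the sketch below groups items by how similarly they behave across contexts. It is a minimal example under stated assumptions, not the procedure used in the study: the item labels, the tone strings, the Hamming distance, and the average-linkage criterion are all hypothetical choices made for the example.</p>

```python
from itertools import combinations


def hamming(a, b):
    """Number of contexts in which two items surface with different tones."""
    return sum(x != y for x, y in zip(a, b))


def agglomerative(items, n_clusters):
    """Average-linkage agglomerative clustering.

    items: dict mapping a label to a tuple of surface tone patterns,
           one per context (hypothetical data layout).
    Returns a list of clusters, each a list of labels.
    """
    clusters = [[label] for label in items]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        total = sum(hamming(items[a], items[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical verbs observed in four contexts (H = High, L = Low).
verbs = {
    "v1": ("H", "L", "H", "H"),
    "v2": ("H", "L", "H", "L"),
    "v3": ("L", "L", "L", "L"),
    "v4": ("L", "H", "L", "L"),
}
groups = agglomerative(verbs, n_clusters=2)
```

<p style="width: 500px;">With real data, each tuple would hold the tones observed for one verb or noun across the elicitation contexts, and the merge order could be drawn as a dendrogram to support discussion of candidate categories.</p>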
<p style="width: 500px;">The results of the study mark a first step towards reconstructing the | ||
diachronic processes that led to contemporary Na tone systems. | ||
Simultaneously, a search for cognates is underway between Naish | ||
languages and the conservative languages of the Trans-Himalayan | ||
(Sino-Tibetan) family, particularly Gyalrong languages. This research is | ||
conducted with guidance from Alexis Michaud (<span | ||
class="smallcaps">Lacito</span>) and advice from Guillaume Jacques | ||
(CRLAO). Furthermore, we closely follow the works of Li Zihe (Peking | ||
University), who has undertaken the task of assembling that cognate list | ||
in Naish, Gyalrong, Old Burmese, Old Tibetan and Written Tibetan | ||
languages.</p> | ||
<h2 class="unnumbered" id="nlp-for-underdocumented-languages">NLP for | ||
un(der)documented languages</h2> | ||
<p style="width: 500px;">My post-doctoral activities over the past year (and until August | ||
2024) focus on evaluating the deployment cost of NLP methods for | ||
un(der)documented languages. Our work is grounded in the understanding | ||
that the rapid disappearance of rare and undocumented languages poses a | ||
monumental challenge for linguists involved in documenting these | ||
languages. Language models can assist linguists, but since they are | ||
pre-trained on a closed list of resource-rich languages, they need to be | ||
<em>fine-tuned</em> to be able to perform NLP tasks on a language not | ||
seen in pre-training. Before I started this position, a successful | ||
fine-tuning experiment was conducted on Yongning Na (glottocode: | ||
yong1234) and Japhug (glottocode: japh1234), with word error rates well | ||
within acceptable values for an Automatic Speech Recognition (ASR) task | ||
(Guillaume, Wisniewski, Macaire, et al. 2022a). However, the study | ||
revealed that the quality and diversity of the fine-tuning data impacted | ||
the system’s performance. For instance, a system fine-tuned on | ||
elicitations exhibited poor performance on texts, and vice versa.</p> | ||
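<p style="width: 500px;">For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcription and the system output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition; the function name and the whitespace tokenization are assumptions made for the example, not the evaluation code of the cited study.</p>

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

<p style="width: 500px;">For phonemic transcription, the same computation can be applied to sequences of phonemes instead of words.</p>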
<p style="width: 500px;">To better design minimal sets of audio data for input into a language | ||
model for ASR tasks (phonetic and phonological | ||
transcription in our case), my assignment consisted of developing | ||
distance measurement methods in the vector representation space. This | ||
allowed us to assess similarities between recordings. With the | ||
toolkit developed, our aim is to predict the | ||
performance of a fine-tuning task based on the characteristics of the | ||
fine-tuning and test materials.</p> | ||
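<p style="width: 500px;">One simple way to compare two recordings in a representation space can be sketched as follows. This is an illustrative assumption, not the toolkit described above: it supposes that each recording is available as a sequence of frame-level vectors, averages them into one vector per recording, and takes the cosine distance between the two averages. The function names are hypothetical.</p>

```python
import math
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average frame-level vectors into a single vector for the recording."""
    if not frames:
        raise ValueError("a recording needs at least one frame")
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """1 - cosine similarity; 0 for same direction, 1 for orthogonal vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def recording_distance(frames_a, frames_b) -> float:
    """Distance between two recordings, each given as a list of frame vectors."""
    return cosine_distance(mean_pool(frames_a), mean_pool(frames_b))
```

<p style="width: 500px;">Mean pooling discards temporal structure, so other choices (for example, distances between distributions of frames) may be more informative depending on the question asked.</p>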
<p style="width: 500px;">This research is highly interdisciplinary, engaging both NLP | ||
scientists seeking a better understanding of language models based on | ||
neural networks and linguists who stand to benefit from language models | ||
better tailored to their needs. Neural networks are producing | ||
increasingly powerful models for academia, but the process needs to be | ||
better understood, especially by linguists, who have the expertise | ||
to assess the accuracy of recently produced language models. The | ||
community can benefit from these models by accelerating time-consuming | ||
stages of work, such as transcription. Thus, it seemed essential to | ||
enter the NLP arena as a linguist. Guillaume Wisniewski (LLF) is the | ||
principal investigator for the project, with a dedicated focus on the | ||
rare languages of Yunnan and Sichuan, leaving ample room for future | ||
collaborations within my area of linguistic interest.</p> | ||
<p style="width: 500px;"><br /> | ||
</p> | ||
<h1 class="unnumbered" id="academic-service">Academic service</h1> | ||
<h2 class="unnumbered" id="article-reviews">Article reviews</h2> | ||
<p style="width: 500px;">Peer reviews in the field of phonetics and phonology, for the | ||
following conferences: Interspeech, “Journées d’Étude de la Parole”.</p> | ||
<h2 class="unnumbered" id="collective-duties">Collective duties</h2> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>2018 - 2022:</span></p> | ||
<p style="width: 500px;">Representative for the doctoral students at <span | ||
class="smallcaps">lacito</span>.</p> | ||
<p style="width: 500px;">Representative for the doctoral students at <span | ||
class="smallcaps">Gipsa-lab</span> and <span | ||
class="smallcaps">uga</span>.</p> | ||
<p style="width: 500px;">Representative for the doctoral students at <span | ||
class="smallcaps">Grenoble-INP</span> <span | ||
class="smallcaps">clhsct</span>.</p> | ||
<p style="width: 500px;">Workplace First-Aid Rescuer (retraining planned in 2024).</p> | ||
</div> | ||
<p style="width: 500px;"><br /> | ||
</p> | ||
<h1 class="unnumbered" id="learned-societies">Learned societies</h1> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>2018 - 2022:</span></p> | ||
<p style="width: 500px;">Member of the “Association Française de Communication Parlée”.</p> | ||
<p style="width: 500px;">Member of the “Société de Linguistique de Paris”.</p> | ||
<p style="width: 500px;">Member of the International Phonetic Association.</p> | ||
</div> | ||
<h1 class="unnumbered" id="references">References</h1> | ||
<div class="compactlabel"> | ||
<p style="width: 500px;"><span>Linguistics </span></p> | ||
<p style="width: 500px;">M. Alexis Michaud<br /> | ||
Research Director, <span class="smallcaps">cnrs</span><br /> | ||
<span class="smallcaps">Lacito</span>, 7 rue G. Môquet, Villejuif<br /> | ||
Phone: (+33) (0)1 49 58 37 78<br /> | ||
E-mail: [email protected]<br /> | ||
</p> | ||
<p style="width: 500px;">M. Guillaume Wisniewski<br /> | ||
Assistant Professor, Université Paris-Cité<br /> | ||
LLF, 8 rue A. Einstein, Paris<br /> | ||
Phone: (+33) (0)1 57 27 57 64<br /> | ||
E-mail: [email protected]<br /> | ||
</p> | ||
</div> | ||
</div> |