Initial commit
maxime-fily committed Jun 3, 2024
0 parents commit 4c1f4bf
Showing 1 changed file with 376 additions and 0 deletions.
376 changes: 376 additions & 0 deletions index.html
@@ -0,0 +1,376 @@
<div class="tabu">
<h1 class="unnumbered" id="language-skills">Maxime Fily, Linguist and Engineer</h1>
<img alt="Portrait of Maxime Fily" src="/home/mfily/Documents/ancien_espace/personal_files/perso/maxime_neige_Datsha_crop.jpg" width="360" />
<h1 class="unnumbered" id="language-skills">Language skills</h1>
<table>
<tbody>
<tr class="odd">
<td style="text-align: left;">French:</td>
<td style="text-align: left;">Native speaker</td>
</tr>
<tr class="even">
<td style="text-align: left;">English:</td>
<td style="text-align: left;">C1</td>
</tr>
<tr class="odd">
<td style="text-align: left;"></td>
<td style="text-align: left;"><span></span> Bright test: 4.5/5
(2007)</td>
</tr>
<tr class="even">
<td style="text-align: left;"></td>
<td style="text-align: left;"><span></span> TOEFL test: 627/677
(2005)</td>
</tr>
<tr class="odd">
<td style="text-align: left;">Mandarin:</td>
<td style="text-align: left;">B2/C1</td>
</tr>
<tr class="even">
<td style="text-align: left;"></td>
<td style="text-align: left;"><span></span> DCL, B2 level (2017)</td>
</tr>
<tr class="odd">
<td style="text-align: left;"></td>
<td style="text-align: left;"><span></span> HSK level 5: 214/300
(2017)</td>
</tr>
<tr class="even">
<td style="text-align: left;"></td>
<td style="text-align: left;"><span></span> HSK level 3: 299/300
(2015)</td>
</tr>
<tr class="odd">
<td style="text-align: left;">Spanish:</td>
<td style="text-align: left;">Intermediate</td>
</tr>
<tr class="even">
<td style="text-align: left;">Na (Narua):</td>
<td style="text-align: left;">Working knowledge</td>
</tr>
</tbody>
</table>
<p style="width: 500px;"><br />
</p>
<h1 class="unnumbered" id="computer-skills">Computer skills</h1>
<table>
<tbody>
<tr class="odd">
<td style="text-align: left;">General IT:</td>
<td style="text-align: left;">GNU/Linux systems administration (Ubuntu,
Mint)</td>
</tr>
<tr class="even">
<td style="text-align: left;"></td>
<td style="text-align: left;">LaTeX, LibreOffice, MS Office</td>
</tr>
<tr class="odd">
<td style="text-align: left;">Programming:</td>
<td style="text-align: left;">Linux/Unix: Bash, csh, ksh</td>
</tr>
<tr class="even">
<td style="text-align: left;"></td>
<td style="text-align: left;">General: Python, C, XML, Jupyter</td>
</tr>
<tr class="odd">
<td style="text-align: left;"></td>
<td style="text-align: left;">Speech processing: Praat</td>
</tr>
<tr class="even">
<td style="text-align: left;"></td>
<td style="text-align: left;">NLP: basic knowledge of Transformer-based
neural networks</td>
</tr>
<tr class="odd">
<td style="text-align: left;"></td>
<td style="text-align: left;">Statistics: R</td>
</tr>
</tbody>
</table>
<p style="width: 500px;"><br />
</p>
<h1 class="unnumbered" id="soft-skills">Soft skills</h1>
<ul>
<li><p style="width: 500px;">Autonomy, adaptability</p></li>
<li><p style="width: 500px;">Reliability, solicitude</p></li>
<li><p style="width: 500px;">Teamwork, collaborative work</p></li>
<li><p style="width: 500px;">Oral and written communication</p></li>
</ul>
<h1 class="unnumbered" id="education">Education</h1>
<div class="compactlabel">
<p style="width: 500px;"><span>2018 - 2022:</span></p>
<p style="width: 500px;">PhD in Phonetics, Phonology and Speech Sciences, Université Paris III
Sorbonne Nouvelle (defended on Dec. 2, 2022).<br />
</p>
</div>
<p style="width: 500px;"><span></span> <u>PhD Dissertation</u>:</p>
<div class="refsection">

</div>
<p style="width: 500px;"><u>Abstract</u>: This thesis constitutes a description and analysis
of the dialect of the Na (Mosuo) language (Sino-Tibetan/Trans-Himalayan
family) spoken in the village of Shekua. After a brief presentation of
the segmental phonology (consonants and vowels), the main part of the
analysis focuses on the tonal system. Based on previous work and
first-hand immersion fieldwork, it is confirmed that the system is based
on two tonal levels (High and Low). The system is explored in a
systematic manner, starting out from the tones of nouns and progressing
to those of compound nouns, verbs, and various morphosyntactic
constructions. The analysis of the tonal system brings out categories
that can only be revealed by combining information from several
contexts. Seven categories are found for monosyllabic nouns, and no fewer
than twelve for (monosyllabic) verbs. A contrastive use of the HL
contour, not found among previously documented Na dialects, makes
functional sense inside a two-level system, as it puts the available
phonological units to maximum use. Three-level systems (contrasting H,
M, and L) such as that of the village of Alawa (Yongning) allow for a
wider combinatorial range, and hence are under less pressure to exploit
every nook and cranny of this combinatorial range. This monograph on a
Na tonal system, combined with a fully open access corpus, adds to the
literature on a language whose traditional chain of transmission is
undoubtedly threatened.<br />
</p>
<div class="compactlabel">
<p style="width: 500px;"><span>2018 - 2022:</span></p>
<p style="width: 500px;">Master’s degree in Linguistics, with a specialization in the field of
experimental phonetics and phonology, Université Grenoble Alpes (<span
class="smallcaps">uga</span>), “mention Très Bien”.<br />
</p>
</div>
<p style="width: 500px;"><span></span> <u>Master’s thesis</u>:</p>
<div class="refsection">

</div>
<div class="compactlabel">
<p style="width: 500px;"><span>2018 - 2022:</span></p>
<p style="width: 500px;">Level 3 University degree (“DU niveau 3”) in Chinese language and
culture, Université Lyon 3, “mention Très Bien”.<br />
</p>
<p style="width: 500px;">Master of Engineering, <span class="smallcaps">enspg</span> (“École
Nationale Supérieure de Physique de Grenoble”, renamed PHELMA in 2008),
member of Grenoble INP group (Grenoble Institute of Engineering, a
public sector technology university), “mention Assez Bien”.<br />
</p>
</div>
<h1 class="unnumbered" id="professional-experiences">Professional
experience</h1>
<div class="compactlabel">
<p style="width: 500px;"><span>2018 - 2022:</span></p>
<p style="width: 500px;">Post-doc at <span class="smallcaps">cnrs</span>, <span
class="smallcaps">llf</span>, Laboratoire de Linguistique Formelle and
<span class="smallcaps">Lacito</span>, Langues et Civilisations à
Tradition Orale.<br />
</p>
<p style="width: 500px;">Supervisor for the internship of Université Sorbonne Nouvelle BA
student Berthilde Biard, on data extraction and analysis
for building verbal and nominal paradigms in Na languages
(3 months).<br />
</p>
<p style="width: 500px;">Doctoral student at <span class="smallcaps">lacito</span> (Villejuif)
and <span class="smallcaps">Gipsa-lab</span> (Grenoble).<br />
</p>
<p style="width: 500px;">Mandarin teacher, private classes, 2h/week.<br />
</p>
<p style="width: 500px;">Temporary lecturer for the class “Introduction to linguistics and
language families”, <span class="smallcaps">uga</span> (12 hours).<br />
</p>
<p style="width: 500px;">Research internship at <span class="smallcaps">Gipsa-lab</span> as
part of my Master 2 in Linguistics. Design of a plate for recording oral
and nasal signals separately, for the acoustic study of the oral and nasal
cues of nasalised sounds in French and Taiwan Mandarin.<br />
</p>
<p style="width: 500px;">Confirmed Engineer at <span class="smallcaps">Areva NP</span>.
Neutronic design of nuclear fuel assemblies, Lyon, France.<br />
</p>
<p style="width: 500px;">Confirmed Engineer at <span class="smallcaps">Wecan</span>, an <span
class="smallcaps">Areva-CGN</span> (<em>China General Nuclear</em>)
joint-venture specializing in nuclear design and safety, Shenzhen,
China.<br />
</p>
<p style="width: 500px;">Engineer (“Ingénieur d’étude”) at <span class="smallcaps">Areva
NP</span>, in nuclear safety analyses, Paris, France.<br />
</p>
</div>
<h1 class="unnumbered" id="linguistic-fieldwork">Linguistic
fieldwork</h1>
<p style="width: 500px;">Since the beginning of my work as a field linguist, I have focused on
the Shekua variety of the Na language, also known as Lataddi Narua.
Shekua is a small village near the Grass Sea, and
the speakers in this village use a variety that is closely related to
Yongning Narua, whose tone system is described in Michaud (2017).
My fieldwork experience is outlined below:</p>
<div class="compactlabel">
<p style="width: 500px;"><span>2018 - 2022:</span></p>
<p style="width: 500px;">Interviews with Shekua Na speakers: narratives, phonological
confirmation paradigms, dialogues (Yunnan, 2 months).</p>
<p style="width: 500px;">Interviews with Shekua Na speakers: phonetics and phonology,
narratives (Sichuan and Yunnan, 3 months).</p>
</div>
<h1 class="unnumbered" id="foreign-collaborations">Foreign
collaborations</h1>
<div class="compactlabel">
<p style="width: 500px;"><span>2018:</span></p>
<p style="width: 500px;">I undertook a research trip to Yunnan Minzu University (Kunming),
at the invitation of Mr. He Likun and Mr. Liu Jinrong (School of
Ethnic Cultures). During this visit I attended seminars covering a
range of descriptive work on Yunnan languages. Additionally, I
gave a presentation on the phonological system of Shekua Na.</p>
</div>
<h1 class="unnumbered" id="technical-achievements">Technical
achievements</h1>
<div class="compactlabel">
<p style="width: 500px;"><span>2018:</span></p>
<p style="width: 500px;">Creation of a fully customizable keyboard for Linux users interested
in writing with the International Phonetic Alphabet. See <a
href="https://lacito.hypotheses.org/3086"
class="uri">https://lacito.hypotheses.org/3086</a></p>
<p style="width: 500px;">Development of a tool that converts Praat TextGrids
to <span class="smallcaps">xml</span> format, to speed up Pangloss
deposits. See <a href="https://github.com/maxime-fily/TXTGRD2XML"
class="uri">https://github.com/maxime-fily/TXTGRD2XML</a></p>
<p style="width: 500px;">Design of the acquisition module for separate recording of oral and
nasal tract output, by modifying the Glottal Enterprises nasalance
plate.</p>
</div>
<h1 class="unnumbered" id="grants-and-projects">Grants and projects</h1>
<div class="compactlabel">
<p style="width: 500px;"><span>Since 2018:</span></p>
<p style="width: 500px;">I obtained a grant to perform part of my research abroad, via the
<span class="smallcaps">uga idex</span> International Mobility Program.
The funding allowed a field trip and a research visit to Yunnan and
Sichuan.</p>
<p style="width: 500px;">Participant in the Labex-EFL program, within the “Typology and
dynamics of linguistic systems” strand. Since 2023, I have also participated
in the “Computational Linguistics” strand. The goal is to address
computational linguistics through a cross-disciplinary approach,
enabling the deployment of NLP methods across the various linguistic
strands of the Labex.</p>
</div>
<h1 class="unnumbered"
id="publications-in-conferences-with-proceedings">Publications</h1>

<div class="refsection">
<p style="width: 500px;">See my Google Scholar profile:</p>
<a href="https://scholar.google.fr/citations?user=XYvTKbIAAAAJ&hl=fr&oi=ao">Maxime Fily</a>
</div>
<h1 class="unnumbered" id="ongoing-research-activity">Ongoing research
activity</h1>
<h2 class="unnumbered" id="comparative-work-on-na-dialects">Comparative
work on Na dialects</h2>
<p style="width: 500px;">I am currently in the field, collecting data to finalize a study on
tonal correspondences between Yongning Na and Shekua/Lataddi Na. The
study focuses on noun and verb categories in the two dialects. The main
difference between these two dialects lies in their tone systems:
Yongning has a three-level tone system, while Shekua/Lataddi has a
two-level tone system. Lexically, the two dialects are very close and
mutually intelligible. This invites a tonal
comparison: our approach aims to identify precise tonal correspondences
between a two-level system and a three-level system, to potentially shed
light on rearrangement mechanisms. A preliminary study covered 52 verbs
and 51 nouns, and the objective is to expand this dataset to 100 verbs
and nouns for a comprehensive study. The challenge in this task arises
from the <em>morpho-tonological complexity</em> of tonal categories in
Na, which can only be accessed in combination with other morphemes. For
instance, verbs are examined in combination with TAME particles, nouns
within Object-Verb structures and compound nouns.</p>
<p style="width: 500px;">For this study, we use carefully curated data and rely on
visualization and statistical tools such as Sankey diagrams and agglomerative
hierarchical clustering. These tools yield evidence-based, interpretable graphs
that help open discussion on categories. Additionally, the
study draws on first-hand field data, all of which is accessible online
via the Pangloss website.</p>
<p style="width: 500px;">The results of the study mark a first step towards reconstructing the
diachronic processes that led to contemporary Na tone systems.
Simultaneously, a search for cognates is underway between Naish
languages and the conservative languages of the Trans-Himalayan
(Sino-Tibetan) family, particularly Gyalrong languages. This research is
conducted with guidance from Alexis Michaud (<span
class="smallcaps">Lacito</span>) and advice from Guillaume Jacques
(CRLAO). Furthermore, we closely follow the work of Li Zihe (Peking
University), who has undertaken the task of assembling such a cognate list
across Naish, Gyalrong, Old Burmese, Old Tibetan and Written Tibetan.</p>
<h2 class="unnumbered" id="nlp-for-underdocumented-languages">NLP for
un(der)documented languages</h2>
<p style="width: 500px;">My post-doctoral activities over the past year (and until August
2024) focus on evaluating the deployment cost of NLP methods for
un(der)documented languages. Our work is grounded in the understanding
that the rapid disappearance of rare and undocumented languages poses a
monumental challenge for linguists involved in documenting these
languages. Language models can assist linguists, but since they are
pre-trained on a closed list of resource-rich languages, they need to be
<em>fine-tuned</em> to be able to perform NLP tasks on a language not
seen in pre-training. Before I started this position, a successful
fine-tuning experiment was conducted on Yongning Na (glottocode:
yong1270) and Japhug (glottocode: japh1234), with word error rates well
within acceptable values for an Automatic Speech Recognition (ASR) task
(Guillaume, Wisniewski, Macaire, et al. 2022a). However, the study
revealed that the quality and diversity of the fine-tuning data impacted
the system’s performance. For instance, a system fine-tuned on
elicitations exhibited poor performance on texts, and vice versa.</p>
<p style="width: 500px;">To better design minimal sets of audio data for input into a language
model, specifically for ASR tasks (phonetic and phonological
transcription in our case), my assignment consisted of developing
distance measures in the vector representation space. These
allow us to assess similarities between recordings. With the
resulting toolkit, our aim is to predict the
performance of a fine-tuning task from the characteristics of the
input and test materials.</p>
<p style="width: 500px;">This research is highly interdisciplinary, engaging both NLP
scientists seeking a better understanding of language models based on
neural networks and linguists who stand to benefit from language models
better tailored to their needs. Neural networks are producing
increasingly powerful models for academia, but the process needs to be
better understood, especially by linguists, who have the expertise
to assess the accuracy of recently produced language models. The
community can benefit from these models by accelerating time-consuming
stages of work, such as transcription. Thus, it seemed essential to
enter the NLP arena as a linguist. Guillaume Wisniewski (LLF) is the
principal investigator for the project, with a dedicated focus on the
rare languages of Yunnan and Sichuan, leaving ample room for future
collaborations within my area of linguistic interest.</p>
<p style="width: 500px;"><br />
</p>
<h1 class="unnumbered" id="academic-service">Academic service</h1>
<h2 class="unnumbered" id="article-reviews">Article reviews</h2>
<p style="width: 500px;">Peer reviews in the field of phonetics and phonology, for the
following conferences: Interspeech, “Journées d’Étude de la Parole”.</p>
<h2 class="unnumbered" id="collective-duties">Collective duties</h2>
<div class="compactlabel">
<p style="width: 500px;"><span>2018 - 2022:</span></p>
<p style="width: 500px;">Representative for the doctoral students at <span
class="smallcaps">lacito</span>.</p>
<p style="width: 500px;">Representative for the doctoral students at <span
class="smallcaps">Gipsa-lab</span> and <span
class="smallcaps">uga</span>.</p>
<p style="width: 500px;">Representative for the doctoral students at <span
class="smallcaps">Grenoble-INP</span> <span
class="smallcaps">clhsct</span>.</p>
<p style="width: 500px;">Workplace First-Aid Rescuer (retraining planned in 2024).</p>
</div>
<p style="width: 500px;"><br />
</p>
<h1 class="unnumbered" id="learned-societies">Learned societies</h1>
<div class="compactlabel">
<p style="width: 500px;"><span>2018 - 2022:</span></p>
<p style="width: 500px;">Member of the “Association Française de Communication Parlée”.</p>
<p style="width: 500px;">Member of the “Société de Linguistique de Paris”.</p>
<p style="width: 500px;">Member of the International Phonetic Association.</p>
</div>
<h1 class="unnumbered" id="references">References</h1>
<div class="compactlabel">
<p style="width: 500px;"><span>Linguistics </span></p>
<p style="width: 500px;">Mr. Alexis Michaud<br />
Research Director, <span class="smallcaps">cnrs</span><br />
<span class="smallcaps">Lacito</span>, 7 rue G. Môquet, Villejuif<br />
Phone: (+33) (0)1 49 58 37 78<br />
E-mail: [email protected]<br />
</p>
<p style="width: 500px;">Mr. Guillaume Wisniewski<br />
Assistant Professor, Université Paris-Cité<br />
LLF, 8 rue A. Einstein, Paris<br />
Phone: (+33) (0)1 57 27 57 64<br />
E-mail: [email protected]<br />
</p>
</div>
</div>
