GitHub - fchuffar/intro_r: An introduction to R

Branches Tags
Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
data		data
doc		doc
LICENSE		LICENSE
README.Rmd		README.Rmd
anadiff.Rmd		anadiff.Rmd
anadiff.html		anadiff.html
anadiff.xlsx		anadiff.xlsx
exercices.pdf		exercices.pdf
Repository files navigation

---
title: "R introduction"
author: "Florent Chuffart"
date: "`r Sys.Date()`"
output: 
  rmarkdown::html_document:
    toc: true
    toc_float: true
    toc_depth: 3
    number_sections: true
---

This R session is fully available on https://github.com/fchuffar/intro_r

# Purpose
This first R working session will be an introduction to your favourite statistical programming language: R. After a brief introduction to basics (user interface paradigms, free software and GNU), we will go deeply into the details of data structures, data indexing and data visualisation. The main purpose of this session is to offer a not only simple but also powerful alternative to your classical spreadsheet software. This introduction gives you the keys to perform large scale statistical data analysis.

This introduction follows three main goals: 
 
  * deploy a fonctionnal working environment for statistical analysis on personnal laptop of participants
  * design a first script allowing participants to import into R environment data from classical spreadsheet
  * define common ground of a discussion between life scientists and data analysts
  
  
# Prerequisistes


Chaque participant apportera son ordinateur portable sur lequel aura été installé les logiciels R et Rstudio :

 * Installer R (macosx https://cran.r-project.org/bin/macosx/ ) (windows https://cran.r-project.org/bin/windows/base/ )
 * Installer RStudio Desktop Open Source License https://www.rstudio.com/products/rstudio/download/

Le contenu de la session est disponible ici : https://github.com/fchuffar/intro_R
Il doit être téléchargé en cliquant sur le bouton vert en hautr à droite "Clone or Download", puis "Download ZIP".
La première séance se déroulera intégralement dans cette espace de travail.


# R Software


## GNU Public Licence

R is a free software environment for statistical computing and graphics.

More details about GNU, GPL and free software here:

  - https://www.gnu.org/philosophy/free-sw.html
  - https://en.wikipedia.org/wiki/GNU_General_Public_License
  - https://en.wikipedia.org/wiki/Free_software_movement
  - https://en.wikipedia.org/wiki/CeCILL

## R and biology

R is widely used in the life science community:

  - https://www.ncbi.nlm.nih.gov/pubmed/?term=r+software

## R Community

Many packages are freely available on:

  - CRAN https://cran.r-project.org
  - Bioconductor https://www.bioconductor.org
  - GitHub https://github.com

R users are organised as a community with many help topics available on the internet.

Many tutorial are available online: 

  - http://revue.sesamath.net/IMG/pdf/RCarte_Commandes-R.pdf
  - https://cran.r-project.org/doc/contrib/Kauffmann_aide_memoire_R.pdf
  - https://cran.r-project.org/doc/contrib/Paradis-rdebuts_fr.pdf

# R CLI vs. GUI

## CLI

R could be compared to an electronic calculator. It provides not only basic operations but also advanced features allowing to manipulate distributions and visualize data.

```{r}
1+1
2+2
pi
rnorm(100)
plot(density(rnorm(100)))
lines(density(rnorm(100, sd=2)), col=2)
lines(density(rnorm(100, sd=0.3, mean=1)), col=3)
```


## R GUI (RStudio) ***

https://www.rstudio.com

# Let's start with R!

## Working directory
```{r, eval=FALSE}
getwd()
# this command give the help on...
# ?getwd()
# ?rnorm
# ?setwd()
```

## Organisation of your workspace

- data
- results
- src
- doc


## From a spreadsheet to R data.frame ***

```{r}
#data_sh = read.table("data/seahorse_dataset2.csv", sep=";", header=TRUE, dec=".")
data_sh = read.table("data/sp_exp_grp.csv", sep=" ", header=TRUE, dec=".")
data_sh["GSE21749_P_WT_1",]
data_sh[,1]
head(data_sh)
tail(data_sh)
dim(data_sh)
summary(data_sh)

d = read.table("data/data_nutri.csv", header=TRUE, sep=",", row.names = 1)
head(d)
tail(d)
dim(d)
summary(d)




data_sh[1,]
data_sh[,1]

```


# Data 

## Affectation

```{r}
foo <- 1
foo

bar = "a"
bar = 'a'
bar
# bar = a
a = 4
a
bar = a
bar 

"b" -> baz
baz

quz <<- baz
```

## Common types

- numeric
- character
- logical (boolean)
- factor


```{r}
1
is.numeric(1)
a=1
is.numeric(a)
a="nom"
a
is.numeric(a)
0.5
is.numeric(0.5)
is.numeric("a")
foo = 1
is.numeric(foo)

is.integer(1)
a = as.integer(1)
a
is.integer(a)
as.integer(0.3)

as.integer(TRUE)
as.integer(FALSE)

as.numeric(TRUE)
as.numeric(FALSE)

"iab"
'toto'
is.numeric("a")
is.character("a")
is.character(1)
as.character(1)
as.numeric("a")

conc = c(0.1, 0.2, 0.3, "probleme de manip", 0.1)
conc = c(0.1, 0.2, 0.3, "probleme de manip", 0.1)
conc
as.numeric(conc)
conc2 = c(0.1, 0.2, 0.3, NA, 0.1)
conc2
mean(conc2)
mean(conc2, na.rm=TRUE)

NaN
log(0)
1/0



# logicals
TRUE
FALSE
T
F
T = 3
T = FALSE
T
# TRUE = 3
# NA = 3

is.numeric(TRUE)
as.numeric(TRUE)
as.numeric(FALSE)
as.logical(1)
as.logical(0)
as.logical(2)
as.logical(-2)
as.logical(-2.2)

is.logical("true")
is.character("true")
other_var = as.logical("true")
other_var
as.logical("True")

# factors
col_y = c("bleu", "bleu", "vert", "marron", "vert", "marron", "blue", " blue") 
col_y
f = as.factor(col_y)
f
col_y = c("bleu", "bleu", "vert", "marron", "vert", "marron", "bleu", "bleu") 
f = as.factor(col_y)
f
as.numeric(f)
```


## Set of data

|    | Homogeneous | Heterogeneous |
|----|-------------|---------------|
| 1d |  vector     | list          |
| 2d |  matrix     | data.frame    |

a data.frame is a list of same size vectors (colums).

```{r}
1:5
c(1,2,10)
c(1,"a",10)
c(TRUE,0)
v = rnorm(100)
v
v[c(5,18,39)]
v[98:100]
v[c(1,2,3)]
v[1:3]
h = v

v
v[c(1,2,3)]

list(1,"a",10)
l = list(chr="8", tx_start=124330091, tx_end=124410705, gs="ATAD2", ref="hg19", strand="-")
l
l$chr
c(tx_start=124330091, tx_end=124410705)



# matrix
m = matrix(runif(120),nrow=20)
dim(m)
m
m+1
m[20,6]
m[19:20,5:6]
m[c(17,20),]
m[,c(1,2)]
m[,c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)]


data = data.frame(size=rnorm(25)+175, gender=sample(c("M", "F"), 25, replace=TRUE))
data[1:5,]
head(data)
data$gender

data[1,2] = "m"
data[1,2]
data[1,2] = "M"

data[data$gender=="M",]
data[data$gender=="M" & data$size > 170,]

factor(sample(c("10", "J", "Q", "K", "A"), 25, replace=TRUE))
factor(sample(c("10", "J", "Q", "K", "A"), 25, replace=TRUE), levels=c("10", "J", "Q", "K", "A"), ordered=TRUE)
```


## Selecting lines and columns of a dataset ***

```{r}
v = 100:130
v
v[1]
v[-1]
rev(v)[1]
v[1:4]
# index by the rank
v[c(1, 4, 10, 15)]
# index by mask
col_y
col_y[c(TRUE,FALSE, TRUE, TRUE, FALSE, TRUE)]

## select a column
# by name
data_sh$id
# by rank
data_sh[,2]
data_sh[,2:6]
data_sh[,c(FALSE,TRUE,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE, FALSE,FALSE,FALSE)]

```


# Visualization

```{r, eval=FALSE}
hist(data$size)
abline(v=mean(data$mean))
boxplot(size~gender, data)
qnorm(data$size)
qqline(data$size)
table(data$gender)
# scatter plot
barplot(data$gender)
# b,  l, t, r
par(mar=c(10,1,1,1))
```


# Tests

## Dataset

```{r}
# dataset nutri
data_nutri = read.table("data/data_nutri.csv", sep=",", header=TRUE, row.names=1)
head(data_nutri)
tail(data_nutri)
dim(data_nutri)
# Correlation between taille and poids
plot(data_nutri$taille, data_nutri$poids, col=data_nutri$sexe)
# create index for categ.
idx_h = data_nutri$sexe == "Homme"
idx_f = data_nutri$sexe == "Femme"
# plot density of each categ.
plot(density(data_nutri$taille[idx_h]), col=2, xlim=range(data_nutri$taille))
lines(density(data_nutri$taille[idx_f]))

boxplot(taille~sexe, data=data_nutri)
library(beanplot)
beanplot(taille~sexe, data=data_nutri)

library(beeswarm)
beeswarm(taille~sexe, data=data_nutri)

```
## Parametric
```{r}
# test de Shapiro-Wilks
# H0: normality
# H1: non normality
# alpha: 5%
norm_pop = rnorm(100)
plot(density(norm_pop))
shapiro.test(norm_pop)
# pval = 0.439 > alpha = 5% 
# on conserve H0 au risque beta à calculer... 
unif_pop = runif(100)
plot(density(unif_pop))
test = shapiro.test(unif_pop)
test$p.value
# pval = 0.0001 < alpha = 5%
# on rejette H0, on accepte H1 au risque de première espèce alpha = 5%

taille_h = data_nutri$taille[idx_h]
shapiro.test(taille_h)
# pval = 9.7% > alpha = 5%
# on conserve H0 au risque beta à calculer... beta contrôlé dans le cas de shapiro.test.

taille_f = data_nutri$taille[idx_f]
shapiro.test(taille_f)
# pval = 11% > alpha = 5%
# on conserve H0 au risque beta à calculer... beta contrôlé dans le cas de shapiro.test.


# Homoscedasticity
# H0: homoscedasticity
# H1: non homoscedasticity
# alpha: 5%

pop1 =   rnorm(100, 0, 1)
pop2 =   rnorm(100, 10, 1) 
plot(density(pop1), xlim=c(-5,15))
lines(density(pop2), col=2)

# homogénéité des variances
bartlett.test(
  c(pop1, pop2), 
  c(rep("a", 100), rep("b", 100))
)
# pval > 5% donc je conserve H0 au rosque beta à calculer

pop1 =   rnorm(100, 0, 0.1)
pop2 =   rnorm(100, 10, 1) 
plot(density(pop1), xlim=c(-5,15))
lines(density(pop2), col=2)
# non homogénéité des variances
bartlett.test(
  c(pop1, pop2), 
  c(rep("a", 100), rep("b", 100))
)
# pval < 5% donc je rejette H0 et j'accepte H1 au risque alpha de 5%.

bartlett.test(data_nutri$taille, data_nutri$sexe)
# pval = 16% donc je conserve H0 au seuil beta à calculer.


# Student test, t test, 
# H0: t_h == t_f
# H1: t_h != t_f
# alpha: 5%
t.test(taille_h, taille_f, conf.level = 0.95)
# pval = 2.2e-16 << 5% on rejette H0 et on accepte H1 au seuil alpha de 5%. 


# H0: t_h == t_f
# H1: t_h > t_f
# alpha: 5%
t.test(taille_h, taille_f, conf.level = 0.95, alternative="greater")
t.test(taille_h[1:10], taille_f[1:10], conf.level = 0.95, alternative="greater")
t.test(taille_h[1:10], taille_f[1:10], conf.level = 0.95, alternative="two.sided")
# pval = 2.2e-16 << 5% on rejette H0 et on accepte H1 au seuil alpha de 5%. 


# H0: t_h == t_f
# H1: t_h < t_f
# alpha: 5%
t.test(taille_h, taille_f, conf.level = 0.95, alternative="less")
# pval = 1 >> 5% on conserve H0 au risque beta, à calculer... 


```
# Non parametric
```{r}
# H0: t_h == t_f
# H1: t_h != t_f
# alpha: 5%
wilcox.test(taille_h[1:10], taille_f[1:10], alternative = "two.side") # Mann Withney
# pval = 0.0001 < alpha = 5% donc je rejette H0 et j'accepte H1 au seuil alpha de 5%


# H0: t_h == t_f
# H1: t_h > t_f
# alpha: 5%
test = wilcox.test(taille_h[1:10], taille_f[1:10], alternative = "greater") # Mann Withney
# pval = 0.00008 < alpha = 5% donc je rejette H0 et j'accepte H1 au seuil alpha de 5%



if (test$p.value < 0.001) {
label = paste(signif(test$p.value,2), "***")
} else if (test$p.value < 0.01) {
label = paste(signif(test$p.value,2), "**") 
} else if (test$p.value < 0.05) {
label = paste(signif(test$p.value,2), "*") 
} else {
label = signif(test$p.value,2)
}
plot(density(data_nutri$taille[idx_h]), col=2, xlim=range(data_nutri$taille))
lines(density(data_nutri$taille[idx_f]))
text(165,0.04, labels=label)

boxplot(data_nutri$taille~data_nutri$sex, las=2)
text(1.5,180, labels=label)

```

## Empirical p-value 


```{r}
stat_obs = wilcox.test(unique(taille_h)[1:5], unique(taille_f)[1:5], alternative = "two.side")$statistic # Mann Withney
 
stat_rnds = sapply(1:10000, function(i) {  
  rnd_pop = sample(c(unique(taille_h)[1:5], unique(taille_f)[1:5]))
  stat_rnd = wilcox.test(rnd_pop[1:5], rnd_pop[6:10], alternative = "two.side")$statistic # Mann Withney
  return(stat_rnd)
})

pval_emp = (sum(stat_rnds > stat_obs) + 1)/length(stat_rnds)
print(pval_emp)

plot(density(stat_rnds))
abline(v=stat_obs)
```

# Statistical test

```{r, eval=FALSE}
plot_null = function(n, mean=0, sd=1, thresh=TRUE, ...){
  foo = sapply(1:10000, function(i) {
    bar = rnorm(n, mean, sd)
    m = mean(bar)
    s = sd(bar)
    c(mean=m, sd = s)
  })
  foo = data.frame(t(foo))
  t = foo$mean / (foo$sd / sqrt(10))  
  lines(density(foo$mean), ...)
  if (thresh) abline(v=quantile(foo$mean, c(0.025,0.975)), lty=2, ...)
  # lines(density(t), lty=2, ...)
  # lines(density(rt(10000, n-1)), col="grey", lty=3)
}

layout(matrix(1:2, 1), respect=TRUE)
plot(0,0,col=0,xlim=c(-1,1), ylim=c(0,2))
plot_null(10)
plot_null(10,mean=0.25, thresh=FALSE, col=4)
# plot_null(20, col=2)

plot(0,0,col=0,xlim=c(-1,1), ylim=c(0,2))
plot_null(30)
plot_null(30,mean=0.25, thresh=FALSE, col=4)
# plot_null(20, col=2)



# H0: m == 0
# H1: m != 0
# alpha = P(reject H0 | H0 is true)
# beta = P(accept H0 | H0 is false)
```

# To go further with R

## Scripts

```{r, eval=FALSE}
source("intro_r.R")
```



## Functions

### Common functions

```{r}
x = c(1,2,3,4,5)
y = c(2,4,6,8,10)
length(x)
plot(x,y)

rnd = runif(100000)
dices = round(rnd * 6)
table(dices)
barplot(table(dices))
dices = floor(rnd * 6)
barplot(table(dices))
dices = ceiling(rnd * 6)
barplot(table(dices))
pie(table(dices))

mat_5_dices = matrix(dices,5)
dim(mat_5_dices)
mat_5_dices[,1:10]

sum_5_dices = apply(mat_5_dices, 2, sum)
table(sum_5_dices)
barplot(table(sum_5_dices))


prop.table(table(sum_5_dices))
prop.table(table(sum_5_dices))
sum(prop.table(table(sum_5_dices)))
```

```{r, eval=FALSE}
?ls()
?paste()

?sum()    
?cumsum() 
?sqrt()   

?rnorm()
?qnorm()
?pnorm()

?cbind()
?rbind()

?sort()
?order()
?grep()

?read.table()
?write.table()

```

### Custom functions

  - signature (parameter)
  - return
  - scope and global variables
  

```{r}

hello_world = function(name="world") {
  ret = paste("Hello ", name, "!", sep="")
  return(ret)  
}

hello_world()

```




# Practical session

## SRSF2 and SOX2 signatures in squamous cell lung carcinomas (data set suggested by Beatrice Eymin)

We dispose of 5 Affymetrix Human Genome U133A Array probe values for 130 samples
These 5 probes correspond to 2 genes:
 
  -  200753_x_at (SRSF)
  -  200754_x_at (SRSF)
  -  214882_s_at (SRSF)
  -  213721_at   (SOX2)
  -  213722_at   (SOX2)

We also dispose of clinical data for the 130 samples.

i) We want to check the correlation between probes
ii) We want to analyse the expressin of these genes according cancer stage

```{r}
# loading data
data = read.csv("data/GSE4573.csv", row.names=1)
head(data)
dim(data)

# correlation between probes
pairs((data[,1:5]))
pairs2 = function (...) {
    panel.cor = function(x, y, digits = 2, prefix = "", cex.cor, 
        ...) {
        usr = par("usr")
        on.exit(par(usr))
        par(usr = c(0, 1, 0, 1))
        r = abs(cor(x, y))
        txt = format(c(r, 0.123456789), digits = digits)[1]
        txt = paste(prefix, txt, sep = "")
        if (missing(cex.cor)) 
            cex = 0.8/strwidth(txt)
        test = cor.test(x, y)
        Signif = symnum(test$p.value, corr = FALSE, na = FALSE, 
            cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1), symbols = c("***", 
                "**", "*", ".", " "))
        text(0.5, 0.5, txt, cex = cex * r)
        text(0.8, 0.8, Signif, cex = cex, col = 2)
    }
    pairs(..., upper.panel = panel.cor)
}
pairs2((data[,1:5]))


# distribution
layout(matrix(1:6, 2), respect=TRUE)
plot(density(data[,"X200753_x_at"]))
plot(density(data[,"X200754_x_at"]))
plot(density(data[,"X214882_s_at"]))
plot(density(data[,"X213721_at"  ]))
plot(density(data[,"X213722_at"  ]))


# stages
data$STAGE
idx_grp1 = data$STAGE %in% c("Ia", "Ib", "IIa", "IIb")
idx_grp2 = data$STAGE %in% c("IIIa", "IIIb")
data$stage_grps = NA
data[idx_grp1,]$stage_grps = "grp1"
data[idx_grp2,]$stage_grps = "grp2"


layout(matrix(1:6, 2), respect=TRUE)
beanplot::beanplot(X200753_x_at~stage_grps, data)
beanplot::beanplot(X200754_x_at~stage_grps, data)
beanplot::beanplot(X214882_s_at~stage_grps, data)
beanplot::beanplot(X213721_at  ~stage_grps, data)
beanplot::beanplot(X213722_at  ~stage_grps, data)

t.test(data[idx_grp1, "X200753_x_at"], data[idx_grp2, "X200753_x_at"])
t.test(data[idx_grp1, "X200754_x_at"], data[idx_grp2, "X200754_x_at"])
t.test(data[idx_grp1, "X214882_s_at"], data[idx_grp2, "X214882_s_at"])
t.test(data[idx_grp1, "X213721_at"  ], data[idx_grp2, "X213721_at"  ])
t.test(data[idx_grp1, "X213722_at"  ], data[idx_grp2, "X213722_at"  ])

wilcox.test(data[idx_grp1, "X200753_x_at"], data[idx_grp2, "X200753_x_at"])
wilcox.test(data[idx_grp1, "X200754_x_at"], data[idx_grp2, "X200754_x_at"])
wilcox.test(data[idx_grp1, "X214882_s_at"], data[idx_grp2, "X214882_s_at"])
wilcox.test(data[idx_grp1, "X213721_at"  ], data[idx_grp2, "X213721_at"  ])
wilcox.test(data[idx_grp1, "X213722_at"  ], data[idx_grp2, "X213722_at"  ])


layout(matrix(1:6, 2), respect=TRUE)
beanplot::beanplot(X200753_x_at~SEX, data)
beanplot::beanplot(X200754_x_at~SEX, data)
beanplot::beanplot(X214882_s_at~SEX, data)
beanplot::beanplot(X213721_at  ~SEX, data)
beanplot::beanplot(X213722_at  ~SEX, data)


layout(matrix(1:6, 2), respect=TRUE)
plot(data[,"X200753_x_at"], data$AGE)
plot(data[,"X200754_x_at"], data$AGE)
plot(data[,"X214882_s_at"], data$AGE)
plot(data[,"X213721_at"  ], data$AGE)
plot(data[,"X213722_at"  ], data$AGE)

layout(matrix(1:6, 2), respect=TRUE)
beanplot::beanplot(X200753_x_at~T, data)
beanplot::beanplot(X200754_x_at~T, data)
beanplot::beanplot(X214882_s_at~T, data)
beanplot::beanplot(X213721_at  ~T, data)
beanplot::beanplot(X213722_at  ~T, data)




```



  


 

## Control structure

A control structure is a block of programming that analyzes variables and chooses a direction in which to go based on given parameters.
  


## Algorithm

  An algorithm is a self-contained sequence of actions to be performed. 
  An algorithm is an effective method that can be expressed within a finite amount of space and time.

# References


http://adv-r.had.co.nz

http://r-pkgs.had.co.nz

# Exercices 

http://graal.ens-lyon.fr/~fchuffart/files/data/exercices.pdf



# RegLog


```{r}
d = data_nutri
layout(matrix(1:2, 1), respect=TRUE)

plot(d$taille, d$poids)
# Y~B
# E(Y) = b.X
# E(Y) = b_0 + b_1.X
# Y_i = b_0 + b_1.X_i + e_i
m = lm(d$poids~d$taille)
m = lm(poids~taille, d)
m$coefficients
m$residuals
abline(a=m$coefficients[[1]], b=m$coefficients[[2]], col=2)
# arrows(d$taille, d$poids, d$taille+m$residuals, d$poids, col=adjustcolor(1, alpha.f=0.5))
arrows(d$taille, d$poids, d$taille, d$poids-m$residuals, col=adjustcolor(1, alpha.f=0.5))



layout(matrix(1:2, 1), respect=TRUE)
plot(d$taille, d$poids)
# Y~B
# E(Y) = b.X
# E(Y) = b_0 + b_1.X
# Y_i = b_0 + b_1.X_i + e_i
m = lm(d$poids~d$taille)
m$coefficients
abline(a=m$coefficients[[1]], b=m$coefficients[[2]], col=2)
plot(d$taille, d$poids-m$residuals, col=1)
arrows(d$taille, d$poids, d$taille, d$poids-m$residuals, col=adjustcolor(1, alpha.f=0.5))








d = data_nutri
layout(matrix(1:2, 1), respect=TRUE)

plot(d$sex, d$poids)
# Y~B
# E(Y) = b.X
# E(Y) = b_0 + b_1.X
# Y_i = b_0 + b_1.X_i + e_i
m = lm(d$poids~d$sex)
m$coefficients
abline(a=m$coefficients[[1]], b=m$coefficients[[2]], col=2)
plot(1:nrow(d), d$poids-m$residuals, col=1)
arrows(1:nrow(d), d$poids, 1:nrow(d), d$poids-m$residuals, col=adjustcolor(1, alpha.f=0.5))










abline(a=m$coefficients[[1]], b=m$coefficients[[2]], col=2)


arrows(d$taille, d$poids, d$taille + 0.5*m$coefficients[[2]]*m$residuals, d$poids - sqrt(m$coefficients[[2]]*m$residuals), col=adjustcolor(1, alpha.f=0.5))


arrows(d$taille, d$poids, d$taille + 0.5*m$coefficients[[2]]*m$residuals, d$poids - sqrt(m$coefficients[[2]]*m$residuals), col=adjustcolor(1, alpha.f=0.5))

# plot( d$poids, d$taille)
# m = lm(d$taille~d$poids)
# m = lm(taille~poids, d)
# m$coefficients
# m$residuals
# arrows(d$poids, d$taille, d$poids+m$residuals, d$taille, col=adjustcolor(1, alpha.f=0.5))
# abline(a=m$coefficients[[1]], b=m$coefficients[[2]], col=2)
#

plot(density(m$residuals))

plot(data_nutri$sexe, data_nutri$taille)


```
# Session Information

```{r, results="verbatim"}
sessionInfo()
```