-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathatom.xml
249 lines (209 loc) · 30.4 KB
/
atom.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title><![CDATA[a statistic dog]]></title>
<subtitle><![CDATA[路漫漫其修远兮,吾将上下而迷路]]></subtitle>
<link href="/atom.xml" rel="self"/>
<link href="http://www.xiejiachen.xyz/"/>
<updated>2015-12-13T23:07:39.000Z</updated>
<id>http://www.xiejiachen.xyz/</id>
<author>
<name><![CDATA[五三模拟比英语周报更适合垫锅]]></name>
</author>
<generator uri="http://hexo.io/">Hexo</generator>
<entry>
<title><![CDATA[构建函数]]></title>
<link href="http://www.xiejiachen.xyz/2015/12/13/%E6%9E%84%E5%BB%BA%E5%87%BD%E6%95%B0/"/>
<id>http://www.xiejiachen.xyz/2015/12/13/构建函数/</id>
<published>2015-12-13T21:47:42.000Z</published>
<updated>2015-12-13T23:07:39.000Z</updated>
<content type="html"><![CDATA[<p>在R中,除了使用系统自带的(read.csv, mean等),或者包中包含的函数(ggpl等ot),也可以自己写入函数。通过自己写函数,可以让算法更灵活,毕竟包并不总能解决所有的问题。</p>
<a id="more"></a>
<p>写入函数的语法,与其它语言类似,但是更简单。<blockquote class="pullquote [right]"><p>“Use your head,nut follow your heart”–Nancy Rothwell</p>
</blockquote>如果你学过C++或其它的语言,应该可以很快的上手和熟悉。</p>
<p>比如C++写入一个加法函数</p>
<figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">add2</span><span class="params">(<span class="keyword">int</span> a,b)</span></span>{</span><br><span class="line"> x= a+b;</span><br><span class="line"> <span class="keyword">return</span> x;</span><br><span class="line"> }</span><br></pre></td></tr></table></figure>
<ul>
<li>int表示返回的数据值类型</li>
<li>add2表示函数名称</li>
<li>int (a,b)表示需要的参数。</li>
</ul>
<p> 同样的,在 R里也可以写一个类似的加法函数。</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">add2 <- <span class="keyword">function</span>(x,y){</span><br><span class="line"> x+y</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>试试用 <code>add2(1+2)</code> 来调用它吧</p>
<p>又或者,你可以写一个函数来找出向量中的一些特定数值。</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">above<- <span class="keyword">function</span>(x,n){</span><br><span class="line"> use<- x>n</span><br><span class="line"> x[use]</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>调用这个函数可以找出向量x中大于n的元素啦。<br>比如,想找出1到10的整数向量中大于5的元素。</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">x<- c(<span class="number">1</span>:<span class="number">10</span>)</span><br><span class="line">above(x,<span class="number">5</span>)</span><br></pre></td></tr></table></figure>
<hr>
<p>熟悉了这些基本语法以后,就可以处理一些更复杂的问题。<br>我们来看两道例题(来自coursera-r-programing-prog035)。<br>先<a href="https://d396qusza40orc.cloudfront.net/rprog/data/specdata.zip" target="_blank" rel="external">下载数据</a>。</p>
<p>数据中包含了332个CSV文件。这是美国的332个空气检车站分别收集的监测数据,比如”001.csv”表示001号监测站的观测数据.</p>
<p>每个数据中包含了3个变量:<br>分别是:</p>
<ul>
<li>日期: 观测的时间</li>
<li>sulfate: 硫化物含量</li>
<li>nitrate: 硝酸盐含量</li>
</ul>
<p>现在请写两个函数:</p>
<ol>
<li>输入 (文件夹名称,污染物名称,机器号) 得到这些机器号的污染物的平均值</li>
<li>输入 (文件夹名称,序列号) 得到一个二维数组包括机器号和其对应的完整数据的数量</li>
</ol>
<script src="https://gist.github.com/xiejiachen/885b62cc5dc5eb84eeba.js"></script>
<p><a href="https://github.com/xiejiachen/R-python-codewithmyweb/tree/master/R/4_function" target="_blank" rel="external">欢迎访问我的github下载数据和代码</a></p>
]]></content>
<summary type="html">
<![CDATA[<p>在R中,除了使用系统自带的(read.csv, mean等),或者包中包含的函数(ggpl等ot),也可以自己写入函数。通过自己写函数,可以让算法更灵活,毕竟包并不总能解决所有的问题。</p>]]>
</summary>
</entry>
<entry>
<title><![CDATA[Welcome to Statistic Dog Home]]></title>
<link href="http://www.xiejiachen.xyz/2015/12/11/test/"/>
<id>http://www.xiejiachen.xyz/2015/12/11/test/</id>
<published>2015-12-11T23:27:40.000Z</published>
<updated>2015-12-11T23:27:40.000Z</updated>
<content type="html"><![CDATA[<p>-----------------写一些 $R$ 跟 $python$ 的杂文。<br><a id="more"></a></p>
]]></content>
<summary type="html">
<![CDATA[<p>-----------------写一些 $R$ 跟 $python$ 的杂文。<br>]]>
</summary>
<category term="University of York" scheme="http://www.xiejiachen.xyz/tags/University-of-York/"/>
<category term="约克大学" scheme="http://www.xiejiachen.xyz/tags/%E7%BA%A6%E5%85%8B%E5%A4%A7%E5%AD%A6/"/>
<category term="网站介绍" scheme="http://www.xiejiachen.xyz/tags/%E7%BD%91%E7%AB%99%E4%BB%8B%E7%BB%8D/"/>
<category term="写在前面" scheme="http://www.xiejiachen.xyz/categories/%E5%86%99%E5%9C%A8%E5%89%8D%E9%9D%A2/"/>
</entry>
<entry>
<title><![CDATA[apply循环与数据类型]]></title>
<link href="http://www.xiejiachen.xyz/2015/12/11/%E5%86%85%E7%BD%AE%E5%BE%AA%E7%8E%AF/"/>
<id>http://www.xiejiachen.xyz/2015/12/11/内置循环/</id>
<published>2015-12-11T12:10:13.000Z</published>
<updated>2015-12-13T21:46:20.000Z</updated>
<content type="html"><![CDATA[<p>在R中,除了前面提到的通用循环结构。也可以使用R内置的一些的循环函数 。这些循环函数可以大大简化代码量,使得代码清晰易懂。<br><a id="more"></a><br><!-- toc --></p>
<hr>
<h2 id="u5185_u7F6E_u5FAA_u73AF_u51FD_u6570"><a href="#u5185_u7F6E_u5FAA_u73AF_u51FD_u6570" class="headerlink" title="内置循环函数"></a>内置循环函数</h2><h3 id="lapply_28list_apply_29_3A"><a href="#lapply_28list_apply_29_3A" class="headerlink" title="lapply(list apply):"></a>lapply(list apply):</h3><p>对列表每个元素应用函数,并返回列表。(Loop over a list and evaluate a function on each element) </p>
<h4 id="u7528_u6CD5"><a href="#u7528_u6CD5" class="headerlink" title="用法"></a>用法</h4><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">lapply (X, FUN, <span class="keyword">...</span>)</span><br></pre></td></tr></table></figure>
<h4 id="u53C2_u6570_u8BF4_u660E_uFF1A"><a href="#u53C2_u6570_u8BF4_u660E_uFF1A" class="headerlink" title="参数说明:"></a>参数说明:</h4><p>X表示列表,FUN函数, …给函数返回的参数值</p>
<h4 id="u8BD5_u7740_u8F93_u5165"><a href="#u8BD5_u7740_u8F93_u5165" class="headerlink" title="试着输入"></a>试着输入</h4><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">data1= lapply(<span class="number">1</span>:<span class="number">3</span>, <span class="keyword">function</span>(i) c(<span class="number">1</span>:<span class="number">2</span>))</span><br></pre></td></tr></table></figure>
<p>返回一个包含三个列表的<strong>列表</strong>,每个列表都包含元素1和2。</p>
<h3 id="sapply_28simplify_apply_29_3A"><a href="#sapply_28simplify_apply_29_3A" class="headerlink" title="sapply(simplify apply):"></a>sapply(simplify apply):</h3><p>lapply的一个变种,尽可能简化lapply。但是如果它不知道该怎么简时,它会什么也不做,直接返回一个list。(Same as lapply but to simplify the result)</p>
<h4 id="u7528_u6CD5-1"><a href="#u7528_u6CD5-1" class="headerlink" title="用法"></a>用法</h4><pre><code>sapply (X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
</code></pre><h4 id="u8BD5_u7740_u8F93_u5165-1"><a href="#u8BD5_u7740_u8F93_u5165-1" class="headerlink" title="试着输入"></a>试着输入</h4><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">data2= sapply(<span class="number">1</span>:<span class="number">3</span>, <span class="keyword">function</span>(i) c(<span class="number">1</span>:<span class="number">2</span>))</span><br></pre></td></tr></table></figure>
<p>返回了一个2行3列的数据框</p>
<h3 id="mapply_28multi_list_apply_29_3A"><a href="#mapply_28multi_list_apply_29_3A" class="headerlink" title="mapply(multi list apply):"></a>mapply(multi list apply):</h3><p>把函数应用到对象(multivariate version of lapply)</p>
<h4 id="u7528_u6CD5-2"><a href="#u7528_u6CD5-2" class="headerlink" title="用法"></a>用法</h4><pre><code>mapply (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE,
USE.NAMES = TRUE)
</code></pre><h4 id="u8BD5_u7740_u8F93_u5165-2"><a href="#u8BD5_u7740_u8F93_u5165-2" class="headerlink" title="试着输入"></a>试着输入</h4><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">data3= mapply( <span class="keyword">function</span>(i) c(<span class="number">1</span>:<span class="number">2</span>), <span class="number">1</span>:<span class="number">3</span>)</span><br></pre></td></tr></table></figure>
<p>也返回了一个2行3列的数据框,要先写函数。因为默认simplify=true,如果你想要像lapply一样得到一个列表 ,只要让SIMPLIFY=FALSE即可<br><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">data4list= mapply( <span class="keyword">function</span>(i) c(<span class="number">1</span>:<span class="number">2</span>), <span class="number">1</span>:<span class="number">3</span>, SIMPLIFY = <span class="literal">FALSE</span>)</span><br></pre></td></tr></table></figure></p>
<h3 id="apply_3A"><a href="#apply_3A" class="headerlink" title="apply:"></a>apply:</h3><p>对有维度的数组的边进行操作,常常用来对矩阵的行或者列计算apply a function over the margins of an arry.</p>
<h4 id="u7528_u6CD5-3"><a href="#u7528_u6CD5-3" class="headerlink" title="用法"></a>用法</h4><pre><code>apply (X, MARGIN, FUN, ...)
</code></pre><h4 id="u8BD5_u7740_u8F93_u5165-3"><a href="#u8BD5_u7740_u8F93_u5165-3" class="headerlink" title="试着输入"></a>试着输入</h4><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">x<-matrix(rnorm(<span class="number">200</span>),<span class="number">20</span>,<span class="number">10</span>)</span><br><span class="line">apply(x,<span class="number">1</span>,mean)</span><br><span class="line">apply(x,<span class="number">2</span>,mean)</span><br></pre></td></tr></table></figure>
<h2 id="u6570_u636E_u7C7B_u578B_u7684_u533A_u522B_uFF1A"><a href="#u6570_u636E_u7C7B_u578B_u7684_u533A_u522B_uFF1A" class="headerlink" title="数据类型的区别:"></a>数据类型的区别:</h2><h3 id="vector__u5411_u91CF_uFF1A"><a href="#vector__u5411_u91CF_uFF1A" class="headerlink" title="vector 向量:"></a>vector 向量:</h3><p>一维向量,只能包含同种类型元素</p>
<h3 id="matrix__u77E9_u9635_uFF1A"><a href="#matrix__u77E9_u9635_uFF1A" class="headerlink" title="matrix 矩阵:"></a>matrix 矩阵:</h3><p>二维,只能包含同种类型元素</p>
<h3 id="data-frame__u6570_u636E_u6846_3A"><a href="#data-frame__u6570_u636E_u6846_3A" class="headerlink" title="data.frame 数据框:"></a>data.frame 数据框:</h3><p>可以包含不同类型元素,而且可以给每列命名,并可以使用framename[colname]调出列。使用read.csv read.table之类的函数得到的数据就储存在数据框里</p>
<h3 id="list__u5217_u8868_uFF1A"><a href="#list__u5217_u8868_uFF1A" class="headerlink" title="list 列表:"></a>list 列表:</h3><p>列表的每个元素可以是任何,可以是向量,可以是矩阵,可以是数据框,甚至可以是列表。在Rstudio里右上角的环境区,用鼠标点击 data3跟data4观察list与dataframe的区别</p>
]]></content>
<summary type="html">
<![CDATA[<p>在R中,除了前面提到的通用循环结构。也可以使用R内置的一些的循环函数 。这些循环函数可以大大简化代码量,使得代码清晰易懂。<br>]]>
</summary>
<category term="R" scheme="http://www.xiejiachen.xyz/tags/R/"/>
<category term="循环" scheme="http://www.xiejiachen.xyz/tags/%E5%BE%AA%E7%8E%AF/"/>
<category term="R" scheme="http://www.xiejiachen.xyz/categories/R/"/>
</entry>
<entry>
<title><![CDATA[$R$ 循环]]></title>
<link href="http://www.xiejiachen.xyz/2015/12/11/R-%E5%BE%AA%E7%8E%AF/"/>
<id>http://www.xiejiachen.xyz/2015/12/11/R-循环/</id>
<published>2015-12-11T01:54:23.000Z</published>
<updated>2015-12-11T02:20:02.000Z</updated>
<content type="html"><![CDATA[<p>在R中,也可以使用循环语句进行结构控制。<br>R的循环结构与其它的语法$(C++, python)$非常相似。<br><a id="more"></a><br><!-- toc --></p>
<h2 id="If__u5FAA_u73AF"><a href="#If__u5FAA_u73AF" class="headerlink" title="If 循环"></a>If 循环</h2><h3 id="if"><a href="#if" class="headerlink" title="if"></a>if</h3><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">If(condition){</span><br><span class="line"><span class="comment">##do something}</span></span><br></pre></td></tr></table></figure>
<h3 id="if-else"><a href="#if-else" class="headerlink" title="if.else"></a>if.else</h3><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">If(condition) {</span><br><span class="line"> <span class="comment">##do something</span></span><br><span class="line">}Else{</span><br><span class="line"> <span class="comment">##do someting</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<h3 id="if-_else_if-_else"><a href="#if-_else_if-_else" class="headerlink" title="if. else if. else"></a>if. else if. else</h3><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">If(condition) {</span><br><span class="line"> <span class="comment">##do something</span></span><br><span class="line">}Else <span class="keyword">if</span>{</span><br><span class="line"> <span class="comment">##do someting</span></span><br><span class="line">}Else{</span><br><span class="line"> <span class="comment">##do someting</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>注意:else必须跟前面的<code>}</code>在同一行。因为R语言执行程序时,每行单独执行,所以else如果在下一行,R会认为语句结束,导致报错。</p>
<h2 id="For__u5FAA_u73AF"><a href="#For__u5FAA_u73AF" class="headerlink" title="For 循环"></a>For 循环</h2><p>for循环表示,i从1到10,依次执行。<br><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span>(i <span class="keyword">in</span> <span class="number">1</span>:<span class="number">10</span>){</span><br><span class="line"> print(i)</span><br><span class="line"> <span class="comment">##do something</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure></p>
<h2 id="While__u5FAA_u73AF"><a href="#While__u5FAA_u73AF" class="headerlink" title="While 循环"></a>While 循环</h2><p>while循环表示在给定的条件下无限次执行,直到不符合条件停止。<br><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">x<- <span class="number">0</span></span><br><span class="line"><span class="keyword">while</span>(x<<span class="number">10</span>){</span><br><span class="line"> print(x)</span><br><span class="line"> x<- x+<span class="number">1</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure></p>
<h2 id="Repeat__u5FAA_u73AF"><a href="#Repeat__u5FAA_u73AF" class="headerlink" title="Repeat 循环"></a>Repeat 循环</h2><p>使用repeat创建一个无限的循环。只有执行到<strong>break</strong>循环才会停止<br><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">repeat</span> {</span><br><span class="line"> x1<- x1/<span class="number">10</span></span><br><span class="line"> </span><br><span class="line"> <span class="keyword">if</span>(abs(x1-x0)<tol){</span><br><span class="line"> <span class="keyword">break</span></span><br><span class="line"> }<span class="keyword">else</span>{</span><br><span class="line"> x0<-x1</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure></p>
<h2 id="Next_2C_Return"><a href="#Next_2C_Return" class="headerlink" title="Next, Return"></a>Next, Return</h2><p>next表示跳过。<br><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span>(i <span class="keyword">in</span> <span class="number">1</span>:<span class="number">10</span>) { <span class="keyword">if</span>(i <= <span class="number">3</span>) {</span><br><span class="line"> <span class="comment">## Skip the first 20 iterations</span></span><br><span class="line"> <span class="keyword">next</span></span><br><span class="line">}</span><br><span class="line"> print(i)</span><br><span class="line">}</span><br></pre></td></tr></table></figure></p>
<p>return表示停止循环,并返回一个值。与C++类似。</p>
<p><a href="https://github.com/xiejiachen/R-python-codewithmyweb/tree/master/R/3_structure" target="_blank" rel="external">来我的github查看示例代码</a></p>
]]></content>
<summary type="html">
<![CDATA[<p>在R中,也可以使用循环语句进行结构控制。<br>R的循环结构与其它的语法$(C++, python)$非常相似。<br>]]>
</summary>
<category term="R" scheme="http://www.xiejiachen.xyz/tags/R/"/>
<category term="循环" scheme="http://www.xiejiachen.xyz/tags/%E5%BE%AA%E7%8E%AF/"/>
<category term="R" scheme="http://www.xiejiachen.xyz/categories/R/"/>
</entry>
<entry>
<title><![CDATA[$R$ 获取子集]]></title>
<link href="http://www.xiejiachen.xyz/2015/12/10/R-%E8%8E%B7%E5%8F%96%E5%AD%90%E9%9B%86/"/>
<id>http://www.xiejiachen.xyz/2015/12/10/R-获取子集/</id>
<published>2015-12-10T22:45:03.000Z</published>
<updated>2015-12-11T01:33:05.000Z</updated>
<content type="html"><![CDATA[<p>想象一下,如果你只想研究数据中的部分数据。比如从我们的<a href="https://github.com/xiejiachen/R-python-codewithmyweb/tree/master/R/2_subset" target="_blank" rel="external">练习数据</a>中找出 ozone大于31且Temp大于90时, Soloar的平均值。 或者5月份时,ozone的最大值,该怎么做?一个方法是,获取它们的子集。</p>
<a id="more"></a>
<!-- toc -->
<h4 id="1-_subset__u51FD_u6570"><a href="#1-_subset__u51FD_u6570" class="headerlink" title="1. <strong>subset</strong> 函数"></a>1. <strong>subset</strong> 函数</h4><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">subset(x, subset, select)</span><br></pre></td></tr></table></figure>
<p>可以使用subset函数获取子集</p>
<h5 id="u53C2_u6570_u8BF4_u660E"><a href="#u53C2_u6570_u8BF4_u660E" class="headerlink" title="参数说明"></a>参数说明</h5><ul>
<li>x: 数据名/ dataframe</li>
<li>subset: 限制条件</li>
<li>select: 选取行或列</li>
</ul>
<p>比如要找出 ozone大于31且Temp大于90时, Soloar的平均值。<br>可以使用<br><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">data<-read.csv(<span class="string">"subsetdata.csv"</span>)</span><br><span class="line">summary(subset(data, Ozone><span class="number">31</span>&Temp><span class="number">90</span>))</span><br></pre></td></tr></table></figure></p>
<h4 id="2-__u4F7F_u7528_5B_5D_u8868_u793A_u5B50_u96C6"><a href="#2-__u4F7F_u7528_5B_5D_u8868_u793A_u5B50_u96C6" class="headerlink" title="2. 使用[]表示子集"></a>2. 使用[]表示子集</h4><p>注意在一个data.frame使用[]表示一个dataframe。<br>[[]]表示vector.<br>data.frame与vector的区别:vector中的变量必须是同一数据类型。</p>
<p>回到刚才的问题,如果想找出5月份ozone的最大值,我们可以:<br><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">summary(data[data$Month == <span class="number">5</span>,])</span><br></pre></td></tr></table></figure></p>
<pre><code>data[data$Month == 5,] 表示月份等于5的子集。
</code></pre><p>注意:这里我不用<strong>mean</strong>函数的原因是,<strong>mean</strong>函数必须是同一个数据类型。而这里数据中含有NA,使用mean函数会报错。而<strong>summary</strong>函数只会计算完整的数据类型。</p>
<p><a href="https://github.com/xiejiachen/R-python-codewithmyweb/tree/master/R/2_subset" target="_blank" rel="external">访问我的github下载练习数据和代码</a></p>
]]></content>
<summary type="html">
<![CDATA[<p>想象一下,如果你只想研究数据中的部分数据。比如从我们的<a href="https://github.com/xiejiachen/R-python-codewithmyweb/tree/master/R/2_subset">练习数据</a>中找出 ozone大于31且Temp大于90时, Soloar的平均值。 或者5月份时,ozone的最大值,该怎么做?一个方法是,获取它们的子集。</p>]]>
</summary>
<category term="R" scheme="http://www.xiejiachen.xyz/tags/R/"/>
<category term="子集" scheme="http://www.xiejiachen.xyz/tags/%E5%AD%90%E9%9B%86/"/>
<category term="R" scheme="http://www.xiejiachen.xyz/categories/R/"/>
</entry>
<entry>
<title><![CDATA[如何用 $R$ 读取数据和数据交互]]></title>
<link href="http://www.xiejiachen.xyz/2015/12/10/readdata/"/>
<id>http://www.xiejiachen.xyz/2015/12/10/readdata/</id>
<published>2015-12-10T22:18:55.000Z</published>
<updated>2015-12-11T03:45:23.000Z</updated>
<content type="html"><![CDATA[<p>R自带了一些可以直接调用的函数,可以用来读取数据。<br>如:<br> <strong>read.table</strong><br> <strong>read.csv</strong><br> <strong>read.csv2</strong><br><a id="more"></a></p>
<!-- toc -->
<h3 id="read-table"><a href="#read-table" class="headerlink" title="read.table"></a>read.table</h3><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">read.table(file, header = <span class="literal">FALSE</span>, sep = <span class="string">""</span>, dec = <span class="string">"."</span>, colClasses = <span class="literal">NA</span>, comment.char = <span class="string">"#"</span>, skip = <span class="number">0</span>)</span><br></pre></td></tr></table></figure>
<h4 id="u53C2_u6570_u8BF4_u660E_3A"><a href="#u53C2_u6570_u8BF4_u660E_3A" class="headerlink" title="<strong>参数说明</strong>:"></a><strong>参数说明</strong>:</h4><ul>
<li>file: 文件来源,名字,链接,或计算机的存储路径</li>
<li>Header: 第一行是否包含变量名, TRUE 包含,FALSE不包含</li>
<li>sep : 数字之间用什么分隔, 默认值为sep=” “。比如,在读取数据时,8 9表示两个数字。若数据文件中使用逗号分隔数字,如8,9 则需使sep=”,”</li>
<li>Dec: 小数点表示方法。 </li>
<li>colClasses: 表示每列的类,比如你可以指定第一列是logistical, 第二列是factor,第三列是integer。设置colClasses的好处是可以节约计算机的内存,不然R会边读取边计算。 在处理大样本时有必要设置来提高效率。</li>
<li>comment.char:表明文件中用于注释的字符,读取时会跳过注释符号后的行</li>
<li>skip: 跳过。 有时文件开头有一些不需要的数据或文字,可以指定skip跳过这些行,然后从那里读取数据。</li>
<li>stringsAsFactors: 读取数据时遇到字符串的处理办法。 TRUE表示自动将那一列作为factor, false表示为字符串。</li>
</ul>
<p>读取数据时,不填写参数表示使用默认值。</p>
<p>R还有一些默认的其它的读取命令:如 <strong>read.csv, read.csv2</strong><br>CSV(Comma Separated Value): 表示用逗号分隔的数据。<br>这里<strong>read.csv</strong>,<strong>read.csv2</strong> 与<strong>read.table</strong>几乎是一样的,可以通过填写参数使他们达到同样的效果区别是默认参数值。</p>
<h4 id="read-csv"><a href="#read-csv" class="headerlink" title="<strong>read.csv</strong>"></a><strong>read.csv</strong></h4><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">read.csv(file, header = <span class="literal">TRUE</span>, sep = <span class="string">","</span> , dec = <span class="string">"."</span>, fill = <span class="literal">TRUE</span>, comment.char = <span class="string">""</span>)</span><br></pre></td></tr></table></figure>
<h4 id="read-csv2"><a href="#read-csv2" class="headerlink" title="<strong>read.csv2</strong>"></a><strong>read.csv2</strong></h4><figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">read.csv2(file, header = <span class="literal">TRUE</span>, sep = <span class="string">";"</span>, dec = <span class="string">","</span>, fill = <span class="literal">TRUE</span>, comment.char = <span class="string">""</span>)</span><br></pre></td></tr></table></figure>
<hr>
<h3 id="u6570_u636E_u4EA4_u4E92"><a href="#u6570_u636E_u4EA4_u4E92" class="headerlink" title="数据交互"></a>数据交互</h3><p>除了使用默认的function, 也可以使用一些其它的方法来让R与其它文件数据交互。<br>比如可以使用openfile:</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">suibian <- file(<span class="string">"readdata.csv"</span>, <span class="string">"r"</span>) </span><br><span class="line">data <- read.csv(suibian) </span><br><span class="line">close(suibian)</span><br></pre></td></tr></table></figure>
<h4 id="openfile__u53C2_u6570"><a href="#openfile__u53C2_u6570" class="headerlink" title="openfile 参数"></a>openfile 参数</h4><ul>
<li>“r”只读: 参数’r’与<code>read.csv("readdata.csv")</code>效果相同</li>
<li>“w”覆写: 删除原来的数据, 并重新写入</li>
<li>“a”追写: 在文件的最后写入</li>
</ul>
<p>这个语法与 python, c++类似。</p>
<p>注意每次打开数据都要记得使用close关闭它, 因为打开以后是存储在内存中的,不关闭可能造成数据错误。</p>
<p><a href="https://github.com/xiejiachen/R-python-codewithmyweb/tree/master/R/R1_readdata" target="_blank" rel="external">访问我的github查看代码例子</a></p>
]]></content>
<summary type="html">
<![CDATA[<p>R自带了一些可以直接调用的函数,可以用来读取数据。<br>如:<br> <strong>read.table</strong><br> <strong>read.csv</strong><br> <strong>read.csv2</strong><br>]]>
</summary>
<category term="R" scheme="http://www.xiejiachen.xyz/tags/R/"/>
<category term="数据交互" scheme="http://www.xiejiachen.xyz/tags/%E6%95%B0%E6%8D%AE%E4%BA%A4%E4%BA%92/"/>
<category term="读取数据" scheme="http://www.xiejiachen.xyz/tags/%E8%AF%BB%E5%8F%96%E6%95%B0%E6%8D%AE/"/>
<category term="R" scheme="http://www.xiejiachen.xyz/categories/R/"/>
</entry>
</feed>