This repository has been archived by the owner on Oct 12, 2017. It is now read-only.
= Handling of big file uploads directly to disk =
This work is based on the example at http://www.cherrypy.org/wiki/FileUpload , adapted to CherryPy version 3.0.
== Main differences ==
* Filter replaced by a tool, disabling cherrypy's request body processing
* Default timeouts changed
* Default request body size limit changed
 * Temporary file used by cgi.!FieldStorage changed to tempfile.!NamedTemporaryFile, which avoids copying the file after the HTTP upload; with big files this matters for both speed and disk space.
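
The hard-link idea behind the last item can be tried on its own, outside CherryPy. The following is a minimal sketch, assuming a POSIX filesystem and a destination on the same filesystem as the temporary file; the helper name `keep_via_hard_link` and the file names are made up for illustration and are not part of the original code.

```python
import os
import tempfile


def keep_via_hard_link(tmp, dest_path):
    """Give an open NamedTemporaryFile a second, permanent name.

    os.link() only adds a directory entry, so no file data is copied.
    Both names must live on the same filesystem.
    """
    tmp.flush()                   # push buffered data to the file first
    os.link(tmp.name, dest_path)  # second name for the same data
    tmp.close()                   # removes only the temporary name
    return dest_path


if __name__ == "__main__":
    work_dir = tempfile.mkdtemp()  # scratch directory for the demo
    tmp = tempfile.NamedTemporaryFile(dir=work_dir)
    tmp.write(b"pretend this is a very large upload")
    temp_name = tmp.name
    kept = keep_via_hard_link(tmp, os.path.join(work_dir, "upload.bin"))
    # The temporary name should be gone while the linked name remains.
    print(os.path.exists(temp_name), os.path.exists(kept))
```

Closing the temporary file deletes its original name, which is why the upload code links to a second name instead of renaming.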
== The code ==
{{{
#!python
#!/usr/bin/python2.4
import cherrypy
import cgi
import tempfile
import os

__author__ = "Ex Vito"


class myFieldStorage(cgi.FieldStorage):
    """Our version uses a named temporary file instead of the default
    non-named file; keeping it visible (named) allows us to create a
    2nd link after the upload is done, thus avoiding the overhead of
    making a copy to the destination filename."""

    def make_file(self, binary=None):
        return tempfile.NamedTemporaryFile()


def noBodyProcess():
    """Sets cherrypy.request.process_request_body = False, giving
    us direct control of the file upload destination. By default
    cherrypy loads it to memory, we are directing it to disk."""
    cherrypy.request.process_request_body = False

cherrypy.tools.noBodyProcess = cherrypy.Tool('before_request_body', noBodyProcess)


class fileUpload:
    """fileUpload cherrypy application"""

    @cherrypy.expose
    def index(self):
        """Simplest possible HTML file upload form. Note that the encoding
        type must be multipart/form-data."""
        return """
            <html>
            <body>
                <form action="upload" method="post" enctype="multipart/form-data">
                    File: <input type="file" name="theFile"/> <br/>
                    <input type="submit"/>
                </form>
            </body>
            </html>
            """

    @cherrypy.expose
    @cherrypy.tools.noBodyProcess()
    def upload(self, theFile=None):
        """upload action

        We use our variation of cgi.FieldStorage to parse the MIME
        encoded HTML form data containing the file."""

        # the file transfer can take a long time; by default cherrypy
        # limits responses to 300s; we increase it to 1h
        cherrypy.response.timeout = 3600

        # convert the header keys to lower case
        lcHDRS = {}
        for key, val in cherrypy.request.headers.iteritems():
            lcHDRS[key.lower()] = val

        # at this point we could limit the upload on content-length...
        # incomingBytes = int(lcHDRS['content-length'])

        # create our version of cgi.FieldStorage to parse the MIME encoded
        # form data where the file is contained
        formFields = myFieldStorage(fp=cherrypy.request.rfile,
                                    headers=lcHDRS,
                                    environ={'REQUEST_METHOD': 'POST'},
                                    keep_blank_values=True)

        # we now create a 2nd link to the file, using the submitted
        # filename; if we renamed, there would be a failure because
        # the NamedTemporaryFile, used by our version of cgi.FieldStorage,
        # explicitly deletes the original filename
        theFile = formFields['theFile']
        os.link(theFile.file.name, '/tmp/' + theFile.filename)

        return "ok, got it filename='%s'" % theFile.filename


# remove any limit on the request body size; cherrypy's default is 100MB
# (maybe we should just increase it?)
cherrypy.server.max_request_body_size = 0

# increase server socket timeout to 60s; we are more tolerant of bad
# quality client-server connections (cherrypy's default is 10s)
cherrypy.server.socket_timeout = 60

cherrypy.quickstart(fileUpload())
}}}
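
The code above leaves the content-length check as a commented-out hint. The following is one possible sketch of that check, kept free of CherryPy so it can be run by itself. The function names and the `max_bytes` parameter are hypothetical and not part of the original code.

```python
def lowercase_headers(headers):
    """Return a copy of a header mapping with lower-cased keys."""
    return dict((key.lower(), value) for key, value in headers.items())


def declared_upload_size(headers):
    """Return the declared Content-Length as an int, or None if it is
    missing, not a number, or negative."""
    raw = lowercase_headers(headers).get("content-length")
    try:
        size = int(raw)
    except (TypeError, ValueError):
        return None
    if size < 0:
        return None
    return size


def upload_allowed(headers, max_bytes):
    """True only if the request declares a size no larger than max_bytes."""
    size = declared_upload_size(headers)
    return size is not None and size <= max_bytes


if __name__ == "__main__":
    limit = 2 * 1024 ** 3  # hypothetical 2 GB ceiling
    print(upload_allowed({"Content-Length": "1048576"}, limit))
    print(upload_allowed({"Content-Length": "not-a-number"}, limit))
```

Inside `upload()` a failed check could, for example, be answered by raising `cherrypy.HTTPError(413)` before the body is read. Content-Length is only what the client declares, so this is a first filter rather than a guarantee about how many bytes arrive.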
== Possible Improvements ==
 * Maybe we don't need to lower-case the headers for the cgi.!FieldStorage invocation?
 * os.link will fail if the destination name already exists; this should be handled somehow
== Final Notes ==
My Python and CherryPy experience is limited. You are welcome to improve and/or correct the code and style.
Note: It seems `FieldStorage` will not use `make_file()` if the file is small (e.g., <= 1000 bytes?), so the file might actually be an in-memory file-like object (e.g., StringIO) instead.
Note 2: You are correct, `FieldStorage` does not call `make_file()` unless the file grows past 1000 bytes. Workaround? I simply edit cgi.py as follows:
 * Locate the function definition of "__write(self, line):"
 * Delete the line "if self.__file.tell() + len(line) > 1000:" (the lines it guarded then need to be dedented one level)
 * This, or some other workaround, is absolutely necessary if you need small files (1000 bytes or fewer) to end up on disk as well