<p>From the Wireshark comments, it looks like python-requests is doing it wrong, but there might not be a "right answer".</p>
<p><a href="http://tools.ietf.org/html/rfc2388" rel="noreferrer" title="RFC 2388">RFC 2388</a> says</p>
<blockquote> <p>Field names originally in non-ASCII character sets may be encoded within the value of the "name" parameter using the standard method described in RFC 2047.</p> </blockquote>
<p><a href="http://tools.ietf.org/html/rfc2047" rel="noreferrer" title="RFC 2047">RFC 2047</a>, in turn, says</p>
<blockquote> <p>Generally, an "encoded-word" is a sequence of printable ASCII characters that begins with "=?", ends with "?=", and has two "?"s in between. It specifies a character set and an encoding method, and also includes the original text encoded as graphic ASCII characters, according to the rules for that encoding method.</p> </blockquote>
<p>and goes on to describe "Q" and "B" encoding methods. Using the "Q" (quoted-printable) method, the name would be:</p>
<pre><code>=?utf-8?q?=E2=98=83?= </code></pre>
<p><strong>BUT</strong>, as <a href="http://tools.ietf.org/html/rfc6266" rel="noreferrer" title="RFC 6266">RFC 6266</a> points out (quoting Section 5 of RFC 2047):</p>
<blockquote> <p>An 'encoded-word' MUST NOT be used in parameter of a MIME Content-Type or Content-Disposition field, or in any structured field body except within a 'comment' or 'phrase'. </p> </blockquote>
<p>so we're not allowed to do that. (Kudos to @Lukasa for this catch!)</p>
<p>RFC 2388 also says</p>
<blockquote> <p>The original local file name may be supplied as well, either as a "filename" parameter either of the "content-disposition: form-data" header or, in the case of multiple files, in a "content-disposition: file" header of the subpart. The sending application MAY supply a file name; if the file name of the sender's operating system is not in US-ASCII, the file name might be approximated, or encoded using the method of RFC 2231.</p> </blockquote>
<p>And <a href="http://tools.ietf.org/html/rfc2231" rel="noreferrer" title="RFC 2231">RFC 2231</a> describes a method that looks more like what you're seeing. In it,</p>
<blockquote> <p>Asterisks ("*") are reused to provide the indicator that language and character set information is present and encoding is being used. A single quote ("'") is used to delimit the character set and language information at the beginning of the parameter value. Percent signs ("%") are used as the encoding flag, which agrees with RFC 2047.</p> <p>Specifically, an asterisk at the end of a parameter name acts as an indicator that character set and language information may appear at the beginning of the parameter value. A single quote is used to separate the character set, language, and actual value information in the parameter value string, and an percent sign is used to flag octets encoded in hexadecimal.</p> </blockquote>
<p>That is, if this method is employed (and supported on both ends), the name should be:</p>
<pre><code>name*=utf-8''%E2%98%83 </code></pre>
<p>Fortunately, <a href="http://tools.ietf.org/html/rfc5987" rel="noreferrer" title="RFC 5987">RFC 5987</a> adds an encoding <em>based on RFC 2231</em> to HTTP headers! (Kudos to @bobince for this find.) It says you can (and probably should) include both an RFC 2231-style value <em>and</em> a plain value:</p>
<blockquote> <p>Header field specifications need to define whether multiple instances of parameters with identical parmname components are allowed, and how they should be processed. This specification suggests that a parameter using the extended syntax takes precedence. This would allow producers to use both formats without breaking recipients that do not understand the extended syntax yet.</p> <p>Example:</p> <blockquote> <p>foo: bar; title="EURO exchange rates"; title*=utf-8''%e2%82%ac%20exchange%20rates</p> </blockquote> </blockquote>
<p>In their example, however, they "dumb down" the plain value for "legacy clients". This isn't really an option for a form-field name, so it seems like <strong>the best approach might be to include both</strong> <code>name=</code> and <code>name*=</code> versions, where the plain value is (as @bobince describes it) "just sending the bytes, quoted, in the same encoding as the form", like:</p>
<pre><code>Content-Disposition: form-data; name="☃"; name*=utf-8''%E2%98%83 </code></pre>
<p>See also:</p>
<ul> <li><a href="https://stackoverflow.com/questions/324470/http-headers-encoding-decoding-in-java">HTTP headers encoding/decoding in Java</a></li> <li><a href="https://stackoverflow.com/questions/16782005/how-can-i-encode-a-filename-according-to-rfc-2231">How can I encode a filename according to RFC 2231?</a></li> <li><a href="https://stackoverflow.com/questions/93551/how-to-encode-the-filename-parameter-of-content-disposition-header-in-http">How to encode the filename parameter of Content-Disposition header in HTTP?</a></li> </ul>
<p>Finally, see <a href="http://larry.masinter.net/1307multipart-form-data.pdf" rel="noreferrer">http://larry.masinter.net/1307multipart-form-data.pdf</a> (also <a href="https://www.w3.org/Bugs/Public/show_bug.cgi?id=16909#c8" rel="noreferrer">https://www.w3.org/Bugs/Public/show_bug.cgi?id=16909#c8</a> ), wherein it is recommended to avoid the problem by sticking with ASCII form field names.</p>
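<p>For illustration, here is a minimal Python sketch of how the combined <code>name=</code> / <code>name*=</code> header from above could be built by hand. The helper names <code>rfc5987_param</code> and <code>form_data_disposition</code> are made up for this example and are not part of python-requests or any other library; only <code>urllib.parse.quote</code> from the standard library is used. Whether a given server actually honours <code>name*=</code> on a form field is a separate question, so treat this as a sketch of the encoding only.</p>

```python
from urllib.parse import quote


def rfc5987_param(name: str, value: str, charset: str = "utf-8") -> str:
    """Build an extended parameter such as name*=utf-8''%E2%98%83.

    Hypothetical helper: the language tag between the two single
    quotes is left empty, as in the examples quoted above.
    """
    # Percent-encode every octet except ASCII letters, digits and "_.-~".
    encoded = quote(value.encode(charset), safe="")
    return f"{name}*={charset}''{encoded}"


def form_data_disposition(field_name: str) -> str:
    """Build a Content-Disposition line carrying both name= and name*=.

    Hypothetical helper: the plain value is the raw text in a quoted
    string, with backslashes and double quotes escaped.
    """
    plain = field_name.replace("\\", "\\\\").replace('"', '\\"')
    extended = rfc5987_param("name", field_name)
    return f'Content-Disposition: form-data; name="{plain}"; {extended}'


if __name__ == "__main__":
    print(form_data_disposition("\u2603"))  # U+2603 is the snowman character
```

<p>The plain value is kept as raw text here; when the header is actually written to the wire it still has to be serialised in the same encoding as the form, as described above.</p>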