Discussion:
FTP Through HTTP Proxy
Alexander Michael
2007-03-14 19:09:04 UTC
I am hoping that someone on this list can point me in the right
direction. I've been using ftputil to download files from an FTP
server via an FTP proxy through a custom class derived from ftplib.FTP
(to go through the proxy). Unfortunately, my company is shutting down
the FTP proxy and I am being forced to pull files through the HTTP
proxy (i.e. if I type ftp://username:***@ftpserver into my
browser, I can see a list of files on the server). It would be easiest
to transition if I could create another ftplib.FTP subclass that goes
through the HTTP proxy, but I do not know how to do this or even if it
can be done. Has anyone attempted to build something like this before?
Stefan Schwarzer
2007-03-25 20:29:39 UTC
Hello Alexander,
Post by Alexander Michael
I am hoping that someone on this list can point me in the right
direction. I've been using ftputil to download files from an FTP
server via an FTP proxy through a custom class derived from ftplib.FTP
(to go through the proxy). Unfortunately, my company is shutting down
the FTP proxy and I am being forced to pull files through the HTTP
proxy (i.e. if I type ftp://username:***@ftpserver into my
browser, I can see a list of files on the server).
Since you access the path with the ftp:// protocol part
prepended, there seems to be no HTTP server or proxy involved.
Note that many web browsers support FTP directly, and, if I'm not
mistaken, the ftp:// means that HTTP isn't used at all.

Do you have any FTP proxies set in your browser configuration?

If you are using a Unix-type operating system, you should also
look at the ftp_proxy environment variable. I don't know what's
used on Windows. I assume you can set FTP proxies with a GUI
dialog or at least low-level with a registry editor.

Can you directly log into the FTP server with a standalone FTP
client? It may be that the company indeed shut down the FTP proxy -
and you now can access the server directly. :-)
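The same check can be made from Python instead of a standalone client by attempting a plain ftplib login. The following is a minimal sketch: the helper name can_login_directly is hypothetical, and the host and credentials are placeholders. It only reports whether a direct connection and login succeed, which is enough to tell whether a proxy is still required.

```python
import ftplib


def can_login_directly(host, user="anonymous", password="", port=21, timeout=10):
    """Return True if a direct FTP connection and login succeed.

    Hypothetical helper for illustration; host and credentials are
    placeholders to be replaced with real values.
    """
    ftp = ftplib.FTP()
    try:
        ftp.connect(host, port, timeout=timeout)
        ftp.login(user, password)
        ftp.quit()
        return True
    except ftplib.all_errors:
        # Covers refused or timed-out connections as well as FTP error replies.
        return False
    finally:
        # Safe to call even if the connection was never established.
        ftp.close()
```

A call such as can_login_directly("ftpserver", "ftp_username", "ftp_password") returning False would suggest that the firewall still blocks direct access.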

Does that help?

Stefan
Alexander Michael
2007-03-26 15:04:55 UTC
Post by Stefan Schwarzer
Since you access the path with the ftp:// protocol part
prepended, there seems to be no HTTP server or proxy involved.
Note that many web browsers support FTP directly, and, if I'm not
mistaken, the ftp:// means that HTTP isn't used at all.
Hmm. This would explain why I'm not convinced of my own analysis of
the situation, and yet I remain uncertain.
Post by Stefan Schwarzer
Do you have any FTP proxies set in your browser configuration?
Yes. I set the HTTP proxy (the usual host "proxy" on port "8080") and
checked "Use this proxy server for all protocols."
Post by Stefan Schwarzer
If you are using a Unix-type operating system, you should also
look at the ftp_proxy environment variable.
I futzed with this, but couldn't get it to work. I'm working on both
linux and Windows, but testing on linux. The linux FTP client doesn't
seem to use FTP_PROXY, and I can't ftp into the HTTP proxy server.
Post by Stefan Schwarzer
Can you directly log into the FTP server with a standalone FTP
client? It may be that the company indeed shut down the FTP proxy -
and you now can access the server directly. :-)
I used to be able to do this. :)
Post by Stefan Schwarzer
Does that help?
Yes, it does. I am operating outside my realm of expertise (if indeed
I have such a realm) and these are good questions to ask myself as I
work to solve this issue. Thank you for responding.

Here's what I've been able to make work outside of Firefox (I cobbled
this together from some Google searches):

import urllib2

# Route ftp:// URLs through the HTTP proxy.
ph = urllib2.ProxyHandler(
    {'ftp':'http://proxy_username:***@proxy:8080'})
# Credentials for answering the proxy's authentication challenge.
passmgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
passmgr.add_password(None,
    'http://proxy:8080', 'proxy_username', 'proxy_password')
au = urllib2.ProxyBasicAuthHandler(passmgr)
opener = urllib2.build_opener(ph, au)

index_fobj = opener.open(
    'ftp://ftp_username:***@ftpserver')
print index_fobj.read() # print an html-ized dir list

data_fobj = opener.open(
    'ftp://ftp_username:***@ftpserver/remote_file.dat')
open('remote_file.dat', 'wb').write(
    data_fobj.read()) # get file from ftp server
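
For readers on Python 3, where urllib2 was folded into urllib.request, a rough equivalent of the snippet above might look like the sketch below. The function names build_ftp_over_http_opener and download are hypothetical, and the proxy host, port, and credentials are placeholders. Whether it works depends on the proxy supporting FTP-over-HTTP and Basic proxy authentication.

```python
import shutil
import urllib.request


def build_ftp_over_http_opener(proxy_host, proxy_port, proxy_user, proxy_password):
    """Build an opener that sends ftp:// requests to an HTTP proxy."""
    proxy_url = "http://%s:%d" % (proxy_host, proxy_port)
    # Map the ftp scheme onto the HTTP proxy.
    proxy_handler = urllib.request.ProxyHandler({"ftp": proxy_url})
    # Answer the proxy's Basic authentication challenge (HTTP 407).
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, proxy_url, proxy_user, proxy_password)
    auth_handler = urllib.request.ProxyBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(proxy_handler, auth_handler)


def download(opener, ftp_url, local_path):
    """Stream the resource at ftp_url into local_path."""
    with opener.open(ftp_url) as response, open(local_path, "wb") as out:
        # Copy in chunks rather than reading the whole file into memory.
        shutil.copyfileobj(response, out)
```

Usage would be along the lines of opener = build_ftp_over_http_opener("proxy", 8080, "proxy_username", "proxy_password") followed by download(opener, "ftp://ftp_username:ftp_password@ftpserver/remote_file.dat", "remote_file.dat"), with real values substituted for the placeholders.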
Stefan Schwarzer
2007-03-26 16:07:12 UTC
Hi Alexander,
Post by Alexander Michael
Post by Stefan Schwarzer
Do you have any FTP proxies set in your browser configuration?
Yes. I set the HTTP proxy (the usual host "proxy" on port "8080") and
checked "Use this proxy server for all protocols."
The HTTP proxy and FTP proxy are independent of each other (with
the exception that the provider may use the same host for both
proxies, but even then you can treat the proxies as independent).
So if you use the ftp:// protocol, the HTTP proxy shouldn't
matter at all.
Post by Alexander Michael
Post by Stefan Schwarzer
If you are using a Unix-type operating system, you should also
look at the ftp_proxy environment variable.
I futzed with this, but couldn't get it to work. I'm working on both
linux and Windows, but testing on linux. The linux FTP client doesn't
seem to use FTP_PROXY, and I can't ftp into the HTTP proxy server.
To my surprise, the environment variable is usually ftp_proxy
(lowercase) though generally environment variables are all
uppercase. Please try the lowercase va
Post by Alexander Michael
Post by Stefan Schwarzer
Can you directly log into the FTP server with a standalone FTP
client? It may be that the company indeed shut down the FTP proxy -
and you now can access the server directly. :-)
I used to be able to do this. :)
And ... now, without any proxy settings? :-)
Post by Alexander Michael
Post by Stefan Schwarzer
Does that help?
Yes, it does. I am operating outside my realm of expertise (if indeed
I have such a realm) and these are good questions to ask myself as I
work to solve this issue. Thank you for responding.
Here's what I've been able to make work outside of Firefox (I cobbled
import urllib2
[...]

You shouldn't use complicated code unless you are sure you need
it. :-) I suggest trying the direct login first if you haven't
already done so.

Stefan
Alexander Michael
2007-03-27 14:33:54 UTC
Post by Stefan Schwarzer
The HTTP proxy and FTP proxy are independent [of] each other (with
the exception that the provider may use the same host for both
proxies, but also then you can treat the proxies as independent).
So if you use the ftp:// protocol, the HTTP proxy shouldn't
matter at all.
Actually, that's just it, they're not independent. With a little more
digging with Google, I've discovered that this mechanism for accessing
a remote FTP site over an HTTP proxy is called "FTP-over-HTTP" or
sometimes simply HFTP. Apparently, RFC 2616 (HTTP/1.1) accommodates
this mechanism for proxies and gateways with the Host header field.
When typing the "ftp" URL in Firefox configured to use a proxy,
Firefox actually sends the request via HTTP to the HTTP proxy server
and the proxy server acts as the FTP client with the remote FTP
server. The HTTP proxy then responds to the web browser via HTTP with
contents of the FTP URL.
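
To make the mechanism concrete: when a client talks to a proxy, the request line carries the full URL (the absolute URI form described in RFC 2616, section 5.1.2) instead of just a path, which is how the proxy learns that it should speak FTP to the remote server. The sketch below builds such a request as text. The function name build_proxy_request is hypothetical, and the exact set of headers a given browser sends will differ.

```python
import base64
from urllib.parse import urlsplit


def build_proxy_request(ftp_url, proxy_user=None, proxy_password=None):
    """Return the HTTP request text a client might send to the proxy."""
    host = urlsplit(ftp_url).hostname
    lines = [
        # The request line carries the full ftp:// URL, not just a path.
        "GET %s HTTP/1.1" % ftp_url,
        "Host: %s" % host,
    ]
    if proxy_user is not None:
        credentials = "%s:%s" % (proxy_user, proxy_password or "")
        token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
        # Basic credentials for the proxy itself, if it demands them.
        lines.append("Proxy-Authorization: Basic %s" % token)
    # A blank line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"
```

Printing build_proxy_request("ftp://ftpserver/remote_file.dat") shows a request whose first line begins with GET ftp://, which is what the HTTP proxy receives on port 8080 in the setup described here.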
Post by Stefan Schwarzer
To my surprise, the environment variable is usually ftp_proxy
(lowercase) though generally environment variables are all
uppercase. Please try the lowercase va
Well, the curl man page listed it in upper case, but this is a moot
point for my issue now that I understand it.
Post by Stefan Schwarzer
You shouldn't use complicated code unless you are sure you need
it. :-) I suggest trying the direct login first if you haven't
already done so.
Yeah. I was hoping to avoid this, but it has become clear to me what
is going on, and that I am forced to go this circuitous route.

I really appreciate the sanity checking here. Thanks for your help!

Alex

P.S. And yes, I am absolutely certain that our firewall is functioning
and that I must go through the proxy.
Stefan Schwarzer
2007-03-27 16:25:51 UTC
Hi Alexander,
Post by Alexander Michael
Post by Stefan Schwarzer
The HTTP proxy and FTP proxy are independent [of] each other (with
the exception that the provider may use the same host for both
proxies, but also then you can treat the proxies as independent).
So if you use the ftp:// protocol, the HTTP proxy shouldn't
matter at all.
Actually, that's just it, they're not independent. With a little more
digging with Google, I've discovered that this mechanism for accessing
a remote FTP site over an HTTP proxy is called "FTP-over-HTTP" or
sometimes simply HFTP. Apparently, RFC 2616 (HTTP/1.1) accommodates
this mechanism for proxies and gateways with the Host header field.
When typing the "ftp" URL in Firefox configured to use a proxy,
Firefox actually sends the request via HTTP to the HTTP proxy server
and the proxy server acts as the FTP client with the remote FTP
server. The HTTP proxy then responds to the web browser via HTTP with
contents of the FTP URL.
Funny, I didn't think of something like that. There's always
something to learn. :)
Post by Alexander Michael
Post by Stefan Schwarzer
To my surprise, the environment variable is usually ftp_proxy
(lowercase) though generally environment variables are all
uppercase. Please try the lowercase va
Well, the curl man page listed it in upper case, but this is a moot
point for my issue now that I understand it.
I've seen both spellings, but in recent years only(?) the lowercase
variant.
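
As a side note on the spelling question: Python's own urllib looks for variables named <scheme>_proxy and accepts either case, preferring the lowercase name when both are set. ftplib, as far as can be told from its documentation, does not consult these variables at all. The sketch below shows the lookup; lookup_ftp_proxy is a hypothetical helper name and the proxy URL is a placeholder.

```python
import os
import urllib.request


def lookup_ftp_proxy():
    """Return the FTP proxy URL urllib would pick up from the environment, or None."""
    return urllib.request.getproxies_environment().get("ftp")


# Placeholder value for demonstration; an existing setting is left untouched.
os.environ.setdefault("ftp_proxy", "http://proxy:8080")
print(lookup_ftp_proxy())
```

Other tools make their own choices about which spelling they honor, so checking each tool's man page remains the safest route.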
Post by Alexander Michael
Post by Stefan Schwarzer
You shouldn't use complicated code unless you are sure you need
it. :-) I suggest trying the direct login first if you haven't
already done so.
Yeah. I was hoping to avoid this, but it has become clear to me what
is going on, and that I am forced to go this circuitous route.
Good luck! :-) Would you mind posting the code you ended up with?

Stefan
