You know what? Setting up an HTTP forward proxy (proxy) with ruby is really simple. The only thing you need is webrick’s HTTPProxyServer class.
If you don’t know what a proxy is, please have a look at this article on Wikipedia. In brief: An HTTP proxy is an intermediary server. It is located between the HTTP client and the HTTP server. Using this kind of architecture makes it easier to do virus scanning etc. of web traffic.
The HTTP client sends requests to the HTTP proxy which forwards those requests to the HTTP server – see Fig. 1. The responses are sent back to the proxy which sends them to a virus scanner server or even directly to the client.
Well-known HTTP clients are “Firefox”, a web browser, and “cURL” (curl), a very good HTTP client for the command line. A good and well-known HTTP proxy is “Squid”. On the server side you can use, for instance, “Apache HTTP server”, “nginx HTTP server” or “MS IIS”.
My use case
I’m working on two rubygems – proxy_pac_rb and proxy_rb – which provide helpers for rspec tests. The target group of those gems is proxy administrators who want to use rspec to test their proxy infrastructure. In order to develop proxy_rb and run tests without an external network connection, a local proxy is required. I’m not that keen on using “Squid” for this use case because I would then need to install it on travis-ci and maintain its configuration. Instead I would like to have something much simpler and easier to extend.
Why use “webrick”
I was quite happy when I found out that webrick supports an HTTP proxy server. I decided to give it a try for the following reasons.
- It’s written in ruby and it’s part of ruby’s standard library (since ruby 3.0 it ships as a separate gem)
- It supports HTTP’s “GET”- and “CONNECT”-methods – the latter is required to forward encrypted HTTP traffic (HTTPS)
- It supports forwarding requests to another proxy plus proxy authentication against that proxy
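To make the difference between the two methods more tangible, here is a minimal sketch of the request line a client sends to a forward proxy. The helper name proxy_request_line is made up for this illustration and is not part of webrick. It only builds strings and does not open any connection.

```ruby
require 'uri'

# Hypothetical helper: build the first line of the request that a client
# sends to a forward proxy for the given target URL.
def proxy_request_line(url)
  uri = URI(url)
  if uri.scheme == 'https'
    # For HTTPS the client asks the proxy to open a tunnel to host:port.
    "CONNECT #{uri.host}:#{uri.port} HTTP/1.1"
  else
    # For plain HTTP the client sends the full (absolute) URL.
    "GET #{uri} HTTP/1.1"
  end
end

puts proxy_request_line('http://localhost:8000/')
# prints: GET http://localhost:8000/ HTTP/1.1
puts proxy_request_line('https://localhost:8443/')
# prints: CONNECT localhost:8443 HTTP/1.1
```

After a successful “CONNECT” the proxy only passes bytes back and forth, which is why it can forward encrypted traffic without being able to read it.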
Architectures
I will cover a few architectures in this article. Each architecture makes use of the same test infrastructure – e.g. web servers – which I will describe in the next section. There’s also a section at the end of this article about issues I had with webrick.
- Simple Forward Proxy
- Forward Proxy Requiring Authentication
- Proxy Chain with Upstream and Downstream Proxy
- Downstream Proxy With Authentication Against Upstream Proxy
Common Test Infrastructure
Fig. 2 illustrates the infrastructure which I used for all my tests. It’s made up of an HTTP client, an HTTP server and an HTTPS server. In all other figures both servers are represented by a single icon labeled with “HTTP(S) servers”. I am going to use curl as HTTP client. webrick is used as HTTP server and HTTPS server.
Creating the test directory
To start with the tests, please create a temporary directory and make it your current working directory.
# Create the test directory
mkdir -p ~/tmp/proxy_test
# Make the directory your current working directory
cd ~/tmp/proxy_test
Installing the “webrick” gem
gem install webrick
Creating the HTTP server
Create a file named http_server.rb within the test directory with the following content. The script will start a webrick HTTP server which listens on port “8000”.
#!/usr/bin/env ruby
require 'webrick'
server = WEBrick::HTTPServer.new(
  :Port => 8000,
)
server.mount_proc '/' do |req, res|
  res.body = 'Example Domain Cleartext'
end
server.start
After that, make the script executable.
chmod +x http_server.rb
Creating the HTTPS server
Next, create a file named https_server.rb – please mind the “s” at the end of “https” – with the following content. The script will start an HTTPS server listening on port “8443” using a self-signed certificate – see the webrick documentation for more information.
#!/usr/bin/env ruby
require 'webrick'
require 'webrick/https'
cert_name = [
  %w[CN localhost],
]
server = WEBrick::HTTPServer.new(
  :Port => 8443,
  :SSLEnable => true,
  :SSLCertName => cert_name
)
server.mount_proc '/' do |req, res|
  res.body = 'Example Domain Encrypted'
end
server.start
If you pass the HTTPServer class an SSLCertName parameter, it will be used as the “Distinguished Name” of the generated certificate.
cert_name = [
  %w[CN localhost],
]
server = WEBrick::HTTPServer.new(
  :Port => 8443,
  :SSLEnable => true,
  :SSLCertName => cert_name
)
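If you are curious what such a name looks like once it is turned into a “Distinguished Name”, the following small sketch feeds the same array into ruby’s openssl library. That webrick builds the certificate subject in a similar way is my assumption. The snippet itself only uses openssl and does not start a server.

```ruby
require 'openssl'

# Same structure as the SSLCertName parameter above:
# a list of [attribute, value] pairs.
cert_name = [
  %w[CN localhost],
]

# Build an X.509 name from the pairs and print its string form.
subject = OpenSSL::X509::Name.new(cert_name)
puts subject.to_s
# prints: /CN=localhost
```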
After that, make this script executable as well.
chmod +x https_server.rb
Starting the servers
Start both the HTTP server and the HTTPS server each in a separate terminal.
# Start HTTP server
./http_server.rb
# => [2015-04-16 07:19:17] INFO WEBrick 1.3.1
# => [2015-04-16 07:19:17] INFO ruby 2.2.1 (2015-04-16) [x86_64-linux]
# => [2015-04-16 07:19:17] INFO WEBrick::HTTPServer#start: pid=15600 port=8000
You will get a slightly different output when you start the HTTPS server.
webrick will generate a self-signed SSL server certificate on the fly when you start it.
# Start HTTPS server
./https_server.rb
# => [2015-04-16 07:19:19] INFO WEBrick 1.3.1
# => [2015-04-16 07:19:19] INFO ruby 2.2.1 (2015-04-16) [x86_64-linux]
# => ................++++++
# => ...................++++++
# => [2015-04-16 07:19:19] INFO
# => Certificate:
# => Data:
# => Version: 3 (0x2)
# => [...]
# => [2015-04-16 07:19:19] INFO WEBrick::HTTPServer#start: pid=15616 port=8443
Proxy Architectures
Make sure you’ve got both the HTTP and HTTPS servers running before you read any further. Please follow the instructions found in the previous section, if you started reading here.
SIMPLE FORWARD PROXY
Fig. 3 illustrates a simple forward proxy infrastructure. It’s made up of an HTTP client, a single HTTP proxy and the HTTP(S) servers.
Creating the proxy
Create a file named proxy.rb with the following content.
#!/usr/bin/env ruby
require 'webrick'
require 'webrick/httpproxy'
proxy = WEBrick::HTTPProxyServer.new Port: 8080
trap 'INT' do proxy.shutdown end
trap 'TERM' do proxy.shutdown end
proxy.start
After that, make it executable.
chmod +x proxy.rb
Starting the proxy
After you have successfully created the file, run the proxy.
./proxy.rb
# => [2015-04-16 07:21:12] INFO WEBrick 1.3.1
# => [2015-04-16 07:21:12] INFO ruby 2.2.1 (2015-04-16) [x86_64-linux]
# => [2015-04-16 07:21:12] INFO WEBrick::HTTPProxyServer#start: pid=15771 port=8080
Sending requests
Make sure you pass curl the fully qualified URLs “http://localhost:8000” and “https://localhost:8443”. Please also add -k to the curl command to prevent errors caused by the self-signed SSL certificate. Send the requests to the proxy, which listens on port “8080”, via curl’s -x parameter.
curl -x localhost:8080 http://localhost:8000
# => Example Domain Cleartext
curl -k -x localhost:8080 https://localhost:8443
# => Example Domain Encrypted
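If you prefer ruby over curl, for example inside a test, the same two requests can be sketched with Net::HTTP from ruby’s standard library. The helper names proxied_client and fetch_via_proxy are made up for this example, and the proxy address is assumed to be “localhost:8080” as above. Disabling certificate verification mirrors curl’s -k and is only acceptable for a local test setup like this one.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: build a Net::HTTP client which talks to the target
# through the proxy (the third and fourth arguments are the proxy address).
def proxied_client(url, proxy_host: 'localhost', proxy_port: 8080)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port, proxy_host, proxy_port)
  if uri.scheme == 'https'
    http.use_ssl = true
    # Equivalent of "curl -k": accept the self-signed test certificate.
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  end
  http
end

# Hypothetical helper: fetch the body of the given URL via the proxy.
def fetch_via_proxy(url)
  uri = URI(url)
  proxied_client(url).get(uri.request_uri).body
end

# With the servers and the proxy running:
#   fetch_via_proxy('http://localhost:8000/')
#   fetch_via_proxy('https://localhost:8443/')
```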
You successfully created a forward proxy using webrick. Congratulations!
FORWARD PROXY REQUIRING AUTHENTICATION
Fig. 4 shows a simple forward proxy infrastructure where the client needs to authenticate itself against the proxy. It’s made up of an HTTP client, an HTTP proxy and the HTTP(S) servers.
Creating the proxy
Create a file named proxy.rb with the following content.
#!/usr/bin/env ruby
require 'webrick'
require 'webrick/httpproxy'
# Apache compatible Password manager
htpasswd = WEBrick::HTTPAuth::Htpasswd.new File.expand_path('../htpasswd', __FILE__)
# Create entry with username and password, the password is "crypt" encrypted
htpasswd.set_passwd 'Proxy Realm', 'user', 'password'
# Write file to disk
htpasswd.flush
# Authenticator
authenticator = WEBrick::HTTPAuth::ProxyBasicAuth.new(
  Realm: 'Proxy Realm',
  UserDB: htpasswd
)
proxy = WEBrick::HTTPProxyServer.new Port: 8080, ProxyAuthProc: authenticator.method(:authenticate).to_proc
trap 'INT' do proxy.shutdown end
trap 'TERM' do proxy.shutdown end
proxy.start
The Htpasswd class provides access to an htpasswd file. In the current implementation of webrick only the crypt algorithm is supported. None of the other algorithms supported by apache is supported by webrick yet.
# Apache compatible Password manager
htpasswd = WEBrick::HTTPAuth::Htpasswd.new File.expand_path('../htpasswd', __FILE__)
Within that script we also create a user “user” with password “password” for our tests (#set_passwd) and store the information on the hard drive (#flush).
# Create entry with username and password, the password is "crypt" encrypted
htpasswd.set_passwd 'Proxy Realm', 'user', 'password'
# Write file to disk
htpasswd.flush
To make use of the htpasswd file you need to pass it to a webrick authenticator – e.g. ProxyBasicAuth.
# Authenticator
authenticator = WEBrick::HTTPAuth::ProxyBasicAuth.new(
  Realm: 'Proxy Realm',
  UserDB: htpasswd
)
To activate the authentication you need to pass HTTPProxyServer a proc object via the ProxyAuthProc parameter. That’s why we need to convert the authenticate method of our ProxyBasicAuth object into a proc object.
proxy = WEBrick::HTTPProxyServer.new Port: 8080, ProxyAuthProc: authenticator.method(:authenticate).to_proc
After that, make the proxy.rb file executable.
chmod +x proxy.rb
Starting the proxy
After you have successfully created the file, run the proxy.
./proxy.rb
# => [2015-04-16 07:34:09] INFO WEBrick 1.3.1
# => [2015-04-16 07:34:09] INFO ruby 2.2.1 (2015-04-16) [x86_64-linux]
# => [2015-04-16 07:34:09] INFO WEBrick::HTTPProxyServer#start: pid=3650 port=8080
Sending requests
At first we will send a request without any credentials. The request will fail with HTTP status code “407” – Proxy Authentication Required.
curl -x localhost:8080 http://localhost:8000
# => <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
# => <HTML>
# => <HEAD><TITLE>Proxy Authentication Required</TITLE></HEAD>
# => <BODY>
# => <H1>Proxy Authentication Required</H1>
# => WEBrick::HTTPStatus::ProxyAuthenticationRequired
# => <HR>
# => <ADDRESS>
# => WEBrick/1.3.1 (Ruby/2.2.1/2015-04-16) at
# => localhost:8000
# => </ADDRESS>
# => </BODY>
# => </HTML>
After that, we will invoke curl with the correct credentials and the request will succeed.
curl -x localhost:8080 -U user:password http://localhost:8000
# => Example Domain Cleartext
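Behind the scenes curl’s -U option adds a “Proxy-Authorization” header using HTTP Basic authentication, which is just the Base64-encoded string “user:password”. The following sketch shows how that header value is built. The helper name proxy_authorization is made up for this example.

```ruby
# Hypothetical helper: build the value of the "Proxy-Authorization" header
# for HTTP Basic authentication.
def proxy_authorization(user, password)
  # pack('m0') Base64-encodes the string without a trailing newline.
  "Basic #{["#{user}:#{password}"].pack('m0')}"
end

puts proxy_authorization('user', 'password')
# prints: Basic dXNlcjpwYXNzd29yZA==
```

Keep in mind that Base64 is an encoding, not encryption: anyone who can read the traffic between client and proxy can recover the password.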
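If you enter a password which starts with the correct one, the authentication will also succeed. That’s why “passwordlonger” is also accepted as password by webrick. I consider this a flaw of the crypt algorithm used: the traditional DES-based crypt only looks at the first eight characters of a password, and “password” is exactly eight characters long.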
curl -x localhost:8080 -U user:passwordlonger http://localhost:8000
# => Example Domain Cleartext
PROXY CHAIN WITH UPSTREAM AND DOWNSTREAM HTTP PROXY
Fig. 5 shows a forward proxy chain. It’s made up of an HTTP client, a downstream HTTP proxy, an upstream HTTP proxy and the HTTP(S) servers.
Creating the downstream HTTP proxy
Create a file named proxy-downstream.rb with the following content.
#!/usr/bin/env ruby
require 'webrick'
require 'webrick/httpproxy'
require 'uri'
proxy = WEBrick::HTTPProxyServer.new Port: 8080, ProxyURI: URI('http://localhost:9090')
trap 'INT' do proxy.shutdown end
trap 'TERM' do proxy.shutdown end
proxy.start
The most interesting part of the script is the following line. Using the ProxyURI parameter, you can tell webrick to use an upstream proxy. Since webrick depends on the URI API for this parameter, it’s important to use URI, Addressable::URI or something which has a compatible API.
proxy = WEBrick::HTTPProxyServer.new Port: 8080, ProxyURI: URI('http://localhost:9090')
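To get a feeling for what “compatible API” means here, the following sketch prints the parts of a URI object that matter for an upstream proxy. That webrick mainly cares about host, port and userinfo is my assumption, based on the examples in this article.

```ruby
require 'uri'

# The upstream proxy from the example above, without credentials.
upstream = URI('http://localhost:9090')
puts upstream.host
# prints: localhost
puts upstream.port
# prints: 9090
p upstream.userinfo
# prints: nil

# The same proxy with credentials embedded in the URI.
with_login = URI('http://user:password@localhost:9090')
puts with_login.userinfo
# prints: user:password
```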
After that, make it executable.
chmod +x proxy-downstream.rb
Creating the upstream HTTP proxy
Create a file named proxy-upstream.rb with the following content.
#!/usr/bin/env ruby
require 'webrick'
require 'webrick/httpproxy'
proxy = WEBrick::HTTPProxyServer.new Port: 9090
trap 'INT' do proxy.shutdown end
trap 'TERM' do proxy.shutdown end
proxy.start
After that, make it executable.
chmod +x proxy-upstream.rb
Starting the proxies
Next start both proxies each in a separate terminal.
# Start downstream HTTP proxy
./proxy-downstream.rb
# => [2015-04-16 07:22:02] INFO WEBrick 1.3.1
# => [2015-04-16 07:22:02] INFO ruby 2.2.1 (2015-04-16) [x86_64-linux]
# => [2015-04-16 07:22:02] INFO WEBrick::HTTPProxyServer#start: pid=15919 port=8080
# Start upstream HTTP proxy
./proxy-upstream.rb
# => [2015-04-16 07:21:59] INFO WEBrick 1.3.1
# => [2015-04-16 07:21:59] INFO ruby 2.2.1 (2015-04-16) [x86_64-linux]
# => [2015-04-16 07:21:59] INFO WEBrick::HTTPProxyServer#start: pid=15913 port=9090
Sending Request
After that, we will try to access “http://localhost:8000” via the downstream proxy.
curl -x localhost:8080 http://localhost:8000
# => Example Domain Cleartext
And to make sure it works with SSL/TLS as well, we will try to access “https://localhost:8443” via the downstream HTTP proxy.
curl -k -x localhost:8080 https://localhost:8443
# => Example Domain Encrypted
It works! Good job, mate.
DOWNSTREAM PROXY WITH AUTHENTICATION AGAINST UPSTREAM PROXY
Fig. 6 illustrates an infrastructure where the downstream proxy provides username + password to the upstream proxy for each request. The infrastructure is made up of an HTTP client, a downstream HTTP proxy, an upstream HTTP proxy and the HTTP(S) servers.
Creating the downstream HTTP proxy
Create a file named proxy-downstream.rb with the following content.
#!/usr/bin/env ruby
require 'webrick'
require 'webrick/httpproxy'
require 'ostruct'
uri = OpenStruct.new
uri.userinfo = 'user:password'
uri.host = 'localhost'
uri.port = 9090
proxy = WEBrick::HTTPProxyServer.new Port: 8080, ProxyURI: uri
trap 'INT' do proxy.shutdown end
trap 'TERM' do proxy.shutdown end
proxy.start
By adding a username and a password to the ProxyURI parameter, you make webrick add this information to each request which is forwarded to the upstream proxy.
If you need to comply with password policies which require special characters in your password, you might want to consider using OpenStruct instead of good old URI. Just make sure the object you create is API-compatible with URI.
Neither URI nor Addressable::URI accepts non-conforming URIs. Using OpenStruct enables you to use passwords which are not valid according to RFC 2396 – Uniform Resource Identifiers (URI): Generic Syntax.
uri = OpenStruct.new
uri.userinfo = 'user:password'
uri.host = 'localhost'
uri.port = 9090
proxy = WEBrick::HTTPProxyServer.new Port: 8080, ProxyURI: uri
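The following sketch illustrates the problem with a made-up password containing an “@” and a space. Instead of OpenStruct it uses a plain Struct called UpstreamProxy, which is a name I invented for this example. It is a stricter alternative and rests on the assumption that webrick only reads userinfo, host and port from the object, as the OpenStruct example above suggests.

```ruby
require 'uri'

# Hypothetical stand-in for a URI object: it only offers the three
# attributes used in the OpenStruct example above.
UpstreamProxy = Struct.new(:userinfo, :host, :port)

# Made-up password with characters that are not allowed unescaped in a URI.
password = 'p@ss word'

# URI refuses to parse a URI with such a password ...
begin
  URI("http://user:#{password}@localhost:9090")
  uri_accepted = true
rescue URI::InvalidURIError
  uri_accepted = false
end
puts "URI accepted the raw password: #{uri_accepted}"
# prints: URI accepted the raw password: false

# ... whereas a simple value object takes it as is.
upstream = UpstreamProxy.new("user:#{password}", 'localhost', 9090)
puts upstream.userinfo
# prints: user:p@ss word
```

An object like upstream could then be passed as ProxyURI in the same way as the OpenStruct object.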
After creating the file, make it executable.
chmod +x proxy-downstream.rb
Creating the upstream HTTP proxy
Create a file named proxy-upstream.rb with the following content.
#!/usr/bin/env ruby
require 'webrick'
require 'webrick/httpproxy'
# Apache compatible Password manager
htpasswd = WEBrick::HTTPAuth::Htpasswd.new File.expand_path('../htpasswd', __FILE__)
# Create entry with username and password, the password is "crypt" encrypted
htpasswd.set_passwd 'Proxy Realm', 'user', 'password'
# Write file to disk
htpasswd.flush
# Authenticator
authenticator = WEBrick::HTTPAuth::ProxyBasicAuth.new(
  Realm: 'Proxy Realm',
  UserDB: htpasswd
)
proxy = WEBrick::HTTPProxyServer.new Port: 9090, ProxyAuthProc: authenticator.method(:authenticate).to_proc
trap 'INT' do proxy.shutdown end
trap 'TERM' do proxy.shutdown end
proxy.start
After that, make it executable.
chmod +x proxy-upstream.rb
Starting the proxies
Next start both proxies in separate terminals.
# Start downstream HTTP proxy
./proxy-downstream.rb
# => [2015-04-16 07:22:59] INFO WEBrick 1.3.1
# => [2015-04-16 07:22:59] INFO ruby 2.2.1 (2015-04-16) [x86_64-linux]
# => [2015-04-16 07:22:59] INFO WEBrick::HTTPProxyServer#start: pid=16028 port=8080
# Start upstream HTTP proxy
./proxy-upstream.rb
# => [2015-04-16 07:22:53] INFO WEBrick 1.3.1
# => [2015-04-16 07:22:53] INFO ruby 2.2.1 (2015-04-16) [x86_64-linux]
# => [2015-04-16 07:22:53] INFO WEBrick::HTTPProxyServer#start: pid=16002 port=9090
Sending Request
After that, we will try to access “http://localhost:8000” via the downstream proxy.
curl -x localhost:8080 http://localhost:8000
# => Example Domain Cleartext
And to make sure it works with SSL/TLS as well, we will try to access “https://localhost:8443” via the downstream HTTP proxy.
curl -k -x localhost:8080 https://localhost:8443
# => Example Domain Encrypted
It works! ☺
BONUS: Creating a user.service-unit for systemd-user
If you follow the steps given below, you will get a proxy daemon which runs with your user id. This approach is based on systemd-user – see the Arch Linux Wiki for more information on this topic. If you want to run this solution on your local machine to authenticate against some company proxy, please change the host and username/password information as needed. And PLEASE make sure that you do NOT violate your company rules by using this solution. I am NOT liable if you come into conflict with those rules.
First, create the bin directory in your HOME directory.
mkdir -p ~/bin
Then copy the file proxy-downstream.rb to ~/bin/proxy and make sure it is executable.
# Copy file
cp proxy-downstream.rb ~/bin/proxy
# Make it executable
chmod +x ~/bin/proxy
After that, create a service unit at ~/.config/systemd/user/proxy.service. Please change the <userid> string in the service unit to match your user id.
[Unit]
Description=Web Proxy Cache Server
[Service]
ExecStart=/home/<userid>/bin/proxy
[Install]
WantedBy=default.target
Next you need to enable and start that service.
# Enable
systemctl --user enable proxy
# Start
systemctl --user start proxy
After that we can send a request to the local proxy.
curl -x localhost:8080 https://example.org
If you run the status command of systemctl, you will see that the proxy is running and has forwarded your request.
systemctl --user status proxy
# => ● proxy.service - Web Proxy Cache Server
# => Loaded: loaded (/home/<userid>/.config/systemd/user/proxy.service; enabled; vendor preset: enabled)
# => Active: active (running) since Thu 2015-04-16 07:30:14 CEST; 0 day 0h ago
# => Main PID: 4090 (ruby)
# => CGroup: /user.slice/user-1000.slice/user@1000.service/proxy.service
# => └─4090 ruby /home/<userid>/bin/proxy
# =>
# => Apr 16 07:31:00 localhost proxy[4090]: localhost.localdomain - - [16/Apr/2015:07:31:00 CEST] "CONNECT www.example.org:443 HTTP/1.1" 200 0
# => Apr 16 07:31:00 localhost proxy[4090]: - -> www.example.org:8443
Troubleshooting
Fully qualified URL
At first I was a little bit disappointed, because the first request I made to the webrick proxy failed.
curl -x localhost:8080 localhost:8000
The output was not what I expected: Not Found. What the hell?
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<HTML>
<HEAD><TITLE>Not Found</TITLE></HEAD>
<BODY>
<H1>Not Found</H1>
`/' not found.
<HR>
<ADDRESS>
WEBrick/1.3.1 (Ruby/2.2.1/2015-04-16) at
localhost:8000
</ADDRESS>
</BODY>
</HTML>
The output of the DEBUG logger I added to the HTTPProxyServer instance did not help that much.
#!/usr/bin/env ruby
require 'webrick'
require 'webrick/httpproxy'
require 'logger'
logger = Logger.new($stderr)
logger.level = Logger::DEBUG
proxy = WEBrick::HTTPProxyServer.new Port: 8080, Logger: logger
trap 'INT' do proxy.shutdown end
trap 'TERM' do proxy.shutdown end
proxy.start
Sending the request again, I couldn’t figure out what was wrong.
curl -x localhost:8080 localhost:8000
I got this output.
% ./proxy.rb
I, [2015-04-17T07:24:34.163790 #16247] INFO -- : WEBrick 1.3.1
I, [2015-04-17T07:24:34.163912 #16247] INFO -- : ruby 2.2.1 (2015-04-16) [x86_64-linux]
I, [2015-04-17T07:24:34.164231 #16247] INFO -- : WEBrick::HTTPProxyServer#start: pid=16247 port=8080
D, [2015-04-17T07:24:59.846018 #16247] DEBUG -- : accept: ::1:38334
E, [2015-04-17T07:24:59.847385 #16247] ERROR -- : `/' not found.
localhost.localdomain - - [17/Apr/2015:07:24:59 CEST] "GET HTTP://localhost:8000/ HTTP/1.1" 404 270
- -> HTTP://localhost:8000/
D, [2015-04-17T07:24:59.847876 #16247] DEBUG -- : close: ::1:38334
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<HTML>
<HEAD><TITLE>Not Found</TITLE></HEAD>
<BODY>
<H1>Not Found</H1>
`/' not found.
<HR>
<ADDRESS>
WEBrick/1.3.1 (Ruby/2.2.1/2015-04-16) at
localhost:8000
</ADDRESS>
</BODY>
</HTML>
Adding the -v option to the curl command gave me the information I needed to understand what the problem was.
curl -v -x localhost:8080 localhost:8000
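The request line “GET HTTP://localhost:8000/ HTTP/1.1” looked strange: curl had added the scheme itself, in upper case, and the proxy answered the request with “404” instead of forwarding it.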
% curl -v -x localhost:8080 localhost:8000
* Rebuilt URL to: localhost:8000/
* Trying ::1...
* Connected to localhost (::1) port 8080 (#0)
> GET HTTP://localhost:8000/ HTTP/1.1
> User-Agent: curl/7.41.0
> Host: localhost:8000
> Accept: */*
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 404 Not Found
< Content-Type: text/html; charset=ISO-8859-1
< Server: WEBrick/1.3.1 (Ruby/2.2.1/2015-04-16)
< Date: Fri, 16 Apr 2015 05:26:01 GMT
< Content-Length: 270
< Connection: close
<
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<HTML>
<HEAD><TITLE>Not Found</TITLE></HEAD>
<BODY>
<H1>Not Found</H1>
`/' not found.
<HR>
<ADDRESS>
WEBrick/1.3.1 (Ruby/2.2.1/2015-04-16) at
localhost:8000
</ADDRESS>
</BODY>
</HTML>
* Closing connection 0
I added http:// to the URL and everything worked.
% curl -v -x localhost:8080 http://localhost:8000
* Rebuilt URL to: http://localhost:8000/
* Trying ::1...
* Connected to localhost (::1) port 8080 (#0)
> GET http://localhost:8000/ HTTP/1.1
> User-Agent: curl/7.41.0
> Host: localhost:8000
> Accept: */*
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 200 OK
* HTTP/1.1 proxy connection set close!
< Proxy-Connection: close
< Connection: close
< Server: WEBrick/1.3.1 (Ruby/2.2.1/2015-04-16)
< Date: Fri, 16 Apr 2015 05:27:11 GMT
< Content-Length: 24
< Via: 1.1 vrml0067_01.in.vrnetze.de:8080
<
* Closing connection 0
Example Domain Cleartext
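If you build the URLs in a script or a test, you can guard against this problem by normalizing them before sending the request. The following is only a sketch: the helper name fully_qualified and the SCHEME pattern are made up for this example, and defaulting to “http://” is my own choice.

```ruby
# Matches a leading scheme such as "http://" or "HTTPS://".
SCHEME = %r{\A[a-z][a-z0-9+.-]*://}i

# Hypothetical helper: make sure a URL starts with a lower-case scheme.
def fully_qualified(url)
  # No scheme at all: assume plain HTTP.
  return "http://#{url}" unless url.match?(SCHEME)

  # Scheme present: only normalize its case.
  url.sub(SCHEME) { |scheme| scheme.downcase }
end

puts fully_qualified('localhost:8000')
# prints: http://localhost:8000
puts fully_qualified('HTTP://localhost:8000/')
# prints: http://localhost:8000/
puts fully_qualified('https://localhost:8443')
# prints: https://localhost:8443
```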
Startup of servers and proxies
You may wonder why it sometimes takes minutes for webrick to start up. If your computer is connected to a network, DNS resolution needs to work. Otherwise it takes longer to start up the HTTP(S) servers and the proxies: webrick tries to determine its hostname. If the configured DNS servers are not reachable, webrick will wait until the connection to the DNS server(s) times out before starting up.
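To check whether name resolution is the culprit, you can measure how long a lookup takes. The sketch below uses ruby’s socket library. The helper name lookup_seconds is made up, and whether webrick performs exactly this kind of lookup at startup is an assumption on my side.

```ruby
require 'socket'

# Hypothetical helper: return the number of seconds it takes to resolve
# the given host name (by default the name of this machine).
def lookup_seconds(host = Socket.gethostname)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  begin
    Socket.getaddrinfo(host, nil, nil, Socket::SOCK_STREAM)
  rescue SocketError
    # The name could not be resolved; the elapsed time is still of interest.
  end
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Usage:
#   puts format('Lookup took %.3f seconds', lookup_seconds)
```

If the reported time is in the range of several seconds, fixing the DNS configuration or adding the hostname to /etc/hosts is likely to speed up the start of the servers and proxies.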
Conclusion
webrick is not a perfect solution for setting up an HTTP proxy, but it is easy to use and has all the features needed for my use case.
The End
Thanks for reading!