This tutorial builds a web proxy server from scratch. It is split into 5 sections. Unlike the conventional proxy servers found on corporate firewalls, a web proxy does not require any configuration settings, such as a proxy IP and port, in your browser. A web proxy is simply a web application, such as a PHP script, that receives a request, opens a new connection to the requested resource, fetches the data, and sends it back to the client.
While coding such a proxy is fairly straightforward, the complexity comes from HTML and the HTTP protocol itself. You will need basic PHP skills, some basic regular expression skills, and a rudimentary understanding of HTML and HTTP.
Versions used in this example
How does a PHP web-based proxy work? The proxy is actually a PHP application/web page that performs the following steps:
1. Accept a URL from a web page (GET request).
2. Open a cURL connection to the URL you received.
3. Read the HTML.
4. Replace all the links in the HTML so that they point to your PHP application.
5. Return the output to the client.
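The steps above can be sketched roughly as follows. This is only a minimal outline, not the skeleton developed later in the tutorial: the helper names `is_http_url()` and `fetch_page()`, the `url` query parameter, and the timeout value are all assumptions made for illustration. Step 4 (link rewriting) is left as a comment.

```php
<?php
// Hypothetical helper: accept only http:// and https:// targets, so the
// proxy cannot be pointed at file:// or other schemes.
function is_http_url(string $url): bool
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}

// Hypothetical helper for steps 2 and 3: open a cURL connection and read
// the response body. Returns null on failure.
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // assumed timeout, in seconds
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Step 1: accept the URL from the GET request (parameter name is assumed).
$url = $_GET['url'] ?? null;
if (is_string($url)) {
    $html = is_http_url($url) ? fetch_page($url) : null;
    if ($html === null) {
        http_response_code(502);
        echo 'Could not fetch the requested page.';
    } else {
        // Step 4 would rewrite the links in $html here.
        echo $html; // Step 5: return the output to the client.
    }
}
```

Checking the scheme before fetching is a design choice added here, not something the original steps require; without it, anyone who can reach the script could ask it to read arbitrary resources.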
- As an example, let us assume that a user wishes to connect to a page at http://mysite.com/out.html. The contents of the file are given below.
- Let us also assume that the PHP proxy is hosted at http://127.0.0.1/php/proxy.php. You give it the URL you want to view in your browser, like below.
- Your proxy.php then receives the URL, opens a cURL connection to the host, and downloads the page.
- Then you use regular expressions or some other method to change all the URLs to point to your proxy. The URLs are usually base64 encoded to make them 'safe' to pass as a request parameter.
As you will notice, every URL in the page has been replaced by a link to the proxy, with the original address carried in encoded form after urlE.
The string after urlE contains the fully qualified URL. If we look at line 4, the aHR0cDo... string contains http://mysite.com/mysite.css encoded in base64.
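To make the rewriting step concrete, here is a minimal sketch of one way to do it with a regular expression. The function names `proxify()` and `rewrite_links()` are hypothetical, and the `?urlE=` query-string layout is an assumption based on the description above. The sketch only handles absolute http/https URLs in quoted `href` and `src` attributes; relative URLs would first have to be resolved against the address of the page being proxied.

```php
<?php
// Proxy location taken from the example in the text.
const PROXY_URL = 'http://127.0.0.1/php/proxy.php';

// Hypothetical helper: wrap an absolute URL so that it points at the proxy.
// Base64 output can contain '+', '/' and '=', so it is URL-encoded as well.
function proxify(string $absoluteUrl): string
{
    return PROXY_URL . '?urlE=' . urlencode(base64_encode($absoluteUrl));
}

// Hypothetical helper: rewrite quoted href/src attributes that hold
// absolute http(s) URLs. Everything else is left untouched.
function rewrite_links(string $html): string
{
    $pattern = '~\b(href|src)\s*=\s*(["\'])(https?://[^"\']+)\2~i';
    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Undo HTML entities such as &amp; before encoding the URL.
            $target = html_entity_decode($m[3], ENT_QUOTES);
            return $m[1] . '=' . $m[2] . proxify($target) . $m[2];
        },
        $html
    );
    // preg_replace_callback() returns null on a regex error; fall back
    // to the unmodified page in that case.
    return $result ?? $html;
}

// Sample input is an assumption, built around the stylesheet URL from the text.
$sample = '<link rel="stylesheet" href="http://mysite.com/mysite.css">';
echo rewrite_links($sample), "\n";
```

A regular expression is the approach the text mentions, but it is fragile on real-world HTML (unquoted attributes, URLs inside CSS or JavaScript). An HTML parser such as PHP's DOMDocument would be a sturdier alternative for the attribute case.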
- Now send this changed output back to the requesting browser.
- Now when your script receives a request such as the one below,
You will have to base64-decode the string and make a cURL request to the resulting URL.
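The decoding side might look something like the sketch below. The helper name `decode_target()`, the error responses, and the scheme check are assumptions added for illustration; only the `urlE` name and the base64 encoding come from the description above. Note that PHP has already URL-decoded the values in `$_GET`, so only the base64 layer needs to be undone.

```php
<?php
// Hypothetical helper: turn the urlE value back into the original URL.
// Returns null if the value is not valid base64 or is not an http(s) URL.
function decode_target(string $encoded): ?string
{
    $url = base64_decode($encoded, true); // strict mode rejects invalid characters
    if ($url === false) {
        return null;
    }
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme)
        || !in_array(strtolower($scheme), ['http', 'https'], true)) {
        return null;
    }
    return $url;
}

$encoded = $_GET['urlE'] ?? null;
if (is_string($encoded)) {
    $target = decode_target($encoded);
    if ($target === null) {
        http_response_code(400);
        echo 'Invalid urlE parameter.';
    } else {
        // Fetch the decoded URL with cURL and relay the body.
        $ch = curl_init($target);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $body = curl_exec($ch);
        if ($body === false) {
            http_response_code(502);
            echo 'Could not fetch the requested resource.';
        } else {
            // A fuller proxy would also forward the Content-Type header
            // so that CSS, JS, and images are interpreted correctly.
            echo $body;
        }
    }
}
```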
As you can see, because all the URLs are rewritten, all requests for CSS and JS files go through your proxy. All links now go via your proxy too.
In the next part we will look at the base cURL proxy skeleton to get us started.