Announcing: The "404 Project"
We all know that web applications are the new firewall. However, so far we have had a hard time collecting web application logs. The hard part is balancing the ease of installing a sensor (without disrupting the web application), the fidelity of the log information, and privacy.
With firewall logs, it is pretty simple. A rejected packet in a firewall carries very little information, and privacy isn't a big issue. Web applications are different, as the actual "meat" of the log event is in the request content, which may contain personal information. Parsing web logs isn't so easy either, since administrators frequently customize log formats for special purposes.
To balance these different issues, we decided to focus on errors. Instead of parsing logs, we set up a little PHP script that you can add to your error page. In its current form, the script will work with PHP web servers (tested with Apache) that support the curl extension. Curl is installed by default in current versions of PHP.
Now all you need is an "error page". In Apache, just use the ErrorDocument configuration directive. For example:
ErrorDocument 404 /error.html
This will redirect users to "/error.html" in case of a 404 error [1]. You may already have a page like that configured. All you need to do is add the PHP snippet to the end; it sends us the intended URL, the user agent, and the IP address of the client accessing the missing page.
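For readers who want to see which fields are involved, here is a minimal sketch in Python (not the actual PHP snippet) of gathering the three values named above from a CGI-style environment. The function name `build_report` and the dictionary keys are hypothetical. It assumes Apache's usual behavior of exposing the originally requested path as `REDIRECT_URL` when an ErrorDocument handler runs, and it deliberately leaves out the submission step, since the collector's interface is not described here.

```python
def build_report(environ):
    """Collect the intended URL, user agent and client IP from a CGI-style environment.

    `environ` is any mapping of CGI variables (for example os.environ).
    The keys of the returned dict are illustrative, not a real submission format.
    """
    return {
        # Apache sets REDIRECT_URL to the originally requested path for
        # ErrorDocument handlers; fall back to REQUEST_URI if it is absent.
        "url": environ.get("REDIRECT_URL") or environ.get("REQUEST_URI", ""),
        "user_agent": environ.get("HTTP_USER_AGENT", ""),
        "ip": environ.get("REMOTE_ADDR", ""),
    }
```

Nothing beyond these three fields is read, which keeps the report close to what a firewall log would contain.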
The hope is to collect data from automated probes, similar to how DShield's firewall logs reflect port scan activity.
In particular, if you are running a personal or home web server, please consider adding the collector script.
Once we get a few submitters, we will start adding continuously updated reports to the site, just like we do for the DShield data. However, we can't do this until we have at least a dozen submitters (better 100 or more). We cannot publish "one-off" errors, as they will likely be specific to your site and again could cause privacy issues.
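To illustrate why the number of submitters matters, here is a rough sketch of one way "one-off" errors could be filtered out before publishing: a URL is reported only if several distinct submitters have seen it. This is an assumption for illustration, not a description of how the reports are actually built; the function name `publishable_urls` and the threshold parameter are hypothetical.

```python
from collections import defaultdict


def publishable_urls(reports, min_submitters=2):
    """Return the URLs reported by at least `min_submitters` distinct submitters.

    `reports` is an iterable of (submitter_id, url) pairs.
    """
    submitters_by_url = defaultdict(set)
    for submitter, url in reports:
        submitters_by_url[url].add(submitter)
    # A URL seen at only one site is likely site-specific, so it is dropped.
    return sorted(
        url for url, who in submitters_by_url.items() if len(who) >= min_submitters
    )
```

A URL that only one site ever reports, such as a mistyped link to a private page, never reaches the output, while a path that scanners request everywhere does.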
Why do we only support PHP? Well, that's the language I know. Feel free to submit a .Net/Java/Ruby/Perl or whatever version of the script.
Simple steps to sign up:
- Log in to retrieve your authentication key here: https://isc.sans.edu/myinfo.html
- Download the PHP snippet here: https://isc.sans.edu/tools/404project.html
- Paste it into your error document
- Test...
Please contact us if you have any questions.
[1] http://httpd.apache.org/docs/2.0/mod/core.html#errordocument
------
Johannes B. Ullrich, Ph.D.
SANS Technology Institute
Comments
PHP is very dangerous if you do not configure it correctly, and if you do not understand EXACTLY what your applications are capable of doing!
A home user??? PHP??? What were you thinking!
Sorry, but that is just crazy. Please change your article to warn would-be home server people of the dangers. For professionals, yes, fine, but I for one will not run PHP due to its hack history.
PHP does not even exist on my servers. I like it that way! I take the time to code in alternate languages. It is worth the time to learn, and the extra time to code.
PHP is also the language of choice for hackers, probably for two reasons:
1 - It is easy to learn.
2 - It is easy for the server administrator or the web application developer to make critical mistakes that can open the door to your data or your entire network.
Write that RUBY, PERL or APACHE Module in C now :-)
Al of Your Data Center
Jul 28th 2011
1 decade ago
If you don't need php, turn it off like any feature you don't need. But having a home web server to experiment is perfectly fine. Manage it well, monitor it, and make your mistakes with it before you start coding a real site with real customer data. DShield.org started out as an experiment like that. Just having it hosted in a "real datacenter" didn't make it any more secure.
Dr. J
Jul 28th 2011
1 decade ago
In the average home environment (there are many exceptions of course) you do not usually have all the technology you have in a data center to control flow, monitor traffic patterns, limit applications by their signature, etc. It is a risk, but true, the data the "visitor" gets will most likely be worthless to them.
The problem is when a machine gets owned. It can become a menace, like those IP addresses in China that attack regularly. Chances are the owners don't even know it is happening and are just pawns in many cases. Not all cases of course, but many.
So, yes I agree that it can be beneficial to test in a non-critical environment, but you also miss out on all of the great tools data center engineers have in place to control events.
Al of Your Data Center
Jul 28th 2011
1 decade ago
If we were running PHP I would already have recommended we re-write everything based just on the web app firewall data.
Although we whitelist everything to minimize our attack surface, we did make an exception and blacklisted requests looking for .php files and set an automatic IP Block on the source regardless of what the request contained. It's that prevalent.
JJ
Jul 28th 2011
1 decade ago
I know how to write Perl scripts, but not Perl scripts that run on a web server, and not Perl that I'd want to be probed by everybody and their uncle from the Internet.
I looked at the PHP and have about half of an idea of how to convert it to Perl. It would require libcurl and at least two modules, WWW::Curl and MIME::Base64, but that's as far as I got.
I don't know how to pull the variables out of Apache to get User Agent, Redirect URL, or Remote URL.
Also, I'm not sure if Perl has built-ins that would be better suited than calling out to libcurl.
Jason
Jul 28th 2011
1 decade ago
Nonetheless, my Apache web server writes a log file to /var/log/apache2/errors that can be post-processed to generate a report, similar to the DShield firewall logs stuff. Why do you need a script at all to run when the web server runs? Just batch process the logs in a cron job and ship them off to ISC.
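As a rough sketch of what such batch processing could look like, the Python below pulls 404 entries out of log lines. It assumes the lines are in Apache's common or combined access log format, which is not necessarily what the file mentioned above contains, and the names `LOG_LINE` and `extract_404s` are made up for this example. Shipping the results anywhere is left out.

```python
import re

# Matches Apache common/combined access log lines; the trailing
# referer and user-agent fields are optional (absent in common format).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def extract_404s(lines):
    """Return one dict per log line whose status code is 404."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            hits.append({
                "ip": match.group("ip"),
                "url": match.group("url"),
                "user_agent": match.group("agent") or "",
            })
    return hits
```

A cron job could call `extract_404s(open(path))` on a rotated log file, though customized log formats would need a different pattern.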
Moriah
Jul 28th 2011
1 decade ago
Create a list of valid environment variables. These are the ones you will want.
my @valid_ENV = ('REMOTE_HOST','REMOTE_ADDR','REMOTE_USER','HTTP_USER_AGENT');
...then filter on them to make sure nothing else gets by.
Example of how to use the variables:
my $host = $ENV{'REMOTE_ADDR'};
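The filtering step is only described in words above, so here is a small sketch of the same allowlist idea, written in Python rather than Perl. The names `VALID_ENV` and `filter_env` are hypothetical, and the variable list is simply copied from the Perl line.

```python
# Allowlist copied from the Perl example; anything else is discarded.
VALID_ENV = ("REMOTE_HOST", "REMOTE_ADDR", "REMOTE_USER", "HTTP_USER_AGENT")


def filter_env(environ, allowed=VALID_ENV):
    """Return a copy of `environ` containing only the allowed variable names."""
    return {name: environ[name] for name in allowed if name in environ}
```

Calling `filter_env(os.environ)` in a CGI script would keep cookies, query strings and other request data out of whatever is logged or sent on.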
Al of Your Data Center
Jul 28th 2011
1 decade ago
And while you are chilling... read a book on netiquette. All the unnecessary capitalization and punctuation marks are how 9-year-olds type.
HackDefendr.com
Jul 28th 2011
1 decade ago