As the system administrator of a school, you constantly face the question of how much Internet content you should filter. The question comes up wherever children and young people have access to the Internet, whether in schools, clubs, libraries, at home or in a public institution. Opinions on the subject vary widely. There is no 100% protection, so it is much more important to teach children and young people how to use the Internet responsibly. This is a big challenge and takes time. Parents and educators face this task and often do not know how best to approach it. Especially in schools, where you can’t always keep an eye on every screen, a web filter is a great help. In some countries, a web filter for schools is even required by law. And sometimes it’s just about blocking certain websites, such as Facebook, Netflix & Co. In this tutorial I would therefore like to show you how to set up a pfSense web filter.




Preliminary Remarks

pfSense is a widely used open-source firewall that we use at our school. With the help of Squid (a proxy server) and SquidGuard (the actual web filter), we want to filter HTTP and HTTPS connections. For this tutorial we first need a working pfSense installation; if you do not have one yet, download pfSense and install it first.

How it works

Filtering HTTP connections is very easy and quick to set up. Since these connections are unencrypted, they can be inspected easily and blocked completely or partially. Nowadays, more and more websites (even those you would like to block) use HTTPS, i.e. an encrypted connection between the user’s browser and the web server. Thanks to Let’s Encrypt, anyone can now get a free certificate for their website. This is a good thing in itself, because it increases security and makes many attacks impossible or more difficult. However, it also makes filtering unwanted content more difficult.

This “problem” can be solved in two ways:

1. Man-in-the-middle attack

One way is a deliberate man-in-the-middle attack. The proxy server decrypts the HTTPS connection and sets it up again. This allows it to inspect the traffic and filter it accordingly. Most providers of web filter solutions use this concept. The problem is that such deep interference with the HTTPS connection means the security HTTPS actually provides is no longer guaranteed. If the proxy server’s certificate is trusted, a user can hardly tell the difference, but this sense of security is deceptive. Even though this is the only way to achieve true content filtering, the solution is risky (the implementation is not trivial) and, depending on the country, incompatible with the applicable laws (keyword: data protection and privacy). This route is therefore not recommended, for both security and ethical reasons.

2. URL filter via SNI

Another possibility is filtering via SNI (Server Name Indication). At the start of the TLS handshake, before the certificate is exchanged and the encrypted connection is established, the browser sends the domain name (FQDN) of the server it wants to reach. This part is not yet encrypted, so a (transparent) proxy can read it and use it for filtering. The following figure illustrates the TLS handshake.

TLS Handshake

You can easily see that the SNI is sent before the key exchange and before the actual secure connection is set up. We take advantage of this: in addition to the web filter for HTTP connections, we can also set up a URL filter for HTTPS connections without breaking HTTPS with a man-in-the-middle attack.
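
To make the idea concrete, here is a minimal Python sketch (it is not part of Squid or pfSense). It generates a real ClientHello in memory with the standard ssl module, without any network traffic, and then reads the server name straight out of the unencrypted bytes. The function names client_hello_bytes and extract_sni are made up for this illustration, and the parser assumes that the whole ClientHello fits into a single TLS record.

```python
import ssl
import struct
from typing import Optional


def client_hello_bytes(hostname: str) -> bytes:
    """Produce the first handshake message a browser would send (in memory only)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the handshake pauses, waiting for the server's reply
    return outgoing.read()


def extract_sni(record: bytes) -> Optional[str]:
    """Return the server name from a ClientHello record, or None if absent."""
    # Record header: type(1) version(2) length(2); handshake header: type(1) length(3)
    if len(record) < 6 or record[0] != 0x16 or record[5] != 0x01:
        return None
    pos = 5 + 4                 # skip record header and handshake header
    pos += 2 + 32               # client version + random
    pos += 1 + record[pos]      # session id (length-prefixed)
    (suites_len,) = struct.unpack_from("!H", record, pos)
    pos += 2 + suites_len       # cipher suites
    pos += 1 + record[pos]      # compression methods
    (ext_total,) = struct.unpack_from("!H", record, pos)
    pos += 2
    end = pos + ext_total
    while pos + 4 <= end:
        ext_type, ext_len = struct.unpack_from("!HH", record, pos)
        pos += 4
        if ext_type == 0:       # extension 0 = server_name
            # list length(2), name type(1), name length(2), then the name itself
            (name_len,) = struct.unpack_from("!H", record, pos + 3)
            return record[pos + 5:pos + 5 + name_len].decode("ascii")
        pos += ext_len
    return None
```

Calling extract_sni(client_hello_bytes("www.example.com")) returns the host name even though no key has been exchanged yet, which is exactly the information a transparent proxy can use for URL filtering.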

Safe-Search for search engines

Create firewall rules for DNS

Since we can’t look into an HTTPS connection, unwanted images and videos may appear in a Google search, for example. Google and other search engines therefore offer a safe mode (Safe-Search), and we want to enforce it.

First we have to activate the DNS resolver in pfSense (under Services → DNS Resolver) and then save and apply the changes.

DNS Resolver

So that the computers in the network use the DNS server of the firewall, we need a rule that redirects all DNS requests addressed to other servers to the firewall. To do this, we create a new rule under Firewall → NAT in the Port Forward tab by clicking one of the two add buttons. We enter the following:

  1. Interface: LAN
  2. Protocol: TCP/UDP
  3. Destination: Any
  4. Destination Port Range: DNS (53)
  5. Redirect Target IP: 127.0.0.1
  6. Redirect Target Port: DNS (53)
  7. Description: Can be freely selected

NAT Redirect

Now we have to make sure that our newly created firewall rule is in the right place. It must be above the default rule “Default allow LAN to any rule”! To do this, we open the firewall rules under Firewall → Rules and move the rule up. Then we click Save and Apply to apply the changes.

LAN Rules

Host Overrides for Bing and YouTube

Next, we’ll create some DNS entries to make sure that the safe mode is used for both Bing and YouTube. To do this, we open the DNS Resolver again under Services → DNS Resolver and add the following entries in the Host Overrides section further down.

Bing:

  • Host: www
  • Domain: bing.com
  • IP Address: 204.79.197.220
  • Description: Bing
  • Then save with Save

Then the entry for YouTube:

  • Host: www
  • Domain: youtube.com
  • IP Address: 216.239.38.120
  • Description: Youtube
  • Save again with Save

Host Overrides

Now apply the changes again with Apply.
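
One way to check both the NAT redirect and the host overrides from a LAN client is to send a DNS query to some outside resolver and look at the answer. If the redirect works, the firewall answers instead and returns the override address. The following Python sketch does this with the standard library only. The function names are made up for this illustration, the outside resolver is an arbitrary choice, and the parser only handles the simple A-record case.

```python
import socket
import struct
from typing import List


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for the A record of `name`."""
    # Flags 0x0100 = recursion desired; exactly one question, no other sections.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(data: bytes, pos: int) -> int:
    """Return the offset just past the (possibly compressed) name at `pos`."""
    while True:
        length = data[pos]
        if length == 0:
            return pos + 1
        if (length & 0xC0) == 0xC0:  # compression pointer: two bytes, ends the name
            return pos + 2
        pos += 1 + length


def parse_a_records(response: bytes) -> List[str]:
    """Extract all IPv4 addresses from the answer section of a DNS response."""
    qdcount, ancount = struct.unpack_from("!HH", response, 4)
    pos = 12
    for _ in range(qdcount):
        pos = _skip_name(response, pos) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        pos = _skip_name(response, pos)
        rtype, _rclass, _ttl, rdlength = struct.unpack_from("!HHIH", response, pos)
        pos += 10
        if rtype == 1 and rdlength == 4:
            addresses.append(socket.inet_ntoa(response[pos:pos + 4]))
        pos += rdlength
    return addresses


def resolve_via(server: str, name: str, timeout: float = 3.0) -> List[str]:
    """Ask `server` directly (UDP port 53) for the A records of `name`."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(4096)
    return parse_a_records(response)
```

On a LAN client, resolve_via("8.8.8.8", "www.bing.com") should then contain 204.79.197.220, and the same call for www.youtube.com should contain 216.239.38.120. If other addresses come back, the query reached the outside resolver and the redirect rule is not taking effect.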

Host Overrides for Google

Google uses a lot of different domains, and entering them all manually would take quite a long time. That’s why we take a different approach for Google. First, we need to log in to pfSense via SSH (or connect a screen and keyboard if pfSense is installed on a computer with a graphics card). SSH must first be enabled in the web interface under System → Advanced, in the Secure Shell section.

Enable SSH

Now we can log in to the firewall via SSH, using the IP address of our own pfSense installation.

In the following menu we select “8” (Shell).

pfSense Shell

Now we create a file in which we will later enter our DNS entries for Google. We do that by opening the new file /var/unbound/google.conf in the vi editor (vi /var/unbound/google.conf).

To exit the editor, we need to enter :wq (the colon is important!). That’s all we need to do on the command line.

We can now edit the newly created file using Diagnostics → Edit File. To do this, enter the path /var/unbound/google.conf and press Load.

Edit File

Now we copy our DNS entries for the Google domains into the file.

With a click on Save we save the file.
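
Each entry in the file is an Unbound local-data line that maps one Google search domain to Google’s Safe-Search address. As a rough sketch, such lines could be generated with a few lines of Python. Two things here are assumptions and not taken from the steps above: that the Safe-Search address is 216.239.38.120 (the address behind forcesafesearch.google.com, which is also the one used for YouTube above), and the short sample list of domains, which is far from complete.

```python
from typing import Iterable

SAFESEARCH_IP = "216.239.38.120"  # assumption: address of forcesafesearch.google.com

# Assumption: a tiny sample; Google serves search from many more country domains.
SAMPLE_DOMAINS = ["www.google.com", "www.google.de", "www.google.co.uk"]


def unbound_entries(domains: Iterable[str], ip: str = SAFESEARCH_IP) -> str:
    """Return one Unbound local-data line per domain, pointing it at `ip`."""
    return "\n".join(f'local-data: "{d} A {ip}"' for d in domains) + "\n"


print(unbound_entries(SAMPLE_DOMAINS), end="")
```

The printed lines can be pasted into the file as they are; the list of domains should be extended to cover every Google domain that is reachable from the network.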

The last step is to tell our DNS server where to find these DNS records. To do this, open the DNS server settings under Services → DNS Resolver and click on Display Custom Options. There we insert the following two lines, save them with Save and apply the changes with Apply:

server:
include: /var/unbound/google.conf

DNS Custom Options

Our search engines are configured. The next step is to set up the content filter for HTTP and the URL filter for HTTPS.

Squid Proxy and SquidGuard

Installation

To enable pfSense to filter the URLs, we need a proxy server through which all requests from our network are routed. For this we use Squid. As the name suggests, SquidGuard is the actual filter. Under System → Package Manager in the Available Packages tab we install Squid and SquidGuard.

Package Manager

Setting Up Transparent Proxy for HTTP

Under Services → Squid Proxy Server we now set up the transparent proxy for HTTP. A transparent proxy has the advantage that we do not have to configure any settings on the individual computers in our network. In the General tab we activate the following items:

  1. Enable Squid Proxy ✔
  2. Proxy Interface(s): LAN
  3. Allow users on interface ✔
  4. Transparent HTTP Proxy ✔
  5. Transparent Proxy Interface(s): LAN

Squid Proxy General

After saving with Save, we specify in the Local Cache tab how much disk space should be used for the cache (here 500 MB):

Squid Cache

The settings have to be saved again with Save. The transparent proxy for HTTP connections is now set up.

Configuring SquidGuard

SquidGuard is the component responsible for filtering the content. It examines each request and decides whether to block it. For this we use a blacklist, which we configure later. First, we define some general settings under Services → SquidGuard Proxy Filter.

  1. Enable ✔
  2. (not shown in the screenshot)
  3. Enable Log ✔
  4. Enable log rotation ✔
  5. Enable Blacklist ✔
  6. Blacklist URL: http://www.shallalist.de/Downloads/shallalist.tar.gz

Squid Proxy Filter General

Below we save everything again with Save.

With SquidGuard we have to keep in mind that configuration changes only take effect after we have clicked Save and then Apply (at the top of the General Settings tab)!

Setting up blacklists and whitelists

Now that we are done with the basic settings, only the blacklists and whitelists are still missing. The URL for the blacklist has already been entered. Now we have to download the blacklist in the “Blacklist” tab.

Blacklist

To make sure that our filter works, we now define several target categories. To do so, open the “Target Categories” tab and click on Add. We create a whitelist of all domain names we explicitly allow. These would be, e.g., all Google domains, because we will block all other search engines to prevent users from bypassing the Safe-Search feature set up above.

We will enter the following:

  1. Name: Whitelist
  2. Domain List:
  3. Description: Whitelist
  4. Save with Save.

whitelist

The last step for the time being is to establish some rules. We do this in the Common ACL tab. Then click on the “+” sign in “Target Rules List” to open a list of the different rule sets. There are now different categories and our whitelist appears here. We now make the following settings:

  • Whitelist: access whitelist
  • Default access [all]: access allow

Common ACL

The other categories can be set as required. Here are some examples:

  • Block advertising: [blk_BL_adv] access deny
  • Block pornography: [blk_BL_porn] access deny
  • etc.
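
The order of these rules matters: the targets are checked one after another, and the first match decides. The following Python sketch is a simplified model of that behaviour, not SquidGuard’s actual code. The category contents and the domain names are invented for the example; only the category names are taken from the list above.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical category contents; the real lists come from the downloaded blacklist.
CATEGORIES: Dict[str, Set[str]] = {
    "Whitelist": {"google.com"},
    "blk_BL_adv": {"ads.example"},
    "blk_BL_porn": {"adult.example"},
}

# (category, action) pairs in evaluation order; "all" is the default rule.
RULES: List[Tuple[str, str]] = [
    ("Whitelist", "allow"),
    ("blk_BL_adv", "deny"),
    ("blk_BL_porn", "deny"),
    ("all", "allow"),
]


def in_category(host: str, domains: Set[str]) -> bool:
    """True if host equals a listed domain or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def decide(host: str) -> str:
    """Return the action of the first rule whose category matches host."""
    for category, action in RULES:
        if category == "all" or in_category(host, CATEGORIES[category]):
            return action
    return "deny"  # not reached while an "all" rule is present
```

In this model, decide("www.google.com") is allowed by the whitelist before any blacklist is consulted, decide("cdn.adult.example") is denied, and everything that matches no category falls through to the default rule.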

To prevent a user from bypassing our URL filter by entering the IP address of a page, we also enable Do not allow IP addresses in URL. If this setting causes problems, you should deactivate it again.

Do not allow IP addresses
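
What this option looks for can be illustrated with a short Python sketch: it tests whether the host part of a URL is a literal IP address rather than a name. This is only an illustration of the idea using the standard library. How SquidGuard implements the check internally is not shown here, and the function name host_is_ip is made up.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(url: str) -> bool:
    """True if the URL's host is a literal IPv4 or IPv6 address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False  # not parseable as an address, so it is a host name
    return True
```

A request such as http://192.0.2.10/index.html would be rejected by a rule like this, while http://www.example.com/ would be passed on to the normal category checks.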

Then we save with Save, switch to the General Settings tab and press Apply again to apply our changes.

Test Setup

Everything is set up for HTTP connections and we can test the setup. Nothing else needs to be set up on a computer in the LAN. The filter should already work. If we visit a page that appears in one of our blacklists, this page will appear:

Blocked sites

Transparent proxy for HTTPS connections

Up to now, the transparent proxy is only active for HTTP, i.e. unencrypted requests. At the beginning of this article I pointed out the difficulties of filtering encrypted (HTTPS) connections. In our case, we will activate a transparent proxy for HTTPS, which lets us apply a URL filter to all requests on port 443 (HTTPS). The disadvantage is that we cannot (and don’t want to!) analyze the content, and we cannot show a friendly error page. Instead, the browser will display a certificate error message. But more on this soon.

First we activate the transparent proxy for HTTPS. To do this, open the proxy settings under Services → Squid Proxy Server and select the following settings in the SSL Man in the Middle Filtering section:

  • HTTPS / SSL Interception ✔
  • SSL/MITM Method: Splice All
  • SSL Intercept Interfaces: LAN
  • CA: Select a Certificate Authority certificate. We may have to create one first (under System → Cert. Manager).
  • Save all with Save.

SSL MITM

Now everything is set up and we can also test HTTPS connections. As already written, this time we don’t get an informative error message like for HTTP connections, but a warning from the browser:

Cert Error

Even though this error message is not very meaningful, we have achieved our real goal of blocking unwanted pages.

Conclusion

We have now set up a system that filters the web traffic (HTTP and HTTPS) in our LAN (or WLAN) and blocks the pages that have been defined using the blacklists.

Opinions differ on the pros and cons of such blocking. In any case, it is a problem that cannot and should not be solved by technical means alone, since it is rather a question of educating (young) people to deal responsibly with the medium “Internet”. Filtering alone is certainly not the right way to achieve this goal. Some also view it critically that children and young people become “accustomed” to censorship and filtering.

On the other hand, it is especially helpful in schools, libraries or at home if you can limit the amount of inappropriate content. Some countries also prescribe such a filter by law!

(Source: https://forum.pfsense.org/index.php?topic=112335.0)




Stephan


I'm a teacher and IT system administrator at an international school. I love open source software and have used it for over a decade in my private and work life. My passion is solving problems with open source software!

18 Comments

craig · March 31, 2018 at 12:07 am

Hi, Thanks for your good work. We went through your process and had great success! However, we are not able to block google mail mail services. Any insight?

    Stephan · April 5, 2018 at 11:42 am

    @Craig: I think its because all the Google Domains are in the white list. Do you want to block the domain or also the Google Mail App on smartphones?

Martin · April 19, 2018 at 1:09 am

Excellent!

Greta · April 19, 2018 at 7:33 pm

Hi Steph, i found this tutorial very interesting, it reminded me of a special methodology used by our company named La Rache : https://www.la-rache.com/

Basil · April 20, 2018 at 6:41 pm

Hello
It is really one of the most benifical tutorial I have ever read
I will deploy this hope to success like you.

Basil · April 20, 2018 at 6:46 pm

But does all https traffic being blocked 🙁 ? even the needed sites like facebook and hotmail etc..

    Stephan · April 21, 2018 at 7:23 am

    @Basil: No, just the ones, that are in the blacklist.

vinay · April 25, 2018 at 11:19 am

really awesome work. I really appreciate your efforts. Please do updating. i want to follow you. Kindly provide your FB or twitter ID..

    Stephan · April 25, 2018 at 2:28 pm

    Check the sidebar.

Lloyd Sommerer · May 3, 2018 at 4:57 am

Thank you. This has been most helpful in setting up safe search at my school. One note, there is an extra space after the period that might throw someone off:
server:
include: /var/unbound/google. conf

    Stephan · May 5, 2018 at 7:08 am

    Thanks! I’ll fix it.

Seth · May 3, 2018 at 4:37 pm

Hi Stephan! I tried this setup but other https websites are blocked why is that though when I enable SSL Filtering on Squid Proxy Server. How you can help!

    Stephan · May 5, 2018 at 7:07 am

    Maybe the other https Websites are in the blacklist? You can whitelist them like the google domains.

      Adam · May 14, 2018 at 9:28 pm

      Much like Seth, all https traffic appears to be blocked in this configuration for me as well. I have my sites whitelisted but to no avail in https. It works fine with http though. Any ideas?

      With that being said, My state’s laws says schools MUST filter traffic in schools. Furthermore, the school owns all traffic in the network as it is guided by a legal AUP. I am not sure how other states do this but it is legal to do the conscious MIM attack for our purposes. I do side where conscious MIM attacks could be a security breach, keeping kids safe is also an important role as well. My school already has a commercial system that does this in fact. While I am not trying to open a debate on this at all, I am merely trying to lockdown my students internet during testing times to curb the possibility of cheating. We use Cisco Netacad for this which is on amazon AWS. There are many URL’s and writing a simple router ACL would be a pain due to the complexity of our setup. Any input/guides on the Conscious MIM setup?

Klaus · May 10, 2018 at 10:18 am

Hi, i filter the https all right but when i try access to youtube enter but filter the content and i not filter safesearch, i create user exclude of the filter in the proxy but same i can’t access to the content on youtube.

Aaron Wagner · May 12, 2018 at 6:53 am

We used WPAD to configure the proxy settings on the PCs. This allowed us to filter https sites without MITM.

The http://www.shallalist.de site seems to be down right now.

Keith Winn · May 14, 2018 at 7:24 am

You are the man!!! This is awesome. Just did this to my firewall at my school and I love it. It will save me a lot of money as I was looking at several DNS services to solve this problem. This will save me some $$$ and be one less extra service that I have to manage. By the way, I like your other posts too. Keep up the good work.
It did not work the first time, I had to restart all the proxy services, but after that it is working like a charm. Thanks again.
