Server — Mail — DCC
The previous two articles covered Razor and Pyzor: collaborative spam detection tools that check whether a message’s signature matches known spam in a shared database. DCC, the Distributed Checksum Clearinghouse, takes a meaningfully different approach. Understanding the distinction before installing anything makes the configuration decisions sensible rather than arbitrary.
How DCC differs from Razor and Pyzor
Razor and Pyzor ask: has this specific message been reported as spam? They compare a message’s hash against a database of known-bad messages. A hit means this exact content, or something very close to it, has been identified as spam by someone in the network.
DCC asks a different question: how many times has this message been seen? It computes checksums of the message, including fuzzy checksums that ignore small variations in the body, and sends them to a DCC server, which returns how many times each checksum has been reported. It does not care whether those reports came from spam or ham. It simply counts.
The insight behind DCC is that most spam is bulk: the same message, or a slight variation of it, sent to enormous numbers of recipients at once. A personal email has a low count because it reaches one recipient or a handful, no matter how much personal mail you receive. A spam campaign sending a million copies of the same message has a very high count because its checksums are reported from all over the internet within a short window.
DCC flags mail that is bulk, not mail that is known-bad. Those two categories overlap heavily for spam, but they are not the same thing. This is why DCC is particularly effective against new spam campaigns: the moment a bulk send starts, the counts rise rapidly across the DCC network, regardless of whether anyone has explicitly reported those messages as spam. Razor and Pyzor need a report; DCC needs only volume.
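The counting idea can be sketched in a few lines of Python. This is a toy model, not DCC's actual algorithm: the fuzzy_checksum function and the ToyClearinghouse class are hypothetical names invented for illustration, and the normalisation (dropping digits, case, and spacing) is only a stand-in for DCC's real fuzzy checksums.

```python
import hashlib
import re
from collections import Counter


def fuzzy_checksum(body: str) -> str:
    """Toy stand-in for a fuzzy checksum: ignore case, digits, and spacing."""
    normalised = re.sub(r"[\d\s]+", " ", body.lower()).strip()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


class ToyClearinghouse:
    """Counts how often each checksum is reported; never labels spam or ham."""

    def __init__(self) -> None:
        self.counts = Counter()

    def report(self, body: str) -> int:
        """Record one sighting and return the running total for this checksum."""
        checksum = fuzzy_checksum(body)
        self.counts[checksum] += 1
        return self.counts[checksum]


clearinghouse = ToyClearinghouse()

# A campaign: the same text with a per-recipient reference number.
for recipient_id in range(1000):
    bulk_count = clearinghouse.report(f"Cheap pills, order ref {recipient_id}")

# A personal message that was only ever sent once.
personal_count = clearinghouse.report("Lunch on Thursday at noon?")

print(bulk_count, personal_count)  # 1000 1
```

The clearinghouse never sees a spam report. The campaign stands out purely because the same normalised content keeps arriving.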
The flip side is that legitimate bulk mail — newsletters, mailing lists, notifications sent to many recipients — also generates high DCC counts. DCC without whitelisting will flag mailing lists as bulk mail, because they are. The whitelist configuration is not optional for a usable setup.
Licence
DCC is not open source in the conventional sense. The client software is freely available and free to use for personal and non-commercial purposes, but the licence restricts commercial use and redistribution. This is why SpamAssassin ships with the DCC plugin commented out in v310.pre with an explicit note that it is not open source.
For a personal homelab mail server, the licence terms are not a concern. For any deployment handling mail for a business or a significant number of users, the licence should be read carefully before deploying. The full text is at https://www.dcc-servers.net/dcc/LICENSE.
Installation
DCC is not in Ubuntu’s standard repository. It must be built from source. This is more involved than Razor or Pyzor but not difficult.
Install the build dependencies:
sudo apt install build-essential wget
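Download the latest source from the DCC website. The dcc.tar.Z tarball at the URL below always contains the current release, so there is no version number to substitute; the version shows up only in the name of the extracted directory. The current version is listed at https://www.dcc-servers.net/dcc/ if you want to check it: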
cd /tmp
wget https://www.dcc-servers.net/dcc/source/dcc.tar.Z
tar xf dcc.tar.Z
cd dcc-*
Configure and build:
./configure
make
sudo make install
The default installation places the DCC binaries in /usr/local/bin and the DCC home directory in /var/dcc. Confirm the installation completed:
ls /var/dcc
/usr/local/bin/dccproc -V
The /var/dcc directory should contain configuration files and the DCC map. The dccproc binary is what SpamAssassin calls to perform DCC checks.
Choosing between dccproc and dccifd
DCC provides two interfaces for SpamAssassin: dccproc, which forks a new process for each message, and dccifd, a persistent daemon that handles checks more efficiently. For a low-volume personal mail server, dccproc is simpler to configure and adequate. For a higher-volume server, dccifd is preferable.
This article uses dccproc for simplicity. The SpamAssassin configuration points directly at the binary.
Firewall
DCC uses UDP port 6277 to communicate with its servers. February’s UFW configuration allows all outbound connections by default. If outbound UDP is restricted, allow it explicitly:
sudo ufw allow out 6277/udp
Checking DCC server connectivity
Confirm DCC can reach its servers before configuring SpamAssassin:
/usr/local/bin/cdcc info
This queries the DCC server map and displays the available servers and their response times. You should see a list of servers with millisecond response times. If all servers show timeouts, check the firewall and try again after a minute; DCC servers occasionally have brief unavailability.
Whitelisting legitimate bulk mail
Before enabling DCC, configure a whitelist. Without it, any mailing list, newsletter, or bulk notification you legitimately receive will score as spam.
The whitelist file lives at /var/dcc/whiteclnt. Each line whitelists a sender, recipient, or other attribute. The format is documented in the dcc man page; for personal use the most practical entries are whitelisting by envelope sender or recipient.
To whitelist all mail addressed to a specific recipient (useful if you subscribe to mailing lists at a specific address):
ok env_to you+lists@yourdomain.com
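To whitelist a specific envelope sender (entries are matched by checksum of the exact value, so wildcards such as *@domain do not work):
ok env_from bounces@newsletter.example.com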
To whitelist a mailing list by its List-ID header:
ok substitute list-id <listname.lists.example.com>
The whitelist is read by dccproc on each invocation. Changes take effect immediately without restarting or rebuilding anything.
Do not run cdcc "new map" after editing the whitelist. That command creates a new, empty server map file and has nothing to do with whiteclnt.
Enabling DCC in SpamAssassin
Uncomment the plugin
Find and uncomment the DCC plugin in /etc/spamassassin/v310.pre:
# Before:
# loadplugin Mail::SpamAssassin::Plugin::DCC
# After:
loadplugin Mail::SpamAssassin::Plugin::DCC
Configure the plugin
Add DCC configuration to /etc/spamassassin/local.cf:
# DCC configuration
use_dcc 1
dcc_home /var/dcc
dcc_path /usr/local/bin/dccproc
dcc_timeout 10
dcc_body_max 999999
dcc_fuz1_max 999999
dcc_fuz2_max 999999
# DCC scoring
score DCC_CHECK 4.0
add_header all DCC _DCCB_: _DCCR_
dcc_home points to the DCC home directory where server maps and configuration are stored.
dcc_path points to the dccproc binary. Confirm this matches the actual installation path with which dccproc.
dcc_timeout 10 sets the maximum wait time for a DCC response. Ten seconds is generous but prevents DCC from blocking mail processing if the servers are slow.
dcc_body_max, dcc_fuz1_max, dcc_fuz2_max set the count thresholds at which DCC triggers for the body checksum and two fuzzy checksum variants. 999999 is the number SpamAssassin uses to represent DCC's MANY count, so at this setting the check fires only when the server reports the count as many. Starting with MANY is conservative: it catches only unambiguous bulk mail and avoids false positives while you tune the whitelist.
Once the whitelist is established and you have confirmed no false positives, lower these thresholds to catch more spam. Values of 100 to 1000 are common for established servers.
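The effect of lowering the thresholds can be illustrated with a simplified Python model. This is a sketch of the comparison only, under the assumption that a hit means any one count reaches its threshold. The function names, the tuned value of 500, and the sample counts are all hypothetical; this is not the plugin's code.

```python
MANY = 999999  # the value used for "many" in the local.cf above


def parse_count(raw: str) -> int:
    """Convert a count token to an integer, treating 'many' as MANY."""
    return MANY if raw.lower() == "many" else int(raw)


def dcc_hit(counts: dict, thresholds: dict) -> bool:
    """Return True if any checksum count reaches its configured threshold."""
    return any(
        parse_count(counts.get(name, "0")) >= limit
        for name, limit in thresholds.items()
    )


# Conservative starting point versus a hypothetical tuned configuration.
conservative = {"body": MANY, "fuz1": MANY, "fuz2": MANY}
tuned = {"body": 500, "fuz1": 500, "fuz2": 500}

# Hypothetical counts for a mid-sized newsletter and a large spam campaign.
newsletter = {"body": "3", "fuz1": "742", "fuz2": "120"}
campaign = {"body": "many", "fuz1": "many", "fuz2": "many"}

print(dcc_hit(newsletter, conservative))  # False
print(dcc_hit(newsletter, tuned))         # True
print(dcc_hit(campaign, conservative))    # True
```

The newsletter only trips the tuned thresholds, which is why the whitelist needs to be in good shape before the numbers come down.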
score DCC_CHECK 4.0 adds 4 points to the SpamAssassin score on a DCC hit. This is higher than the Razor and Pyzor scores because DCC hits at the MANY threshold are essentially certain bulk mail. Combined with Razor and Pyzor scores, a message hitting all three collaborative checks will score well above any reasonable spam threshold.
add_header all DCC _DCCB_: _DCCR_ adds DCC result information to the message headers, which is useful for diagnosing why a message was or was not flagged.
Restarting SpamAssassin
sudo systemctl restart spamd
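Verify the DCC plugin loaded. A clean --lint run prints nothing, so enable the plugin debug channel to see the load message:
sudo -u spamd spamassassin --lint -D plugin 2>&1 | grep -i dcc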
Testing
Test DCC against SpamAssassin’s sample spam file:
sudo -u spamd spamassassin -t -D dcc < /usr/share/doc/spamassassin/sample-spam.txt 2>&1 | grep -i dcc
Watch for lines showing DCC connecting and returning a count:
dcc: DCC is available: /usr/local/bin/dccproc
dcc: got response: X-DCC-Rhyolite-Metrics: ... Body=many Fuz1=many Fuz2=many
A response of many for the body and fuzzy checksums confirms DCC is working and the sample spam is known to the network. A response of 1 means the message is not considered bulk at this moment, which can happen with the sample spam file depending on how recently it has been seen.
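When checking many messages, it can be easier to pull the counts out of the header than to read debug output. The following is a minimal sketch using Python's standard email module. It assumes the header name starts with X-DCC- and ends with -Metrics, as in the debug line above. The sample message, including the X-DCC-Example-Metrics name and the host in its value, is made up for illustration.

```python
import re
from email import message_from_string


def dcc_counts(raw_message: str) -> dict:
    """Return the Body/Fuz1/Fuz2 tokens from any X-DCC-*-Metrics header."""
    msg = message_from_string(raw_message)
    counts = {}
    for name, value in msg.items():
        lowered = name.lower()
        if lowered.startswith("x-dcc-") and lowered.endswith("-metrics"):
            # Each match is a (name, value) pair such as ("Fuz1", "many").
            counts.update(re.findall(r"\b(Body|Fuz1|Fuz2)=(\w+)", value))
    return counts


# Hypothetical message; a real header carries your DCC server's name.
SAMPLE = (
    "From: sender@example.com\n"
    "X-DCC-Example-Metrics: mx.example.com 1234; Body=1 Fuz1=many Fuz2=many\n"
    "Subject: test\n"
    "\n"
    "body text\n"
)

print(dcc_counts(SAMPLE))  # {'Body': '1', 'Fuz1': 'many', 'Fuz2': 'many'}
```

A message with no DCC header yields an empty dictionary, which is a quick way to spot mail that was never checked.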
To test with a real spam message from your mail log:
sudo -u spamd spamassassin -t -D dcc < /path/to/spam.eml 2>&1 | grep -E "dcc:|DCC"
The three together
With Razor, Pyzor, and DCC all active, SpamAssassin is running three complementary collaborative checks on each message:
Razor checks whether this message has been explicitly reported as spam by the Cloudmark network. Pyzor checks whether this hash appears in the public Pyzor database. DCC checks whether this message is being sent in bulk volume across the internet.
A message that hits all three is almost certainly spam. A message that hits none of them may still be caught by SpamAssassin’s heuristic rules or Bayesian filter. The three systems overlap in coverage but draw from different data sources and catch different categories of spam, which is why running all three is worth the setup effort.
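The way the three checks add up can be shown with simple arithmetic. In this sketch the DCC score is the one set in local.cf above and 5.0 is SpamAssassin's default required_score. The Razor and Pyzor scores are placeholders chosen for illustration, not the values from the earlier articles, so substitute your own.

```python
REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score

RULE_SCORES = {
    "DCC_CHECK": 4.0,     # from the local.cf in this article
    "RAZOR2_CHECK": 2.0,  # hypothetical placeholder
    "PYZOR_CHECK": 2.0,   # hypothetical placeholder
}


def total_score(hits: list) -> float:
    """Sum the scores of the collaborative rules that fired."""
    return sum(RULE_SCORES[rule] for rule in hits)


def is_spam(hits: list) -> bool:
    """A message is tagged once its total reaches the required score."""
    return total_score(hits) >= REQUIRED_SCORE


print(is_spam(["DCC_CHECK"]))                  # False: 4.0 alone is below 5.0
print(is_spam(["DCC_CHECK", "RAZOR2_CHECK"]))  # True: 6.0
print(total_score(list(RULE_SCORES)))          # 8.0
```

With these numbers a DCC hit alone does not tag a message. It needs agreement from another check or from the heuristic rules, which leaves some margin for an unwhitelisted newsletter.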
Ongoing maintenance
DCC updates itself automatically when the updatedcc script runs. Add a weekly cron job to keep the installation current:
sudo crontab -e
Add:
0 3 * * 0 /var/dcc/libexec/updatedcc
This runs the update script at 03:00 every Sunday. The script downloads, builds, and installs a new DCC version if one is available.