
Re: Wiki spam and the future of phpwiki: msg#00077

Subject: Re: Wiki spam and the future of phpwiki
I tried to send this, and Sourceforge bounced it. Now, a week later, let me try again.

Dan

YOD,

I took 90% of the code from editpage.php in CVS current of PhpWiki, thanks to Reini. The other 10% was two changes:

1. Modify the end-user error message to say "Sorry, too many links (more than ##)" instead of "conflicts".
2. Add a configurable parameter SPAM_MAX_EXTERNAL_LINKS.
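
For readers who only want the idea, the two changes above can be sketched as a stand-alone check. This is a minimal sketch and not the patch itself: the function names count_external_links() and check_link_limit() are made up for illustration, and only SPAM_MAX_EXTERNAL_LINKS and the wording of the message come from this mail. Note that the sketch compares with ">" so that the behaviour matches the "more than" wording, whereas the patch as posted compares with ">=".

```php
<?php
// Hypothetical stand-alone sketch; only the constant name and the message
// wording are taken from the mail. Default limit of 20 as in the patch.
if (!defined('SPAM_MAX_EXTERNAL_LINKS')) {
    define('SPAM_MAX_EXTERNAL_LINKS', 20);
}

// Count occurrences of the literal string "http://" in the wikitext.
function count_external_links($text) {
    return substr_count($text, 'http://');
}

// Return an error message if the edit adds more than $max new external
// links, or null if the edit is acceptable. Only links added by this edit
// count, so pages that already hold many links can still be edited.
function check_link_limit($oldtext, $newtext, $max = SPAM_MAX_EXTERNAL_LINKS) {
    $added = count_external_links($newtext) - count_external_links($oldtext);
    if ($added > $max) {
        return sprintf('Sorry, too many links (more than %d)', $max);
    }
    return null;
}
```

With the default limit, check_link_limit('', str_repeat('http://spam.example ', 21)) returns the error message, while an edit that adds 20 or fewer links returns null.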

Below is an approximate patch to editpage.php. Since our code has diverged from PhpWiki by almost a year, I cannot guarantee that the patch is entirely accurate.

Also, I sent this patch to this list a while ago. Perhaps I should also change it in CVS current?

Dan


(YOD) wrote:

On Wed, 30 Mar 2005, Dan Frankowski wrote:
1. Don't allow saving a page that has more than 20 external ("http://") links. In our code, I changed the "20" to a configurable parameter, SPAM_MAX_EXTERNAL_LINKS. We've been completely spammed as well, and I believe this will help us a lot. We have a wiki where each legitimate page contains only a few external links, but spam pages contain tons of them (more than 50 for sure).

I'd like to implement this modification. Would you be kind enough to send it to me?
I have three working PhpWikis, but on two of them (with a progressive political orientation) I have left only blogging open; all other editing is now turned off due to excessive spamming. I just installed a third Wiki to play with and haven't decided yet what to do with it.

I am just waiting to see how long it takes for spam to find the new open Wiki and I suspect it won't be long (g). I also want to explore other methods for securing Wiki from spambots (intentional spammers are much easier to deal with) so I have been following these discussions with much interest.

Hank Roth






_______________________________________________
Phpwiki-talk mailing list
Phpwiki-talk@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/phpwiki-talk

Index: editpage.php
===================================================================
RCS file: .../phpwiki/lib/editpage.php,v
retrieving revision 1.16
retrieving revision 1.17
diff -b -u -r1.16 -r1.17
--- editpage.php        14 Feb 2005 05:28:29 -0000      1.16
+++ editpage.php        15 Mar 2005 17:59:02 -0000      1.17
@@ -1,5 +1,5 @@
<?php
-rcs_id('$Id: editpage.php,v 1.16 2005/02/14 05:28:29 syilek Exp $');
+rcs_id('$Id: editpage.php,v 1.17 2005/03/15 17:59:02 dfrankow Exp $');

require_once('lib/Template.php');

@@ -134,10 +134,13 @@
            // output, and update the version
            $this->_content = implode ("\n", $output);
            $this->_currentVersion = $this->current->getVersion();
-            $this->version = $this->_currentVersion;
            $unresolved = $diff->ConflictingBlocks;
+            if ($this->version != $this->_currentVersion) {
+                // saveFailed because of conflicting edits
+                $this->version = $this->_currentVersion;
            $tokens['CONCURRENT_UPDATE_MESSAGE'] = $this->getConflictMessage($unresolved);
        }
+        }

        if ($this->editaction == 'preview')
            $tokens['PREVIEW_CONTENT'] = $this->getPreview(); // FIXME: convert to _MESSAGE?
@@ -288,6 +291,20 @@
            return true;
        }

+        if ($this->isSpam()) {
+            return false;
+            /*
+            // Save failed. No changes made.
+            $this->_redirectToBrowsePage();
+            // user will probably not see the rest of this...
+            include_once('lib/display.php');
+            // force browse of current version:
+            $request->setArg('version', false);
+            displayPage($request, 'nochanges');
+            return true;
+            */
+        }
+
        $page = &$this->page;

        // Include any meta-data from original page version which
@@ -375,6 +392,70 @@
        return $this->_content == $current->getPackedContent();
    }

+    /**
+     * Handle AntiSpam here. How? http://wikiblacklist.blogspot.com/
+     * Need to check dynamically some blacklist wikipage settings
+     * (plugin WikiAccessRestrictions) and some static blacklist.
+     * DONE:
+     *   More than 20 new external links
+     *   content patterns by babycart (only php >= 4.3 for now)
+     * TODO:
+     *   IP blacklist
+     *   domain blacklist
+     *   url patterns
+     */
+    function isSpam () {
+        $current = &$this->current;
+        $request = &$this->request;
+
+        $oldtext = $current->getPackedContent();
+        $newtext =& $this->_content;
+        // 1. Not more than 20 new external links
+        if (!defined('SPAM_MAX_EXTERNAL_LINKS')) define('SPAM_MAX_EXTERNAL_LINKS', 20);
+        if ($this->numLinks($newtext) - $this->numLinks($oldtext) >= SPAM_MAX_EXTERNAL_LINKS) {
+            // mail the admin?
+            $this->tokens['PAGE_LOCKED_MESSAGE'] =
+                HTML($this->getSpamMessage(),
+                     HTML::p(HTML::em(sprintf(_("Too many external links (more than %d)."), SPAM_MAX_EXTERNAL_LINKS))));
+            return true;
+        }
+        // 2. external babycart (SpamAssassin) check
+        // This will probably prevent from discussing --- or ---- related topics. So beware. (---s for SourceForge email scanner.)
+        if (defined('ENABLE_SPAMASSASSIN') && ENABLE_SPAMASSASSIN) {
+            $user = $request->getUser();
+            include_once("lib/spam_babycart.php");
+            if ($babycart = check_babycart($newtext, $request->get("REMOTE_ADDR"),
+                                           $user->getId())) {
+                // mail the admin?
+                if (is_array($babycart))
+                    $this->tokens['PAGE_LOCKED_MESSAGE'] =
+                        HTML($this->getSpamMessage(),
+                             HTML::p(HTML::em(_("SpamAssassin reports: "),
+                                              join("\n", $babycart))));
+                return true;
+            }
+        }
+        return false;
+    }
+
+    /** Number of external links in the wikitext
+     */
+    function numLinks(&$text) {
+        return substr_count($text, "http://");
+    }
+
+    /** Header of the Anti Spam message
+     */
+    function getSpamMessage () {
+        return
+            HTML(HTML::h2(_("Spam Prevention")),
+                 HTML::p(_("This page edit seems to contain spam and was therefore not saved."),
+                         HTML::br(),
+                         _("Sorry for the inconvenience.")),
+                 HTML::p(""));
+    }
+
    function getPreview () {
        include_once('lib/PageType.php');
        $this->_content = $this->getContent();
@@ -691,6 +772,10 @@

/**
 $Log: editpage.php,v $
+ Revision 1.17  2005/03/15 17:59:02  dfrankow
+ Check in an anti-spam guard from PhpWiki CVS current: more than 20
+ external (http://) links means you can't save.
+ 
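
One thing to keep in mind about numLinks() in the patch above: it counts only the literal string "http://", so links written as "https://" or with an upper-case scheme are not counted. If that matters on your wiki, a counter along the following lines might be a starting point. This is only a sketch under that assumption: the function name count_external_links_any_scheme() is hypothetical and is not part of PhpWiki or of this patch.

```php
<?php
// Hypothetical variant of numLinks(); not part of the patch above.
// Counts "http://" and "https://" prefixes, ignoring case.
function count_external_links_any_scheme($text) {
    // preg_match_all() returns the number of full pattern matches,
    // or false if the pattern could not be applied.
    $n = preg_match_all('~https?://~i', $text, $matches);
    return $n === false ? 0 : $n;
}
```

It takes the wikitext as a string, so it could be called the same way the patch calls numLinks(), once on the old text and once on the new text.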
