Recently I’ve been getting hammered with blog spam on www.JoeOn.net. My blog mirror blogs.msdn.com/joestagner runs on Community Server which does a better job of spam blocking but I don’t want to switch www.JoeOn.net to Community Server because it is a Commercial product (Sorry Rob) and I want to ability to code additions to meet my needs and make my code available to anyone that wants to use it. (Don’t get me wrong, community server ROCKS and I will be using it in a new On-Line community.)
I also really like DasBlog but stumbled when trying to migrate my blog content from .TEXT to DasBlog.
So I needed to find a way to solve the spam problem with my existing .TEXT setup, at least for the mean time.
A little research revealed this post by John Sample -THANKS John.
Original Post http://www.johnsample.com/articles/DotTextSpamBlock.aspx
Basically, it ads a table to the .TEXT SQL Server instance that is used to contain a list of “spam words”. If the post contains the spam word it is disallowed.
Where is how I’m using mine.
I’m primarily adding portions of the domain names that the spammers, who are mostly on line gambeling sites, are using. They are mostly free sites so this won’t be a problem with any valid perminant domains being referenced in a post.
Here is the SQL that creates the table.
IF NOT EXISTS (select * from dbo.sysobjects where id = object_id(N'[dbo].[FILTER_WORD]') and OBJECTPROPERTY(id, N'IsUserTable') = 1) BEGINCREATE TABLE [FILTER_WORD] ( [fw_ID] [int] IDENTITY (1, 1) NOT NULL , [fw_WORD] [varchar] (100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , CONSTRAINT [PK_FILTER_WORD] PRIMARY KEY CLUSTERED ( [fw_ID] ) ON [PRIMARY] ) ON [PRIMARY]ENDGO
Once the table is installed I just used Microsoft Query Analyzer to add the words to the FILTER_WORD table.
Next we I needed to alter the stored procedure – [blog_InsertEntry] as follows. [Download Here]
ALTER Proc blog_InsertEntry( @Title nvarchar(255), @TitleUrl nvarchar(255), @Text ntext, @SourceUrl nvarchar(200), @PostType int, @Author nvarchar(50), @Email nvarchar(50), @SourceName nvarchar(200), @Description nvarchar(500), @BlogID int, @DateAdded datetime, @ParentID int, @PostConfig int, @EntryName nvarchar(150), @ID int output)as
If(@EntryName is not null)Begin if exists(Select EntryName From blog_Content where BlogID = @BlogID and EntryName = @EntryName) Begin RAISERROR('The EntryName you entry is already in use with in this Blog. Please pick a unique EntryName.',11,1) RETURN 1 EndEndif(Ltrim(Rtrim(@Description)) = '')Begin set @Description = nullEnd
If (@PostType = 3 or @PostType = 4) Begin declare @temp_text varchar(8000)
set @temp_text = UPPER(isnull(CAST(@Text as varchar(7500)),'')) + UPPER(isnull(@SourceName,'')) + Upper(isnull(@TitleUrl,'')) + Upper(isnull(@Author,'')) declare @word_buffer varchar(100) declare filterCursor CURSOR LOCAL FAST_FORWARD for select UPPER(fw_WORD) from FILTER_WORD OPEN filterCursor FETCH NEXT FROM filterCursor INTO @word_buffer WHILE @@FETCH_STATUS = 0 BEGIN IF (CHARINDEX(@word_buffer,@temp_text) > 0) Begin CLOSE filterCursor DEALLOCATE filterCursor RAISERROR('Disallowed words or phrases detected. I you feel this message is in error, please contact blog owner.',11,1) RETURN 1 End FETCH NEXT FROM filterCursor INTO @word_buffer END CLOSE filterCursor DEALLOCATE filterCursor End
INSERT INTO blog_Content(Title, TitleUrl, [Text], SourceUrl, PostType, Author, Email, DateAdded,DateUpdated, SourceName, [Description], PostConfig, ParentID, BlogID, EntryName )VALUES (@Title, @TitleUrl, @Text, @SourceUrl, @PostType, @Author, @Email, @DateAdded, @DateAdded, @SourceName, @Description, @PostConfig, @ParentID, @BlogID, @EntryName)Select @ID = @@Identity
if(@PostType = 1 or @PostType = 2)Begin exec blog_UpdateConfigUpdateTime @blogID, @DateAddedEndElse if(@PostType = 3)Begin Update blog_Content Set FeedBackCount = FeedBackCount + 1 where [ID] = @ParentIDEnd
I also added the TrackBack filter. In addition to blocking based on words this update lets you add IP address to the filter file. To do this we need to modify the stored procedure [blog_InsertPingTrackEntry] as below. [Download Here]
ALTER Proc blog_InsertPingTrackEntry( @Title nvarchar(255), @TitleUrl nvarchar(255), @Text ntext, @SourceUrl nvarchar(200), @PostType int, @Author nvarchar(50), @Email nvarchar(50), @SourceName nvarchar(200), @Description nvarchar(500), @BlogID int, @DateAdded datetime, @ParentID int, @PostConfig int, @EntryName nvarchar(150), @ID int output)as
Set @ID = -1
if not exists (Select [ID] From blog_Content where TitleUrl = @TitleUrl and ParentID = @ParentID)Begin if(Ltrim(Rtrim(@Description)) = '') set @Description = null
declare @temp_text varchar(8000)
set @temp_text = UPPER(isnull(CAST(@Text as varchar(7500)),'')) + UPPER(isnull(@SourceName,'')) + Upper(isnull(@TitleUrl,'')) + Upper(isnull(@Author,'')) + Upper(isnull(@Title,'')) declare @word_buffer varchar(100) declare filterCursor CURSOR LOCAL FAST_FORWARD for select UPPER(fw_WORD) from FILTER_WORD OPEN filterCursor FETCH NEXT FROM filterCursor INTO @word_buffer WHILE @@FETCH_STATUS = 0 BEGIN IF (CHARINDEX(@word_buffer,@temp_text) > 0) Begin CLOSE filterCursor DEALLOCATE filterCursor --RAISERROR('Disallowed words or phrases detected. I you feel this message is in error, please contact blog owner.',11,1) RETURN 1
End FETCH NEXT FROM filterCursor INTO @word_buffer ENDCLOSE filterCursorDEALLOCATE filterCursor
INSERT INTO blog_Content( PostConfig, Title, TitleUrl, [Text], SourceUrl, PostType, Author, Email, DateAdded,DateUpdated, SourceName, [Description], ParentID, BlogID)VALUES(@PostConfig, @Title, @TitleUrl, @Text, @SourceUrl, @PostType, @Author, @Email, @DateAdded, @DateAdded, @SourceName, @Description, @ParentID, @BlogID)
Select @ID = @@Identity
Update blog_ContentSet FeedBackCount = FeedBackCount + 1 where [ID] = @ParentID
End
— SO far it looks to have reduced my blog spam by 95%