(FULL DISCLOSURE: Despite the forum tag I have basically been retired since May and have zero voice in the governance or management of the forums here. Please note that I am sharing a personal perspective, not dictating policy. Also, this is the first time I've actually browsed the forum for more than two minutes at a time since May, so my observations may involve superficial data. The blowhardiness of the post of course is the usual ration you'll get whenever I say something. =))
Spam Preventor's Maxim: A given spamming operation will figure out a way around almost any given human verification step added within 3-6 months of its implementation, if not sooner. This includes human verification questions tuned towards specialist questions. The problem here isn't that the admins haven't figured out the perfect cocktail of flaming hoops to make new registrations jump through; the problem is that the solutions all have shelf lives and the site doesn't have an admin who is focused primarily on the forum versus being involved in eleven other FTB core projects that require higher-level attention. Things slip through the cracks in volunteer organizations because you can't just hand out all the keys to the city and every keyholder is spread among general task lists.
On the human verification question suggestion: Any question simple enough for this site's target audience to answer correctly can also be answered with a Google search. Anything obscure enough to actually keep the bots fooled would keep far more legitimate users fooled as well and would be counterproductive to the purpose of having a forum. The only way to make this work, in the long term, is to have somebody refresh the human verification questions every time the cipher has been cracked. This presents a problem in this community, though, in that there's only a very finite amount of information one can expect a new member to know and answer correctly without research; once you've cycled through that information, you've run out of questions.
(And yes, I can confirm that they will get the answers on these. One of my other hats has been as the primary administrator of a rather large RPG site, where I once attempted to curtail bot activity by putting in human verification questions that required actual research in the site's lore compendium; if they're going to look up and catalog in their answer sheet who the fifth Empress of a second dynasty that reigned 700 years ago in a fantasy world was, they will definitely find Minecraft trivia a much simpler nut to crack. Once a solution has been found it gets propagated among the bot script-writers, and they're right back at it.)
When I used to do the rounds on this, I was quite literally eyeballs-on-site from about 10pm to around 4am if I wanted to whack all the moles before anybody really noticed. That's not a sustainable pace to expect of unpaid volunteers over the long haul, which is why that often slipped and why other background solutions were supplied. An interesting note to make as I write this: even with the ones getting through, there are 90 new threads that the system has trapped tonight in queue. So the system is filtering out a lot of crap; it just needs to do better.
One thing to note is that the spam currently coming through is of a particularly challenging type to curtail. Site SEO-mongering spam needs to include hyperlinks, so a simple bar on those posts is to limit who can hyperlink. (Another example of the adaptations the bot-writers make: I've noticed in this war that a number of the bots are now programmed to make an initial innocuous post to get past the queue filter and then edit in the preferred content; there probably should be a restriction set on edit permissions for new users for this reason, as obnoxious as it may be.) These ones, however, are a different breed of spam: the search result blanketer. This is distinguished from the Site SEO spammer, whose goal is to increase linkbacks and hook an occasional lackadaisical click-through. Search result blanketers just want a particular term set and/or set of real-life references (addresses, phone numbers) embedded throughout the web in search results.
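For anyone wondering what an edit restriction for new users might look like, here's a rough sketch in Python. It is not how the forum software here actually works; the function name, the post-count threshold and the similarity cutoff are all made up for illustration. The idea is just that an edit by a brand-new account goes back into the moderation queue if it adds a link or rewrites most of the post.

```python
import difflib
import re

# Loose URL matcher; good enough for a sketch, not a full URL parser.
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)

# Hypothetical tuning knobs, not taken from any real forum configuration.
MIN_POSTS_FOR_FREE_EDITS = 10
MIN_SIMILARITY = 0.5


def links(text):
    """Return the set of link-like strings found in the text."""
    return {m.group(0).lower() for m in URL_RE.finditer(text)}


def edit_needs_review(author_post_count, old_text, new_text):
    """Return True when an edit should go back into the moderation queue."""
    if author_post_count >= MIN_POSTS_FOR_FREE_EDITS:
        return False  # established accounts edit freely
    if links(new_text) - links(old_text):
        return True  # the edit introduced a link that was not there before
    # A near-total rewrite of an innocuous first post is also suspicious.
    similarity = difflib.SequenceMatcher(None, old_text, new_text).ratio()
    return similarity < MIN_SIMILARITY
```

A typo fix sails through, while swapping "Hi all, glad to be here!" for an ad gets re-queued whether or not it contains a link, which matters for the search result blanketers since they don't need links at all.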
This set is particularly difficult because they're in Korean and Chinese. Another layer of the spam filtration process besides link restriction here is (or at least was) a list of keywords that were common to much of the spam being posted. It's not that easy to come up with a keyword list for these languages for a lot of reasons (and I could go into this at length but I don't think anybody really cares about the quirks of East Asian ideographic linguistics), and so filtering along that dimension is difficult. Even the old Indian "black magic love" spam was easier to target than these are.
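To give one small, concrete taste of why keyword lists get awkward here (this is only one plausible reason, and I'm assuming a filter that matches whole words, which may not be how the one on this site works): whole-word matching leans on spaces and punctuation between words, and Chinese and Korean text often doesn't have them where an English speaker would expect. The keywords below are placeholders, not the real list.

```python
import re
import unicodedata


def word_boundary_hit(text, keywords):
    """Whole-word matching, the usual approach for space-delimited languages."""
    return any(
        re.search(r"\b" + re.escape(k) + r"\b", text, re.IGNORECASE)
        for k in keywords
    )


def substring_hit(text, keywords):
    """Plain substring matching after Unicode normalization and case folding."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return any(
        unicodedata.normalize("NFKC", k).casefold() in folded for k in keywords
    )


# Placeholder examples: "loans" in English, Korean and Chinese.
english = "cheap loans today"
korean = "급전대출상담"       # the keyword 대출 sits in the middle of a run
chinese = "无抵押贷款电话"    # the keyword 贷款 sits in the middle of a run

print(word_boundary_hit(english, ["loans"]))  # whole-word match works
print(word_boundary_hit(korean, ["대출"]))     # no word boundary, so no match
print(substring_hit(korean, ["대출"]))         # substring match finds it
print(substring_hit(chinese, ["贷款"]))
```

Substring matching fixes that particular miss, but it brings its own trouble: short keywords start matching inside perfectly innocent words, so the list needs a lot more care than an English one.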
Anyway, that's a lot of trivia. What it boils down to is that a rounded solution involves several steps and that all of these require some amount of attention and manpower applied to them:
- Registration filtration through human verification and other steps (for example, I've personally found it useful elsewhere to simply ban a number of foreign+free e-mail hosts or set restrictions on that basis from registration).
- Access curtailment for suspect/new accounts (longer between-new-thread post timers, restrictions on editing, other tricks that stymie the bot's ability to succeed).
- Content filtration via URL and keyword blacklists for content from new accounts (requiring a minimal post count before permitting unqualified URL-inclusive posts, barring very common terms that aren't applicable to the forum and pretty much only get used commercially).
- Human intervention to catch the ones that slip through the cracks and to further refine the previous three processes.
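As a rough illustration of how the first three layers in that list could fit together, here's a minimal Python sketch. Everything in it is hypothetical: the names, the thresholds, the blocked e-mail domain and the term list are invented for the example and don't reflect the actual forum software or its settings. Anything that lands in "queue" is where the fourth layer, the humans, comes in.

```python
import re
from dataclasses import dataclass
from typing import Optional

URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)

# Hypothetical settings, invented for this sketch.
BLOCKED_EMAIL_DOMAINS = {"mailbox.example"}
MIN_POSTS_FOR_LINKS = 5        # below this, an account counts as "new"
NEW_THREAD_COOLDOWN_S = 600    # minimum gap between threads for new accounts
BLOCKED_TERMS = {"black magic love"}


@dataclass
class Account:
    post_count: int
    seconds_since_last_thread: Optional[float]  # None if no thread yet


def registration_allowed(email):
    """Layer 1: refuse sign-ups from e-mail hosts on the block list."""
    domain = email.rsplit("@", 1)[-1].lower()
    return domain not in BLOCKED_EMAIL_DOMAINS


def triage_new_thread(account, body):
    """Layers 2 and 3: return 'reject', 'queue' (for a human) or 'allow'."""
    if account.post_count >= MIN_POSTS_FOR_LINKS:
        return "allow"  # established accounts skip the new-user checks
    last = account.seconds_since_last_thread
    if last is not None and last < NEW_THREAD_COOLDOWN_S:
        return "reject"  # layer 2: longer thread timer for new accounts
    if URL_RE.search(body):
        return "queue"   # layer 3: links from new accounts need review
    if any(term in body.casefold() for term in BLOCKED_TERMS):
        return "queue"   # layer 3: keyword blacklist
    return "allow"
```

None of these checks is hard to get around on its own; the point is that each one catches a different slice, and the lists and thresholds need somebody to keep them current.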
Please keep in mind that only #4 is the actual domain of moderators, and only to the extent of those currently with the tools doing what they can; the others are admin domain concerns, including adding more people for #4. When you rag on moderators to deal with the first three things, you will get absolutely nowhere, because they can't do anything about it besides pass along feedback just like you already can.
Beefing up #4 can be useful but can also be problematic. You have to strike a balance on who you give tools to in an environment like this. I'm totally a fan of the suggestion of lifting the report cooldown from basically all user categories except the spam suspect category, as long as it's also made clear that anybody who goes hog-wild in an abusive manner is basically going to end up banned for it (I think during my tenure I banned two people for this kind of behavior, so yes, it does happen). I'm hesitant about expanding the spam cleaner group too far because you are giving the power of banning new accounts to new spam cleaners; somebody in that group going rogue can seriously damage the forum's reputation, so you have to take care on handing out that tool. 10x that concern for additional moderators (and with moderators you also have to strike a balance of day-to-day workload; if you select a moderator cadre to fill in at the level of an every-six-month spike, you're going to have a lot of bored and useless-feeling moderators who just wander off and won't handle the spike anyway; you need to keep a moderation team lean enough that there's always something they can be working on and understand that occasional effluent piles will temporarily hit the forum).
There. I've iterated entirely too much and it's probably a giant wreck with little coherence. There is obviously a problem, there is more that can be done, and suggestions and feedback on it being a continuing problem are valuable; but please make sure the tree you're barking up is the correct one, otherwise it's just going to become a big circle of frustration in an already frustrating situation (and it is very, very frustrating for the moderators on this too, I will reiterate from my own experience).
(On that note, shouldn't this thread get moved to Web Feedback, perhaps? =))