Kuro5hin.org: technology and culture, from the trenches

More thoughts on filtering software

By PrettyBoyTim in News
Wed May 17, 2000 at 05:42:29 PM EST
Tags: Internet (all tags)
Internet

I've been doing a little thinking about web filtering software after reading the recent k5 article about it, and I think there is a reasonable case for a decent web filtering system in some situations.

Ideally it should be open, and devoid of any inbuilt bias against certain types of website. It should also be built by the community who use it.

In the rest of the article I put forward the beginnings of an idea for implementing such a system.


My current idea is this:

  • Filtering would be accomplished by downloading a list of sites from servers distributed around the internet. You'd probably update your site list every week or so.
  • Whenever you (the person setting up the filtering) came across a site you didn't feel happy with, you'd give it a value for how unhappy you were with it.
  • Any moderations that you make get uploaded to the server and added to the general list, where they get used to work out filtering for other people.

Now, this is where it gets tricky. Ideally you want a situation where the server compares the moderations that you have made with the moderations that other people have made, and works out which people moderate pages in roughly the same way as you. It then uses the values that they have used to filter your pages.

Of course, this is all very easy to say, but probably rather tricky to implement. I'm not quite sure how you'd do it. I can see that you might, for instance, rate very similarly to someone on porn sites, but quite differently on sites that feature violence, or something like that. You'd want to be able to make use of the moderation they have done on porn, but ignore their moderations on violence.

One solution is to have people moderate pages into categories, but that's rather open to differences in interpretation. I do vaguely remember from my course on neural nets at Uni that there is such a thing as a Kohonen feature map - a kind of net that is very good at sorting things into categories - perhaps it could help? Any neural net experts out there? Anyway, I'm sure it's not an insurmountable problem.
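
The per-category matching described above can be sketched in a few lines. This is only an illustrative sketch, assuming ratings are kept per category on a 0-10 "unhappiness" scale; all site names and numbers are made up.

```python
# Sketch: compare two moderators category by category, so that agreeing
# on porn ratings doesn't imply agreeing on violence ratings.

def category_similarity(mine, theirs):
    """Mean agreement (1.0 = identical) over sites both users rated."""
    shared = set(mine) & set(theirs)
    if not shared:
        return None  # no overlap, no evidence either way
    # Ratings are assumed to be on a 0-10 "unhappiness" scale.
    diff = sum(abs(mine[s] - theirs[s]) for s in shared) / len(shared)
    return 1.0 - diff / 10.0

# My ratings, keyed by category, then by site (illustrative data).
mine = {
    "porn":     {"a.example": 9, "b.example": 8},
    "violence": {"c.example": 2},
}
# Another moderator's ratings.
theirs = {
    "porn":     {"a.example": 9, "b.example": 7},
    "violence": {"c.example": 9},
}

for cat in mine:
    sim = category_similarity(mine[cat], theirs[cat])
    print(cat, round(sim, 2))  # -> porn 0.95 / violence 0.3
```

The other moderator's porn scores would then be trusted when filtering porn, but their violence scores would be ignored.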

One of the main advantages of a system of this kind is that it encompasses all kinds of viewpoint. If there are enough people out there who want to filter in a similar way to you, it could help you, regardless of your views. If you want to censor hate sites and sites featuring extreme violence from your seven year old, fine. If you happen to want to stop your kids reading anything about the gay rights movement or evolution, er..., well, that's your call I suppose, but at least you won't get in each other's way...

Thoughts?

More thoughts on filtering software | 29 comments (29 topical, 0 editorial, 0 hidden)
I can imagine it. ... (1.00 / 1) (#3)
by eann on Wed May 17, 2000 at 02:07:47 PM EST

eann voted 1 on this story.

I can imagine it.

Now go do it. :)

Our scientific power has outrun our spiritual power. We have guided missiles and misguided men. —MLK

$email =~ s/0/o/; # The K5 cabal is out to get you.


This is basically how current filte... (3.00 / 1) (#1)
by rusty on Wed May 17, 2000 at 02:14:32 PM EST

rusty voted 1 on this story.

This is basically how current filtering software works, except that the people doing the rating are employees of FilterCo Inc. Your idea would be workable, I think, except for the big ugly fact that the vast majority of parents are not hackers, and simply don't have time to do this. They want a plug 'n' play, "Billy, don't look at those websites your father talked to you about" solution, not to spend all their time scouring the web for pr0n. So the basic flaw is, you want community filtering, but who's your community?

____
Not the real rusty

Hmm.. somehow I see my child voting... (4.00 / 2) (#13)
by slycer on Wed May 17, 2000 at 02:22:00 PM EST

slycer voted 1 on this story.

Hmm.. somehow I see my child voting "show me it" on any site that I have voted the other way. By implementing this there is the potential (as with anything) for abuse. So then we'd have to make something to stop them from voting, and something to stop them from disabling the thing that stops them from voting....

IMO the best way to keep these sites from your young children is to surf with them. When they hit the teens, well, not much is going to stop them from seeing what they want - either internet or otherwise. I know I saw my first pr0n movie in the seventh grade; we laughed and laughed.

I'd like to hear what people have t... (2.00 / 1) (#15)
by jbeimler on Wed May 17, 2000 at 02:24:51 PM EST

jbeimler voted 1 on this story.

I'd like to hear what people have to say about this. I don't much care for censorship, but I do understand companies' liability in letting employees surf anywhere. I personally do not want to see pr0n at work.

Very interesting ideas. They start... (3.00 / 1) (#6)
by Noel on Wed May 17, 2000 at 03:00:52 PM EST

Noel voted 1 on this story.

Very interesting ideas. They start to address one of the most difficult issues: can you trust anyone to control or influence what information is available to you without giving up complete control? It might be possible, but it's sure not easy.

I've been playing with the "community of moderators'" idea also, and you're right, the most difficult question is how to pick the moderators that are most likely to agree with the user. In many cases, the user will not want to spend much time rating sites, and may not even want to be exposed to the type of sites that the filtering software is supposed to block. It might be very difficult to get a user to rate enough sites to effectively compare their ratings to those of the other moderators.

My suspicion is that the solution lies with masses of moderators, maybe a little like that other site uses... [grin]

#insert Earth[0] reference here. ... (1.00 / 1) (#14)
by genehack on Wed May 17, 2000 at 03:25:00 PM EST

genehack voted 1 on this story.

#insert Earth[0] reference here.

[0] the David Brin novel.

This is a very interesting idea. I... (3.00 / 1) (#7)
by puppet10 on Wed May 17, 2000 at 03:33:52 PM EST

puppet10 voted 1 on this story.

This is a very interesting idea. If any of you have been to MovieLens, they have a mathematical system for choosing movies you'll like based on prior expressed preferences, by matching you to other users using a mathematical model. A content filtering program based on the opposite of this would be kind of cool (not that I'd use it, but some people seem intent on using something like this no matter what).

People need to take responsibility ... (3.00 / 1) (#12)
by Rasputin on Wed May 17, 2000 at 03:54:42 PM EST

Rasputin voted 1 on this story.

People need to take responsibility for their own choices and actions. If you don't want to see porn or hate pages or whatever then don't go there. If you don't want your kids viewing these sites then you should be supervising them instead of hoping that the screening software will stop them from seeing somebody get up close and personal with a web cam.

I think filtering software is, in general, a non-workable idea, right up there with banning and burning books to keep the children from being exposed to dangerous ideas. No matter what you try to filter, someone will find a way around it. As well, it's almost guaranteed that someone will try to use (as is being done already) this type of software to inflict a personal morality view on the wider population.

I realize constant monitoring is, at best, extremely difficult and, at worst, effectively impossible. That's why you have to teach your children to the best of your ability. Hopefully the combination of monitoring and teaching will be enough to prevent things from going too far until the kids are old enough to be responsible for their own mistakes.
Even if you win the rat race, you're still a rat.

Re: People need to take responsibility ... (none / 0) (#23)
by PrettyBoyTim on Thu May 18, 2000 at 06:09:32 AM EST

However, the internet is becoming increasingly pervasive... I can see a time in the very near future when a lot of homes will have internet connections in most rooms - and it's good for kids to undertake some unsupervised learning and exploration. I'd like to have a system where I could be reasonably sure that they wouldn't come across some types of material, while still being able to explore what is out there for them. I don't really see the point once a child hits puberty, but before that I'd think there are some things you wouldn't want them to see.

[ Parent ]
It's an interesting topic, but I do... (2.00 / 1) (#11)
by DemiGodez on Wed May 17, 2000 at 03:59:23 PM EST

DemiGodez voted 0 on this story.

It's an interesting topic, but I don't think it is a very good idea. Most people who want filters are advocating/using them either to a) keep bad stuff from kids or b) keep employees on task. Most adult people don't use filters when they browse the net themselves.

In general, personalizing content is good, and this is more personalization than filtering.

These are interesting ideas, and wo... (2.00 / 1) (#5)
by dlc on Wed May 17, 2000 at 04:04:59 PM EST

dlc voted 1 on this story.

These are interesting ideas, and worth discussing. However, I fear for the lifespan of anything -- idea, product, whatever -- that requires active participation from the majority of its users/adherents.


(darren)

ideas, conversation starters... loo... (1.00 / 1) (#4)
by confidential on Wed May 17, 2000 at 04:12:37 PM EST

confidential voted 1 on this story.

ideas, conversation starters... looks good to me

Interesting, I have a very differen... (3.00 / 1) (#16)
by ZamZ on Wed May 17, 2000 at 04:18:11 PM EST

ZamZ voted 1 on this story.

Interesting, I have a very different view of the concept put in that context. On the technical tip, I just don't think AI has got there yet. On the censorship point, it would depend greatly on how easy it is to opt out of it. Actually the whole idea of a community moderated but not censored web sounds interesting, especially as part of a search engine.

I like the idea, but I would do thi... (2.00 / 1) (#9)
by RobotSlave on Wed May 17, 2000 at 04:26:58 PM EST

RobotSlave voted 1 on this story.

I like the idea, but I would do things differently. If this idea goes anywhere, I'll express my ideas in code rather than human language :).

Nicely written article, well argued... (1.00 / 1) (#2)
by alisdair on Wed May 17, 2000 at 04:43:22 PM EST

alisdair voted 0 on this story.

Nicely written article, well argued (+1) but internet censorship is an unnecessary evil (-1).

> inbuilt bias against certain type... (1.00 / 1) (#10)
by hooty on Wed May 17, 2000 at 04:54:40 PM EST

hooty voted 1 on this story.

> inbuilt bias against certain types of website

I think a filter, by definition, has this sort of bias.

Don your asbestos underoos and get ... (2.50 / 2) (#8)
by warpeightbot on Wed May 17, 2000 at 05:19:38 PM EST

warpeightbot voted 1 on this story.

Don your asbestos underoos and get out your weenies and your marshmallows, folks, it's going to be fun thrashing this one out...

Me, I think Junkbuster is a plenty good censorware tool.

Other options (2.50 / 2) (#17)
by madams on Wed May 17, 2000 at 05:50:04 PM EST

An interesting idea, and a thoughtful one as well. However, wouldn't it be better to have a list of allowable sites instead? Apple is doing something similar to this with KidSafe, which is a list of 50,000 or so educator-approved websites (too bad it's only available to MacOS users). IANAP (I am not a parent), but this is more appealing to me than a list of sites that my child can't see, which has to be updated every week.

If you or your kid comes across a site that is not on the list (and thus not accessible), you can review the site to see if it is appropriate. Hopefully a parent would discuss with their kid why a particular site is or is not acceptable, but if this is always the case, who needs filtering software?

--
Mark Adams
"But pay no attention to anonymous charges, for they are a bad precedent and are not worthy of our age." - Trajan's reply to Pliny the Younger, 112 A.D.

Re: Other options (3.00 / 1) (#19)
by DJBongHit on Wed May 17, 2000 at 08:17:37 PM EST

An interesting idea, and a thoughtful one as well. However, wouldn't it be better to have a list of allowable sites instead? Apple is doing something similar to this with KidSafe, which is a list of 50,000 or so educator-approved websites (too bad it's only available to MacOS users). IANAP (I am not a parent), but this is more appealing to me than a list of sites that my child can't see, which has to be updated every week.

If you've ever tried to use Apple's KidSafe, you'd change your opinion quickly. 50,000 sites may sound like a lot, but considering that that's something like 0.01% of all pages on the internet (this number is off the top of my head, but it's a very small percentage), you'd be hard pressed to find something useful.

Apple handles it pretty well, though, and has a search engine which just searches the sites in the KidSafe database, but it's still obnoxious.

If you or your kid comes across a site that is not on the list (and thus not accessible), you can review the site to see if it is appropriate. Hopefully a parent would discuss with their kid why a particular site is or is not acceptable, but if this is always the case, who needs filtering software?

In my opinion, this is what parents should be doing in the first place: deciding which sites their kids should be allowed to visit, and supervising them while they're doing it. They shouldn't rely on software which has no chance of working correctly.

LOL - my site would most certainly be among the ones parents shouldn't let their kids visit, though :-)

~DJBongHit

--
GNU GPL: Free as in herpes.

[ Parent ]
Only allowable sites (none / 0) (#22)
by PrettyBoyTim on Thu May 18, 2000 at 04:02:14 AM EST

I can see that you could use the system for 'just allowed sites' as well...

If you made the moderation scale so that it went from -10 to +10, you could then set up your system to only go to sites that had been recommended by other people.

In fact, you could use it as your own 'cool sites' sites system...

[ Parent ]
But then what's the point of the web? (none / 0) (#29)
by error 404 on Thu May 18, 2000 at 03:35:46 PM EST

The value of the net is in participation. Otherwise, it is just low-grade television with too many words.

I can pretty much guarantee, due to the numbers involved, that my 13-year-old's site isn't one of the 50,000. So his friends won't see it, and what's the point of his building it?


..................................
Electrical banana is bound to be the very next phase
- Donovan

[ Parent ]
I might be trolling... (4.70 / 3) (#18)
by Field Marshall Stack on Wed May 17, 2000 at 07:08:41 PM EST

...but I'm not certain. That I'm trolling. Something. Anyway, methinks that schemes like this overlook a good portion of the motivation of the people heavily pushing censorware, that being that they're not so much interested in what their kids are viewing as they are in what your kids are viewing.

Ralph Reed et al would most likely not endorse something like this, since it would still give non-fundamentalist parents the option of letting their children view gay rights sites/NOW/ACLU/sex education sites/pro-abortion sites/sites on breast cancer/oh yeah, and I guess porn sites too. But as I said, I might just be trolling.
--
Ben Allen, hiway@speakeasy.org
"Nobody ever lends money to a man with a sense of humor"
-Peter Tork

Recommender Systems (4.00 / 1) (#20)
by mebreathing on Wed May 17, 2000 at 09:23:21 PM EST

The system you're describing is called a "Recommender System".

Let's take music for example. If I report that I give 5 out of 5 on a Likert scale for the Propellerheads and the Chemical Brothers, and you give a 5 out of 5 for Propellerheads, a recommender system can guess that you will probably like the block rockin' beats of the Chemical Brothers as well.

Equally, if you are horribly offended by rotten.com and nastyPoon.com, and I'm offended by nastyPoon.com, I'll probably not like rotten.com either.

This is an oversimplification of how recommender systems work, but you get the idea.

The problem with applying recommender systems to a situation like this is how you get the user feedback to generate their profile. Sure, you can wait until the user runs into something offensive and have them click the "OFFENSIVE" button on their browser's personal toolbar, but this isn't very effective. The goal is to have the user not see any material he's offended by... but to generate a profile, the user has to report that he has been offended over and over. One solution: you could make the user fill out a survey of some sort to generate a "seed" profile. This isn't a complete solution though, because somebody still has to be out there being offended.
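
As a toy illustration of the prediction step (a sketch only, not any real recommender algorithm; the site names come from the example above and the 1-5 "offended" scale and agreement threshold are invented):

```python
# Minimal user-based recommender sketch: predict my reaction to a site
# from the ratings of users whose past ratings match mine.
# Scale: 1 (fine) .. 5 (horribly offended). Data is illustrative.

def predict(user, site, ratings):
    """Average the ratings of users who agree with `user` on shared sites."""
    votes = []
    for other, theirs in ratings.items():
        if other == user or site not in theirs:
            continue
        shared = set(ratings[user]) & set(theirs)
        # "Agreement" here: mean absolute difference <= 1 on shared sites.
        if shared and sum(abs(ratings[user][s] - theirs[s])
                          for s in shared) / len(shared) <= 1:
            votes.append(theirs[site])
    return sum(votes) / len(votes) if votes else None

ratings = {
    "you": {"nastyPoon.com": 5},
    "me":  {"nastyPoon.com": 5, "rotten.com": 5},
}
print(predict("you", "rotten.com", ratings))  # -> 5.0
```

Because "me" agreed with "you" on the one shared site, my rating of rotten.com is used to predict that you'd be offended by it too, which is exactly the inference described above.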

--
mebreathing
Eric Hanson
http://www.shouldexist.org/


How about this? (none / 0) (#25)
by Noel on Thu May 18, 2000 at 09:16:05 AM EST

Sounds like you're familiar with Recommender Systems. Can you give us a little more information on them, and maybe some examples of systems that work?

Here's a suggested structure for a recommender system that I might be willing to work with. How does it look?

Generate a trust value for each other user/reviewer:

  • Every participant fills out a survey
  • My survey is compared with all the others out there, and a baseline "trust" value is generated for each other user, based on how similar my answers are to the other user's
  • Any time I retrieve a page, it's rated by combining other users' ratings based on their trust value
  • If I decide to view a page even though it's been blocked, or decide that I would prefer to block a page that's been allowed, then that decision is used to update the trust values for any other user that has rated that page.
  • I get to rate each page based solely on its suitability -- not just a go/no go rating, but a scaled rating, maybe 1-5 or so
  • The rating can be either on a page level or a site level
  • My ratings are compared with other users' ratings, and the trust values are updated from that as well

Eventually, this should build up a panel of reviewers that agree with me. Of course, the start-up phase will be difficult, because the pool of people will be small.
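
The trust-update loop in the list above could be sketched roughly like this (an illustrative sketch only: the learning rate, the 1-5 scale, and the clamping to [0, 1] are all invented constants, not part of the proposal):

```python
# Sketch of the trust-value update: every override (viewing a blocked
# page, or blocking an allowed one) nudges the trust placed in each
# reviewer who rated that page.

LEARNING_RATE = 0.1  # invented constant

def page_score(page, reviews, trust):
    """Trust-weighted average of other users' ratings (1=block .. 5=allow)."""
    rated = [(u, r) for u, r in reviews.get(page, {}).items() if u in trust]
    total = sum(trust[u] for u, _ in rated)
    if total == 0:
        return None
    return sum(trust[u] * r for u, r in rated) / total

def record_override(page, my_rating, reviews, trust):
    """Raise trust in reviewers who agreed with my rating, lower the rest."""
    for user, their_rating in reviews.get(page, {}).items():
        agreement = 1.0 - abs(my_rating - their_rating) / 4.0  # 1-5 scale
        trust[user] += LEARNING_RATE * (agreement - 0.5)
        trust[user] = min(max(trust[user], 0.0), 1.0)  # clamp to [0, 1]

trust = {"alice": 0.5, "bob": 0.5}
reviews = {"example.com/page": {"alice": 5, "bob": 1}}
record_override("example.com/page", 5, reviews, trust)
print(round(trust["alice"], 2), round(trust["bob"], 2))  # -> 0.55 0.45
```

Over many overrides this builds up exactly the "panel of reviewers that agree with me" described above: reviewers who keep agreeing drift toward full trust, and the rest drift toward zero weight.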

There are a lot of other issues that have to be dealt with, though:

  • Privacy: each user's page/site ratings must be kept absolutely anonymous
  • Accountability: any time a site is blocked, the user must be notified that it's being blocked, and what the calculated rating is
  • Freedom: the user must be able to override the blocking and see any site that has been blocked -- although in some situations (e.g., parent/child) the main user should be able to turn this option off for subordinate users
  • Responsibility: in some situations (like parent/child), the main user should be able to turn on logging for each block that has been overridden. Any user whose override is being logged must be notified before they make the choice to override.
  • Performance: this is a lot more work than just comparing to a block list for a go/no go determination. It's got to work without a noticeable slowdown on retrieval

As far as the block/allow process goes, I'm thinking that it's got to be based on a scale. If the calculated rating is extremely negative, the page will be blocked, and I will be notified that it was rejected (with the opportunity to override, of course). If the rating is extremely positive, the page will be allowed. If the rating is within some sort of questionable window, then I'll be given the choice of seeing it or not, and be required to rate it.
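
That block/allow/ask window could be as simple as the following (thresholds invented for illustration; the score is assumed to be a trust-weighted rating on a -10..+10 scale, with None for an unrated page):

```python
# Tiny decision function for the windowed block/allow scheme above.
BLOCK_BELOW = -5  # illustrative threshold
ALLOW_ABOVE = 5   # illustrative threshold

def decide(score):
    if score is None:
        return "ask"    # unrated page: let the user see it and rate it
    if score <= BLOCK_BELOW:
        return "block"  # notify the user, with the option to override
    if score >= ALLOW_ABOVE:
        return "allow"
    return "ask"        # questionable window: user chooses, then rates

print(decide(-8), decide(7), decide(0))  # -> block allow ask
```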



[ Parent ]

Accountability (none / 0) (#26)
by PrettyBoyTim on Thu May 18, 2000 at 11:22:36 AM EST

I'm not sure how you'd make it accountable.

As every user might have a different set of blocked sites (due to their moderating tendencies and their setup), I don't really see what comeback a site would have to being blocked. It would simply reflect the feelings of a section of the userbase. Unlike most systems out there, there wouldn't be a canonical list of sites that would merit blocking.

[ Parent ]
Re: Accountability (to whom?) (none / 0) (#27)
by Noel on Thu May 18, 2000 at 12:11:58 PM EST

I was referring to the accountability of the software to the user -- that the user should always know what is being blocked and why. I hadn't really thought about whether the blocked sites would know that they're being blocked. Would that be useful or just an annoyance? On the one hand, it'd be nice to know when my site is being blocked because of its content. On the other hand, a lot of sites (fanatics, pr0n, etc.) wouldn't care one bit.

If the blocked sites' owners do get the information, then it'd have to be either the number of users who tried to access the site, or the number of users who would potentially be blocked if they tried to access the site. This would be another privacy issue, of course -- users should be able to choose whether their access or rating info is reported for inclusion in the totals, and only aggregates should be available to the site owners.

Hmmm...

[ Parent ]

Re: Accountability (to whom?) (none / 0) (#28)
by PrettyBoyTim on Thu May 18, 2000 at 12:57:35 PM EST

It's quite an interesting point.

Obviously, a user would know when a site was being blocked, because they wouldn't be able to get through to it... but I suppose a blocking page with details of its rating (calculated from the moderation preferences) would be a good idea.

It might also be interesting to have warning pages that pop up for borderline cases - the user can still get to the page, but goes through a warning page first...

Getting a rating for blocked pages would be quite useful for the person being blocked - they would be able to see how close the page was to not being blocked. If it was quite a close thing, they would probably feel more confident in asking the administrator of the filtering system to have a look at moderating that page themselves...

[ Parent ]
Google (1.00 / 1) (#21)
by mattc on Thu May 18, 2000 at 01:55:33 AM EST

I don't think this is important enough to have its own Kuro5hin article, but it is worth mentioning..

Has anyone tried Google's new "safesurf" feature? When you do a search it is a toggle at the top of the results page. Once you turn it on it stays on until you toggle it off again.

I've been using it today and it is excellent!! No more are my legitimate searches littered with a bunch of porn spam sites!

When I do want to search for porn I just turn it off again. I'm surprised someone hasn't thought of this before!

Freedom, Privacy, Expression etc. (none / 0) (#24)
by Icarus on Thu May 18, 2000 at 06:45:28 AM EST

This is not meant as flamebait, BTW.

I have always wondered about articles and comments such as this. Nearly everyone in the computing community has (or seems to have) very set ideas about what they should be entitled to. I have yet to see anyone say "Hell, I don't really mind if there is censorship," since everyone says "Aargghh! If I can't do what I want to do then that is bad."

I wonder if that is a notion born of a) intelligence, b) awareness of surroundings c) rebellion d) bloody mindedness.

The fact that in reality, you CAN'T do anything you want to do (not even to a small degree in places) seems to have been missed by the online community. Well, not missed; it is the fact that the Internet was born out of a military application (or thereabouts), followed by academia, that gives the online world its "rebellious" viewpoint. Once taken on by the academic community, it had very little chance of being controlled in any way, since each institution could act independently without any central governance. Now that the 'corporation' has got hold of it, the whole thing is going to be sanitised, controlled, packaged and so on. All underground movements that get absorbed into the mainstream go the same way (clubland, skateboarding, surfing and so on). It is the very fact that it is a minority thing controlled (to a certain extent) by like-minded people that gives something a chance to be rebellious and free, but once the rot sets in, control has to be given up to the mighty dollar/pound/franc/mark etc.

Solution: move on to the next underground thing. Start something new, break new ground, explore other areas. The internet is lost to the corporation/government in many ways and nothing will get it back.

Too pessimistic? Too rambling? Too many ideas to express in one simple post.

Icarus
Natsu-gusa ya, Tsuwamono-domo ga, Yume no-ato. -Matsuo Basho
