a glob of nerdishness

April 28, 2007

A better plan for spam? [nerd version]

written by natevw @ 12:45 pm

I think that small tariffs on “published” messages could stop spam, and also reduce the need for advertising online. Users don’t pay to receive e-mail, but they do pay a little each time they send. Users don’t pay to read web pages, but they do pay a little to comment on them. This wouldn’t change the model of the Internet or Web greatly, but it could significantly change the result.

The need for a better solution

Stopping spam is a difficult problem. It could be tackled at the source and/or the destination. Politicians could criminalize spam, or at least tax it — ideally, stop spam at the source. Why anyone would want more regulation and more taxes is unfathomable to most computer programmers. We prefer to invent clever algorithms instead — ideally, stop spam at its destination.

There’s a problem with our approach. Spam isn’t caused by buggy machines, it’s caused by greedy people. If humans have trouble deciding whether an e-mail is in their best interest, how much more a disinterested computer! Even if you believe that greedy people are actually just buggy machines, are you willing to bet the spam battle on your ability to create a superior being? If you have an algorithm that takes text as input and gives a motive as output, there are 33 sets of grieving Hokies still groping for answers. (I’m not sure whether Cringely’s recent “search engine for hate” post is meant as some kind of sick joke to that effect, or what.) There’s no algorithm that can stop sin, except for a sacrificial love which overcame death.

The solution applied to e-mail

Don’t keep up the AI arms race, spammers are clever too. Don’t let the government start skimming more profits, spammers are expert evaders. Instead, let the ISPs charge to accept email on your behalf. This might require modifications to the inter-net email relaying protocols, but with things like DomainKeys and server-based spamboxes we’re doing that anyway.

The solution on the Web

There’s more to gain on today’s Web. Imagine leveraging an OpenID-like (or even OpenID-based) system in a micropayments context. Please don’t check out just yet if you think you’ve heard this all before.

Five years and who knows how many market cycles later, I still agree with most of Jakob Nielsen’s User Payments synopsis. Basically, paid content stinks, but ads and spam stink worse. “Information wants to be free” may remind me of “Arbeit macht frei”, but I still don’t want to pay to read your blog. There’s a lot of pressure keeping the Web from going back to a royalty model, perhaps rightly so. But we can flip micropayments around.

A practical scenario

The Web is currently infatuated with user-generated content, so there are plenty of other extrapolations, but weblogs are an obvious use. A successful blog currently tries to exploit its fixed content for money and protect its user feedback from abuse. This is naïve and ugly: income via ads, protection via captchas/logins/AI plugins. Instead, why not install a micropayments plugin? Basically:

  1. The webmaster chooses a username/signature service from any number of providers
  2. These service providers act as brokers, deducting/depositing money into the user’s account
  3. The user’s info/balance is not stored with the service providers, but in an account with a larger entity

Basically, it’s PayPal for Web 2.0 — less centralized, more user friendly. The user doesn’t want to keep his credit card on file with every blog he uses, but neither can one company be expected to provide plugins for every blog/forum/link-swamp-service out there. Hence the middle-man service providers. They provide an API specific to a problem-domain, and use a standard protocol to link up with any main account provider the user chooses.
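To make the three roles a little more concrete, here is a rough Python sketch of how a comment tariff might flow. Everything in it is hypothetical: the class names `AccountProvider` and `CommentBroker`, the 2-cent default tariff, and the simplification that every user shares a single account provider (the plan above lets each user pick their own). It also leaves out identity and signatures entirely, so treat it as an illustration of the money flow, not a design.

```python
from dataclasses import dataclass, field


class InsufficientFunds(Exception):
    """Raised when a user's main account cannot cover a tariff."""


@dataclass
class AccountProvider:
    """Hypothetical 'larger entity' that holds user balances, in cents."""
    balances: dict = field(default_factory=dict)

    def transfer(self, payer: str, payee: str, cents: int) -> None:
        if cents <= 0:
            raise ValueError("tariff must be positive")
        if self.balances.get(payer, 0) < cents:
            raise InsufficientFunds(payer)
        self.balances[payer] -= cents
        self.balances[payee] = self.balances.get(payee, 0) + cents


@dataclass
class CommentBroker:
    """Hypothetical middle-man service a webmaster installs.

    It keeps no balances of its own; it only asks the account
    provider to move the tariff from commenter to blog owner.
    """
    provider: AccountProvider
    tariff_cents: int = 2

    def post_comment(self, user: str, blog_owner: str, text: str) -> bool:
        """Return True if the tariff was paid and the comment is accepted."""
        try:
            self.provider.transfer(user, blog_owner, self.tariff_cents)
        except InsufficientFunds:
            return False
        return True
```

With a bank of `{"reader": 5, "spammer": 3, "blogger": 0}`, a reader's comment costs them two cents, while a spammer runs dry after a single comment and every accepted comment lands in the blogger's balance.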

Result: Blogs with active discussion have a reasonable, and practical, alternative to advertisements. Users think just a little bit before they respond. Most importantly, spammers get sick of making blog owners rich.

Potential drawbacks

Users could refuse to accept such a plan, preferring to leave the financial and spam-filtering burden completely on the content providers’ tab. But trading chaotic captchas and commercials for a painless just-charge-it is not really a bad deal. I’m paying about a tenner a month for the privilege of writing on my own site, and I’d gladly support your site with my literal 2¢. This does mean speech for the end-user would no longer be free-as-in-beer, but if adequate attention was paid to privacy concerns there would be no harm done to free speech. If a user doesn’t have a credit card, they can sign up with an account provider that foots the tab in exchange for completing Mechanical Turk-style tasks.

Spammers may not give up so easily, though. If they can’t afford to pay microtariffs on everybody’s blog, the most likely alternative is to instead target only relevant blogs. How this differs from AdSense, I don’t know. The only serious concern left then is security. Spammers must not be able to phish or botnet their way into anyone else’s main account. But this is a concern for countless other scenarios as well, and is by no means an excuse not to try. And if time is money, the time we save by tackling the problem of spam as capitalists rather than technologists will more than make up for any spurious charges encountered as the system is gaining experience.

The plea

Spam is a people problem, as is making a microtariff system acceptable. I am a programmer. If spam was a big interest of mine, I might be immersing myself in Bayesian Filtering and Neural Networks in a technologic attempt to save the world. But this series has been immersing enough, and I have no interest in becoming the PayPal of the New and Amazing Web. Currently this site is low bandwidth and SpamKarma hasn’t swatted too many of my dad’s comments. However, when this site’s readership and “spammer”ship grow beyond what my budget can bear, I would much rather join a privatized microtariff economy than Google’s ginormous AdSense scheme. If you know someone who’s looking for a get-rich-and-save-the-world,like,yesterday scheme, I’d love to hear what they think of this plan. The comments are open and, for as long as practical, still free!

April 21, 2007

Refinance our survey for free smilies

written by natevw @ 6:24 pm

It is unfair to discuss the problem of spam without bringing up its archetype. Spam is the barbarian offspring of advertising, itself an uncouth tradition. Marketing, and often with it some advertising, is important to good business. That said, it’s disgusting how prevalent and tasteless advertising has become.

The relevance of marketing

The best advertisements still serve their victim: Hey, maybe you haven’t heard of us, but we do X which might help you Y. But even the “spend time with your kids, don’t start forest fires” advertisements are still distractions from the main course.(1) As far as living goes, is spam really much less relevant to me than the hundreds of other ads I see online every day?

In a sense, web advertising is highly relevant to my online life. Without it, countless online destinations would only be hurt by my tourism (the previous post explains). Google gets about 90% of its incredible income (note those figures are in thousands of dollars!) from advertising. It’s hard to imagine the “google.com” domain name becoming a home for squatters. Given the fortune Google has amassed, I don’t doubt they could keep paying their top-notch engineers long past finding alternate revenue sources, but many other domains would be in dire straits if it weren’t for advertisements.

The need for advertising

Web sites do need to pay their expenses. There’s not really a good reason for CLICK THE SCROLLING FLAMING MONKEY!!!, or the weird dancing business guy(2). In case you want your memory refreshed, Jakob Nielsen has listed some other sketchy advertising techniques that are still pretty common nowadays. As he points out, nothing good comes out of making users annoyed. There is, however, certainly a place for paying the bills through corporate sponsorships, intelligently targeted graphical links, and even computer-placed text ads with all their mechanised “Sense”itivity. I just don’t see an enduring economy for so many sites getting into the ad-supported model, to say nothing of the aesthetics in that situation.

Getting away from advertisements

To be honest, it was primarily my aggravation with advertising that led me to start this series. I think the Web, perhaps even the whole Internet, is ripe for an idea that can reduce the motivation for all forms of advertising: everything from the spam kings’ unfeeling e-mail blitzes to Google’s sentient AdWords revenue scheme. In my next post, I hope to finally elucidate such an idea. We’ll be back…

  1. Although Wikipedia’s eye-opening article on television ads notes that commercials have all but become an essential part of the television experience. I can hardly stand television, but I understand a great many people do. Hopefully this diatribe doesn’t come across as overly cantankerous to my more experienced readers.
  2. Do you get those? I keep seeing job site ads with this white-collar worker who was photographed rolling up his sleeve and in other quasi-professional poses, yet is animated to slowly move his legs as if he were proud that they’re put out of joint.

April 17, 2007

Who pays?

written by natevw @ 7:41 pm

You are paying to read this post. In some way, quite direct if you’re on a cell phone and rather indirect if you’re in a public library, you paid money so that you could view this Web page on your screen. This seems fair enough, especially if you are finding valuable info or entertainment [infotainment?] here. But did you know that I am paying for you to read this post, too? To make matters worse, as I continue to make this website more valuable to more people, the cost for me to give them this content will go up. On the Internet, we pay proportional to how much we’re being served, but also how much we are serving others. It wasn’t always this way.

Last fall, I was watching a Cringely interview with Brewster Kahle, the founder of the Internet Archive and other endeavors. He revealed that just before the Web really started taking off, “in the ’80’s and [early] ’90’s…, AOL would charge people about $6.00 an hour to be on their service, and 10% to 15% of that gross revenue went to the publishers that were making the experiences that the people wanted.”(1) He called this a “royalty model” — just like a book, the more popular a site was, the more money it earned its creators. But eventually, those in power realized that they controlled access for enough “eyeballs” that they could start charging the content providers for the attention they attracted. And so today’s Web model was born: guests pay for access, hosts pay for guests and we need to somehow raise enough money to host over 65 million websites.

In addition to the cost of bandwidth, spam has further raised the price of hosting a website. Why should, say, a game company have to take time from answering user questions to fix the tool they use to do so, just because spammers have moved into the boards? Multiply that by countless others, who each put up a website allowing user-feedback and a month or two later are frantically searching for a solution that will keep the crap at bay. The only reason it is affordable is because every single community-oriented site has a share in the snowballing(2) costs.

There are two problems that face community-oriented endeavors on the Web and on the Internet: how to make money, and how to keep the service from abuse. Typically the former is solved by advertising and the latter by advancing technology inspired by artificial intelligence research. I’ve discussed the trouble with the AI-based technology approach in my last two posts, and I intend to discuss the problems with advertising in my next. After that, we should be all set to look at a solution.

  1. A written transcript is available of the entire interview, which is a good view if you’ve got the time.
  2. Someone thinks of a new way to have the computer decide whether a chunk of data is spam or not, hundreds have to figure out how to implement the idea in their respective contexts, and thousands, sometimes millions, in each context have to figure out what’s the best blog comment plugin, what login/captcha system will best serve their guests, and what e-mail program or filtering service won’t junk too many of their friends’ random emails. Add some old-fashioned hand deletion of the ones that still get through, and we’re finally back where we wanted to be, only with an added annoyance or two. There aren’t many people whose time is best spent on this particular problem, but there are plenty who must deal with spam anyway.

April 14, 2007

Fighting evil with things, loving people with robots

written by natevw @ 6:25 pm

Though it’s really not an interest, I’ve been thinking about our response to spam a lot lately ever since I started seeing it as sin in the raw form of the word. It shouldn’t be a surprise, as even many who might laugh at the concept of “sin” describe spam as evil. (The author of SpamKarma, the tool I am currently using to shield you from endless pharmaceuticals, has laughed such laughs.) Thinking of spam as a sinful-human problem instead of a rogue-machine problem can change the way we approach the solution. I have a few more posts prepared related to this particular topic, which in the interest of not overwhelming you with my prolixity, I plan to spread out over the next week or so. Hopefully, talking about spam doesn’t bore you as much as it does me.

I spent the day writing up a solution that I think addresses some of these concerns. Why? I understand computer-based spam filtering has made definite progress. Yet I think AI-based technology will continue to be a battle we can only half win, analogous to many wars the United States has taken on in the last half of a century(1). Guns and computers are both highly inadequate tools when it comes to solving human problems. Yes, we have defended life with weaponry and we have facilitated community with machinery. But it’s not ideal. To think we can discourage spammers with computers is a man-versus-nature story to which most authors(2) would write a different ending.

To believe that technology can save us from spam is just as idolatrous as believing technology can save us from any other consequence of our sinfulness. We’ve relied on technology to do what we can’t for so long that now we are beginning to rely on it to do what we won’t. Our world would rather research and develop a cute and cuddly “Mental Commitment Robot” than spend time with the sick and the elderly. If the pictures at the bottom of that linked page don’t break our hearts as Christians who are supposed to be the power of Christ and the love of God, I don’t know what it will take to wake us from our technicism.

  1. It is not my intent to make light of the current, or any previous, war. I am primarily a programmer. I am not a historian, not a strategist, not God. For the sake of those in the Middle East and for the sake of our troops, I hope and pray that Operation Iraqi Freedom will turn out as just that: political, economic and spiritual freedom. Sadly, I’ve been feeling more and more like one side has gotten us sorely off on the wrong foot, and the other side is intent on not letting any mistakes get fixed regardless of cost. All for what? [Very naughty word] politics. Please consider reminding a soldier of the joyful, wondrous and beautiful sides of the life they fight for, even if you must set aside some cynicism to do so.
  2. Yet it would be a disservice to you as a reader to pretend that everyone accepts the man/nature or the man/machine dichotomy. Much to the contrary. Perhaps this is why technologists get so excited about robots. Feeling they have breathed life into a machine gives them hope that computers can not only solve our homework problems, but also the problems we have on the playground.

April 13, 2007

The wages of sin

written by natevw @ 12:41 am


A recurring scenario in science fiction involves humans making their machines more and more powerful until they are overthrown by them. Well, we are busy filling our online world with better and better Artificial Intelligence — designed to decide what is meaningful and what is not, what is good and what is evil. It seems that not a single open port or a single submittable form on the Internet these days can get away without some sort of AI to determine whether it is being greeted by a friend or foe. As a Christian engineering professor points out, spam is an expected consequence of sin. It should not surprise us that we must struggle with something like rampant spam.

Two approaches

The Internet revolves around two important nouns: bytes and addresses. All around us fly packets of data going from one point to another. The reasons spam is profitable are cheap data and rogue points. It’s efficient to send bytes across the wire and it’s simple to get an IP address. So if we plan to take on spam, then those are the obvious places to focus.


Most bytes aren’t paid for directly. One buys bandwidth — a maximum rate at which bytes can be sent — typically on a subscription basis. How many bytes you get for your buck depends on how close to the limit you feed the pipes(1). A professional spammer buys industrial-strength bandwidth and milks it for all it’s worth. To make spamming less profitable, we could start charging more for bandwidth and the price of each junk e-mail would go up correspondingly.
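For a feel of the arithmetic, here is a small Python sketch of what one e-mail costs its sender under a flat-rate subscription. The function name and every parameter are my own illustrative assumptions (a 30-day month, a link speed in megabits per second, a fraction of the pipe actually used); none of it reflects real pricing.

```python
def cost_per_email(monthly_price: float, mbit_per_s: float,
                   utilization: float, email_bytes: int) -> float:
    """Rough cost of sending one e-mail on a flat-rate link.

    monthly_price: subscription price for the link (any currency)
    mbit_per_s:    advertised bandwidth in megabits per second
    utilization:   fraction of that bandwidth actually used (0 to 1]
    email_bytes:   size of one message on the wire
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    seconds_per_month = 30 * 24 * 3600  # assumes a 30-day month
    bytes_per_second = mbit_per_s * 1_000_000 / 8
    bytes_sent = bytes_per_second * seconds_per_month * utilization
    emails_sent = bytes_sent / email_bytes
    return monthly_price / emails_sent
```

Doubling `monthly_price` doubles the per-message cost, and pushing `utilization` closer to 1 drives it down, which is the "milks it for all it's worth" effect.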

However, that suggestion has a serious flaw. Spam is outgoing data(2). I think charging for outgoing data is abhorrent. The Internet’s current business model is already terribly skewed *against* the content providers(3). Byte-wise, spam is insignificant compared to what businesses like Download.com, YouTube, Google Image Search and the iTunes Music Store demand. If we raise the prices for spammers, we also raise the prices for non-profits like the Internet Archive, Mozilla, Sourceforge and sponsoring universities, Wikipedia, &c &c. Spammers are getting paid, not hoping for donations!


The other obvious way to discourage spam is to tie an identity to each address. If you can trace the source, you can hold it responsible. This is some people’s worst nightmare, some citizens’ bad dream and some lawyers’ bread and butter. Needless to say, that method has privacy concerns that are beyond the scope of this essay. (Read: it might be a good solution but I’m not going there.)

Further drawbacks

Both of these solutions would only provide more incentive for another rearing of sin’s ugly head. While some spammers spend their budget on big pipes, others use it to break into other people’s computers and send spam from there. This can be one organization with a fast connection, or a bajillion Internet Explorer users with normal connections. Increasing the cost of the pipes would only encourage more botnet-building research and development against vulnerable computers(4). I’d rather be stuck next to a shady neighbor with a mega-decibel stereo system, than one who has access to my, and all the neighbors’, volume controls!

Both of the obvious solutions have serious drawbacks. Those into politics are busy debating privacy, power and pricing. Those into programming are engaged in a battle of wits; whether to the death, the pain, or the world getting taken over by robots I can’t say just yet. I eagerly wait for all things to be made new. But in the meantime, I think there is a way we can discourage spam, and I believe my professor is close to the right idea.

Further exploration

The problem is a double case of wrong perspective. As humans, we think of spammers shipping us barges full of toxic waste. In response, we do our best to implement port security. Humans are discerning creatures, so this might work in real life. But for a computer, telling the difference between toxic waste and the sacks of coffee that get us to work every morning is a hard problem.(5) The second perspective issue is much more subtle. When a barge full of dirty bomb material makes it through our port, we fume and feel victimized. We might even feel hate. We’re mad at the barge, we’re mad at the port it came from. We’re mad at our computer because it’s not competent enough to keep our inbox safe. But here is where the analogy breaks down. Spam is not motivated by hatred posing as zeal. Spam is motivated by greed. And capitalism is all about squeezing something good out of greed. I hope to explain in detail how I think we can exploit the tariff model, as well as exploring a number of side-effects, good and bad.

  1. …and whether said bandwidth is actually available or just some imaginary number that a marketing department made up.
  2. from the spammer’s perspective
  3. The better your content, the more bandwidth you will need to buy. This is just as true for non-profit organizations, and one reason even-over content hosting sites like Flickr, YouTube and Blogger are such good deals for the end-user.
  4. I.e., all of them. Vulnerability is a rank, not a switch that can be turned off.
  5. I suspect Bruce Schneier of having a reductionist view of humanness, thus, his paranoia about our nation’s recent security attempts stems from his incredible knowledge of computer security. Of course, there may be reason for concern regarding Motherland Security due to experience with things such as history and human nature!

April 12, 2007

Senior Design

written by hjon @ 10:03 pm

By way of explanation for my lack of posts, here is some information about what I’ve been up to.

As part of my engineering degree, I am in the midst of a Senior Design project with two other engineers. We’re working on building an electric vehicle, using a frame from a Senior Design project done 10 years ago (they built a human-powered vehicle, so ours is intended as a next step). The primary purpose behind these vehicles is to reduce pollution caused by taking short in-town trips (tailpipe emissions are worst at a car’s startup, so a lot of short trips in town can be worse, pollution-wise, than a longer trip that allows the engine to warm up and reach its most efficient state). So these vehicles are intended primarily for commuting purposes.

Ok, I think that’s enough explanation for the time-being (if you want more, ask in the comments, and I’ll try to address it in a future post), but here are some pictures of the vehicle that we used for our basic frame.

Here’s a view of the human-powered vehicle before we took it apart (unfortunately, we had already removed the cargo area, so we don’t have any pictures of that):
Human Powered Vehicle (small)

Here’s a second view:
Human Powered Vehicle 2 (small)

Here is a picture of the frame after we took it apart:
Frame taken apart (small)

Finally, a close-up of the steering mechanism used on the wheels so that they tilt and turn:
Steering mechanism (small)

April 7, 2007

License options for those without legal departments

written by natevw @ 2:40 pm

Jeff Atwood has put together a handy chart of software licenses on his great blog. It lists only a few of all the software licenses known to man, but that’s just the point. What makes the chart especially handy is his choice of columns. Succinct “Source” and “License” headings help narrow a choice down, and the “Clauses” column suggests the amount of legalese you’re in for upon further investigation. It’s almost as helpful as the Creative Commons license builder(1), but for software developers.

For helping programmers to share their own code, the three Microsoft licenses (especially the two which have Open Source Initiative-approved cousins) seem out of place. All the same, I present for completeness a similar summary of the Apple Public Source License, version 2:

  • Source: Open
  • License: Permissive / Weak copyleft
  • Clauses: 13 with abundant sub-clauses
  • Gist: allows proprietary use of unmodified code, with patent and source code caveats on modifications.

Obviously not a great choice for new code(2). It’s a license Apple uses to voluntarily release the kernel code for OS X, and even they don’t use it for all their available sources.

  1. …though if I ever come across (or make) a page that puts the extra CC polish on the process, I’ll let you know.
  2. The wildebeest himself approves your use and contributions to APSL-licensed software, but doesn’t “recommend you…release new software using this license”.

April 2, 2007

Assembly primer

written by natevw @ 8:09 pm

I’ve been doing a lot of C++ coding at work lately, and sometimes the only “source” Xcode can show when debugging a compiled library is assembly code. Could knowing assembly language help debug C++ code? Short answer: no. There are better techniques, and the assembly code often takes me closer to the machine (and the deadline) than I need to be in those instances.

Yet I’ve had a longtime interest in assembly code, for numerous reasons(1). Take this tiny piece of segmentation-faulting assembly:

call *8(%edx)

I had a hunch edx was a processor register (a “variable” of sorts), and sure enough Xcode’s debugger showed a 0 in the EDX register. “Call” then seems to say I’m trying to execute code by way of a bad pointer that got written into EDX. But what’s with the “*8”, and does the ‘%’ mean anything special? Enter a great tutorial on AT&T Assembly Syntax, which happens to be what the GNU toolset, and therefore Xcode, uses(2). From that, we see that the percent sign is just a sigil that prefixes a register name.

The “*8(” part is a bit trickier. Under “Memory Addressing”, we see that a memory operand takes the form of “segment-override:signed-offset(base,index,scale)”. Don’t ask me what all that is, but it seems in our case we can simplify that down to “signed-offset(base)”. Lower down, we see that “Branch addressing using registers or memory operands must be prefixed by a ‘*’”. So it appears that this instruction says: look 8 bytes past the address in the EDX register, read the address stored there, and call the code at that address. Cool!(3)
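As a sanity check on that reading, here is a toy Python model of the address arithmetic. It is only a sketch under my own assumptions: the names `effective_address`, `indirect_call_target` and `SegmentationFault` are made up, memory is a plain dict of address-to-value slots, and the segment override is ignored. As I understand the syntax, the operand works out to base + index × scale + offset, and the starred call then reads a pointer from that location before jumping to it.

```python
class SegmentationFault(Exception):
    """Stand-in for the kernel stopping a read of unmapped memory."""


def effective_address(offset: int = 0, base: int = 0,
                      index: int = 0, scale: int = 1) -> int:
    """Address named by offset(base,index,scale), wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + offset) & 0xFFFFFFFF


def indirect_call_target(memory: dict, registers: dict,
                         offset: int, base_reg: str) -> int:
    """Model `call *offset(%base_reg)`: fetch the pointer, return the target.

    `memory` maps addresses to the pointer-sized values stored there.
    """
    slot = effective_address(offset=offset, base=registers[base_reg])
    if slot not in memory:
        raise SegmentationFault(f"cannot read 0x{slot:x}")
    return memory[slot]
```

With EDX holding a sensible address such as 0x2000, the call looks in slot 0x2008 and jumps wherever that slot points. With EDX holding 0, the lookup lands on 0x8, nothing is mapped there, and the toy raises `SegmentationFault`, much like the crash I was chasing.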

It’s still on my to-read list, but if assembly language is interesting, you might want to check out Paul A. Carter’s PC Assembly Language free PDF book, geared towards assembly language from C. Let me know how it goes!

  1. …from the days when the processor was closer, and just wanting to know how it worked, as well as wanting to write über-optimized code, modify my GPS’s firmware and make things with embedded processors…with only an assembler and raw coder-manliness! Now that Apple uses x86 on their desktop machines, that’s all the more reason to learn.
  2. To fully decode Assembly Language, you’ll also need a mnemonic reference for your architecture. The sig9 tutorial uses IA-32, which has a good chance of being what you’re using.
  3. Or in our case, not so cool. Since the EDX register contains 0, this would try to read the code’s address from memory location 0x8, which isn’t ours to read. Thankfully, the kernel detects an address this messed up and puts a quick stop to the program. However, most of those worrisome “arbitrary code execution” security holes which Microsoft was particularly good at work using similar unexpected-address calls: a cracker finds a way to a) put some machine code into memory, and b) put a “call” into the list of upcoming instructions that will run said machine code.