a glob of nerdishness

January 12, 2008

Bookmarking with del.icio.us: Now I’m a believer.

written by natevw @ 6:08 pm

I’m almost embarrassed to admit it, but I just registered at del.icio.us, the socially/taggy bookmarking site. Why it took so long, I don’t know. Years ago I heard about it but wrote it off as some fad. Why would I want all my bookmarks online anyway?

Since then I’ve switched operating systems (which to my mind happened in the distant past!) and switched browsers and gotten my first modern laptop. I’ve also begun to realize what information overload means, but still don’t want to lose recollection of great sites that might have taken a bit of serendipity to notice in all the noise.

I had already given up using the built-in browser bookmark folders. (I do use the bookmarks *toolbar* for more frequent destinations.) I’ve got no idea how to categorize something as chaotic as my web surfing in the first place, and any method I pick just gets so messy. The static nature means that things stay in place long past relevancy, and I’m bad at knowing when that time has come. Now that I’m using both a laptop and a desktop they would never be on the right machine anyway. Not to mention the quandary that came about when I started this blog: “Do I have anything worthwhile to say about this link, or must I keep it to myself?”

Del.icio.us seems to tackle all those problems. When I bookmark a link, I’m also sharing it. Because the bookmarks are stored centrally, I can access them from any machine I’m on. (As a friend pointed out, “the cloud” is a great place for bookmarks since you need web access to use them anyway.) Like a blog, there is a timeline element so that recent bookmarks are up top while older ones can fade away. To top it off, the semi-habitual yet often-random nature of web safari trophies is an excellent application for tagging.

I hope to eventually make this blog format less disgusting. When I find the time to take the ugly out, integrate with the rest of the site and organize to suit my personality — someday, someday — I hope to include this new discovery into the mix. Until then, feel free to follow my bookmarks as I settle in to a new groove.

Visual Basic makes you Dim

written by natevw @ 4:54 pm

It’s been a kinda rough week realizing that Cocoa’s power does not lie in its learnability. But I’d learn it any day over Visual Basic.

When learning Objective-C and its joined-at-the-hip framework, I can do a couple quick Google searches and find Apple’s official reference, an informative third party introduction, great guides like this and this, and many other quality references and sample code.

But with Visual Basic, any “answer” I find is glossed over in some script kiddie forum, and seemingly missing (or at least unfindable) in any of Microsoft’s manuals. What is this “Set” keyword? I gather it’s deprecated, but knowing would sure make figuring out old code easier… I tried and tried and tried to find a one-stop reference that would teach me the overall language so I could know what I had to work with. [Emphasis on "had to".] Mostly I just came up with basic “how to make an Excel macro” type stuff. I eventually found a wikibook, but even that left me wanting.

Eventually I did find What Set means in Visual Basic, tucked in an article. But even if all the various Cocoa concepts make me feel like I’m just playing whack-a-mole, at least I know a) what concepts there are, and b) where to get reliable information on each when I need it.

January 10, 2008

Playing with Mail and Leopard’s Latent Semantic Mapping

written by natevw @ 6:35 pm

While clearing Mail.app’s junk mail folder, I might have accidentally deleted a non-spam message. I’ll never know for sure, but as a result I learned a bit more about Leopard’s cool new Latent Semantic Mapping framework that I’d been wondering about since the mailing list leaked back in November.

Mail stores its spam information in ~/Library/Mail/LSMMap2. In addition to the Latent Semantic Mapping framework, Apple also provides lsm, a command-line utility that provides the same functionality (with a little better documentation, I might add). As described in the man page, you can use lsm dump ~/Library/Mail/LSMMap2 to get a list of all the words that Mail’s spam filter knows about. (Some words probably NSFW, of course!) The first column is how many times the word has appeared in a “Not Junk” message, and the second is the count in spam messages. The last line gives a few overall statistics: how many zero values there were versus total values, and a “Max Run” value I don’t understand.
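
For anyone who wants to poke at that dump, here is a minimal Python sketch that ranks words by how spammy they look. Big assumption: it treats each word row as two whitespace-separated integer counts followed by the word, which is only my reading of the column description above, not a documented format. The names WordCounts, parse_dump and spammiest are made up for illustration.

```python
from typing import NamedTuple


class WordCounts(NamedTuple):
    word: str
    not_junk: int  # first column: appearances in "Not Junk" messages
    junk: int      # second column: appearances in spam messages


def parse_dump(text: str) -> list[WordCounts]:
    """Parse rows shaped like '<not-junk count> <junk count> <word>'.

    ASSUMPTION: this layout is a guess based on the description of the
    dump's columns. Rows that do not fit it (such as the trailing
    statistics line) are skipped rather than interpreted.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        try:
            rows.append(WordCounts(parts[2], int(parts[0]), int(parts[1])))
        except ValueError:
            continue  # the first two fields were not integer counts
    return rows


def spammiest(rows, limit=10):
    """Return up to `limit` words, ordered by the share of appearances in spam."""
    def spam_share(row):
        total = row.not_junk + row.junk
        return row.junk / total if total else 0.0
    return sorted(rows, key=spam_share, reverse=True)[:limit]
```

To try it, save the output of lsm dump ~/Library/Mail/LSMMap2 to a text file, read that file into a string, and pass it to parse_dump. If the real rows turn out to be laid out differently, only the indexing inside parse_dump should need to change.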

Between this and CFStringTokenizer and its language-guessing coolness, Leopard provides some fun tools for playing around with text analysis. Hopefully someday I’ll have a bit more time to dig into it.

Until then (or rather, for my future reference), here’s a bit more information on Latent Semantic Analysis: how it works and how it differs from Bayesian classification.

I’ve also uploaded a really quick and dirty “playground” for testing out the hypotheses the documentation left me with: lsmtest.m

Update: Came across a more “explainy” article about both Mail and Latent Semantic Analysis over on macdevcenter.com.