How much information to collect from users

I’m starting to roll out a “check online for updates” feature for my various applications. So far, it’s implemented for TagAssist and Count Anything, and I’m gradually adding it to my other applications as I upgrade them.

So I think this is a good time to review my policy on collecting information from users. Right now, I use Google Analytics to track visitors to my site: things like what pages they visit, how long they stay on the site, what files they download, and what keywords they used to find the site. I don’t track individual users, only trends that help me improve the site (for example, if a lot of visitors are searching for the keyword “PowerPoint,” I ought to add some content about translating PowerPoint files). I also never share this information with third parties (this is detailed in this site’s privacy policy).

I think that it’s pretty reasonable to collect this information, especially because I don’t track any individuals. At any rate, almost all the information I get through Google Analytics (and more) would be available from my Apache log files anyway.

But what about checking online? Even if Felix doesn’t send any information, the mere fact of connecting to my server tells me that somebody is running my software, and from the user’s IP address I could learn a lot more. For example, I could match that IP address against the IP addresses of people who have downloaded the software and, presto, get a download-to-install ratio.
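For illustration only, here is a minimal Python sketch of how that kind of IP matching could work against Apache access logs. It assumes the standard common/combined log format, and the function names and the `/downloads/` and `/updates/` path prefixes are hypothetical, not the actual paths on my server. Because IP addresses are often shared or reassigned, the result would be a rough estimate at best.

```python
import re

# Matches the start of an Apache common/combined log line:
# client IP, identity, user, [timestamp], "METHOD path ..."
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD) (\S+)')


def ips_requesting(log_lines, path_prefix):
    """Return the set of client IPs that requested a path starting with path_prefix."""
    ips = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(path_prefix):
            ips.add(match.group(1))
    return ips


def install_rate(log_lines, download_prefix, update_prefix):
    """Share of downloading IPs that also appear in update-check requests.

    Returns None when the log contains no downloads at all.
    """
    log_lines = list(log_lines)  # allow iterating twice over a generator
    downloaders = ips_requesting(log_lines, download_prefix)
    checkers = ips_requesting(log_lines, update_prefix)
    if not downloaders:
        return None
    return len(downloaders & checkers) / len(downloaders)
```

Calling `install_rate(open("access.log"), "/downloads/", "/updates/")` would give the fraction of downloading addresses that later checked for updates (the log file name is likewise an assumption).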

So collecting that information could be useful to me, and it wouldn’t violate my privacy policy. Even so, I’ve decided not to do it, because my users are checking online for updates: they’re not connecting to my server in order to feed me statistics, and being tracked is not something they would reasonably expect.

Some other software makers are quite aggressive about “capturing” user information. Many will force you to give an email address before even allowing you to download their software, or make you contact them in order to get a price. They call people like me foolish for not grabbing every “lead” we can. I strongly suspect that most such companies are run by graduates of marketing or business programs, not by software developers.

But to me, it’s not about what you can do, or what will earn you the most money in the short term, or even what you can get away with. I prefer to be as open and transparent about my activities as possible, and if some action strikes me as sleazy or shady, I’d rather just avoid it.