Do you really need to know?

As a developer I have a large number of side projects, many of which will never get finished, and recently I have been thinking a lot about the people who use them. More to the point, I have been thinking about how I don't care who uses them. It's a strange, counter-productive model for an entrepreneur not to want to know their target audience, but I think I draw the line right there, at 'market'. By definition, a market is a place where buyers and sellers meet to exchange goods and services, and that is all I need to know: that my market exists. I don't need to know who they are, just that they are there and have the potential to benefit from my service.

This doesn't mean you don't need data. It's useful to see which features are used most, what your error rates are, and perhaps when the most active time period is so you know not to schedule downtime during it. If we're being honest, that's it. Personally, I like to collect as much data as possible about my product, but not about the people using it, and I think it's good to define this difference clearly. I don't care that John White from the US, who is 34, uploaded a new image, but I do care whether that image was uploaded successfully, so that's all I'll check. Debugging is probably slightly harder because I don't save the photo, but error messages are useful, and 99% of the time it's probably a logic error or an unusual use case that I didn't plan for, nothing to do with the image itself. For that 1%, I'll see if I can reproduce it. If not, oh well, it's 1%.
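To make the product-versus-people split concrete, here is a minimal sketch in Python of what that kind of collection could look like. The `ProductMetrics` class, its method names and the `image_upload` feature name are all hypothetical, made up for illustration, and this is not the code behind any real service. The point is only that the counters have no slot for a name, an age, an IP address or the file itself.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProductMetrics:
    """Hypothetical aggregate counters about the product, not its users."""

    feature_uses: Counter = field(default_factory=Counter)
    errors: Counter = field(default_factory=Counter)
    active_hours: Counter = field(default_factory=Counter)

    def record(self, feature, ok, error=None, when=None):
        """Count one use of a feature and whether it succeeded."""
        when = when or datetime.now(timezone.utc)
        self.feature_uses[feature] += 1
        # Only the hour of day is kept, which is enough to plan downtime.
        self.active_hours[when.hour] += 1
        if not ok:
            # Store the exception type only; messages may echo user input.
            kind = type(error).__name__ if error is not None else "Unknown"
            self.errors[(feature, kind)] += 1

    def error_rate(self, feature):
        """Fraction of recorded uses of a feature that failed."""
        uses = self.feature_uses[feature]
        failed = sum(n for (name, _), n in self.errors.items() if name == feature)
        return failed / uses if uses else 0.0
```

An upload handler would call something like `metrics.record("image_upload", ok=True)` on success and pass the caught exception on failure. Nothing about who uploaded, or what they uploaded, ever reaches the counters.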

What if I have a big project and that 1% works out to 100 errors a day? In all honesty, if that's the case, it should be easy to reproduce. Don't invade people's privacy for the what-ifs; ask for permission when you need to collect the data.

On a side note, letting users find their data is also important. I go as far as telling them who I rent my server from and its physical location. I do this in what I would describe as an unusual way: I point all of my IP addresses to a simple HTML page with a lot of information. One example is the reverse IP lookup domain 5-39-70-146.ipv4.postal-service.org. The page lists what my server is used for and how users can access their data stored on it, regardless of the policies of other websites or services using the server, along with other things such as backup retention times and contact information. I even list the number of people who have sudo privileges. If you're asking why I let a user be deleted from a service even if that isn't stated in the service's policy, it's because they have a legal right to deletion in a number of places, and I personally think it's a human right.

Hopefully this has sparked some thoughts about how you collect data, and how easy you make it for your users not only to access their data, but also to find out how it's used.

Stay safe,

👋 Bailey