Consumers and businesses are sharing more data on resource use and carbon emissions with third parties, the government, and each other. Privacy (and the expectation of privacy) is a key issue. From a consumer perspective, as discussed in the California Progress Report:
What are the unintended consequences of such a system? Personal privacy issues routinely arise when data collected is harmless in isolation, but becomes a threat when combined with other data, or examined by a third party for patterns. A few principles we should keep in mind as we develop a regulatory framework for such a transition will be consumer control, transparency, and accountability.
In particular: How much information should we give up to the grid? Should it be up to the customer to decide? If not, who gets access to that information, for what reason, and what will they be allowed to do with it? How will this information be managed (i.e. how long stored?)? And how well will it be protected from those that might seek it unlawfully? Can it even be fully protected given the increasing success and technical expertise of hackers?
Because technological innovation will only accelerate, we would do well to consider more than simply the immediate privacy threats posed by current technologies, but also what we know to be just around the corner.
For instance, while the tracking of mere energy usage in one’s home may be of less concern, as home devices become increasingly “smarter”, one can easily envision a technology convergence in which a myriad of gadgets could be used to track more sensitive information. Security technology already exists to monitor presence in homes to detect break-ins.
What else will smart appliances “tell” others about what we do, and when we do it, in our homes?
AMEE uses one approach: a random key that identifies a particular organization without revealing identifying information. In some circumstances, however, collected data can be combined with data from third parties to re-identify organizations (see the re-identification of users in the Netflix Prize dataset). I’ve been working with a colleague on a new algorithm for anonymizing statistical databases, that is, changing them to prevent such re-identification while retaining the information contained in the data, such as the average of a variable or an estimated regression coefficient. Simulation results demonstrate that the approach works well in several application contexts (Melville, N., and McQuaid, M., 2010, “Generating Shareable Statistical Databases for Business Use”).
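To make the re-identification risk concrete, here is a minimal Python sketch of the random-key idea and of a simple linkage attack against it. This is not AMEE's actual scheme and not the algorithm from the paper cited above. The organization names, the field names (`sector`, `region`, `kwh`), and the helper functions `pseudonymize` and `link` are all hypothetical, made up for illustration.

```python
import secrets

# Hypothetical emissions records; every name and number is invented.
RECORDS = [
    {"org": "Acme Steel", "sector": "steel", "region": "MI", "kwh": 120000},
    {"org": "Birch Bakery", "sector": "food", "region": "MI", "kwh": 8000},
    {"org": "Cedar Foods", "sector": "food", "region": "MI", "kwh": 9500},
]

# Hypothetical third-party data, e.g. a public business directory.
PUBLIC_DIRECTORY = [
    {"org": "Acme Steel", "sector": "steel", "region": "MI"},
    {"org": "Birch Bakery", "sector": "food", "region": "MI"},
    {"org": "Cedar Foods", "sector": "food", "region": "MI"},
]


def pseudonymize(records):
    """Replace each organization name with a random key.

    Returns the released rows and the private name-to-key mapping.
    """
    keys = {}
    released = []
    for rec in records:
        if rec["org"] not in keys:
            keys[rec["org"]] = secrets.token_hex(8)
        released.append({
            "key": keys[rec["org"]],
            "sector": rec["sector"],
            "region": rec["region"],
            "kwh": rec["kwh"],
        })
    return released, keys


def link(released, directory):
    """Linkage attack: join on the quasi-identifiers (sector, region).

    A key is re-identified when exactly one directory entry matches.
    """
    index = {}
    for row in directory:
        index.setdefault((row["sector"], row["region"]), []).append(row["org"])
    guesses = {}
    for rec in released:
        candidates = index.get((rec["sector"], rec["region"]), [])
        if len(candidates) == 1:
            guesses[rec["key"]] = candidates[0]
    return guesses


released, keys = pseudonymize(RECORDS)
guesses = link(released, PUBLIC_DIRECTORY)
for key, org in guesses.items():
    print(f"key {key} is probably {org}")
```

In this toy data the only steel producer in the region is uniquely identified even though its name was never released, while the two food companies remain ambiguous because they share the same sector and region. The random key protects the name but not the combination of attributes published alongside it.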