STRM @ Data Council Austin: data streams for the privacy age

Consumers around the world are asking for better data privacy. Your legal team needs you to fix data privacy yesterday. You just want to build better data products. But how do you make sure you can actually use your data amid all those demands?

Viewing time: about 15 minutes.

In March, we attended Data Council in Austin and delivered a lightning talk on building Privacy by Design data systems.

We shared a highly condensed version of the thinking behind our IEEE paper on Privacy by Design and gave a quick demo that uses a clickstream dataset from Kaggle.

Data in, privacy out

In the demo, we replay that data in a simulator and define a privacy stream: our concept of a data pipeline that receives raw data and outputs a data stream that is filtered and transformed according to a data contract. We then use that privacy stream, which is anonymized in real time, to compute pairs of pages visited by users in the set, and visualize those pairs in a force-directed graph.
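To make the idea a bit more concrete, here is a minimal sketch in Python of what such a pipeline could look like. It is not our actual implementation or API: the contract layout, the field names (`session_id`, `email`, `page`), and the helper functions are all hypothetical, and a keyed hash stands in for whatever anonymization the real stream applies. The sketch drops every field the contract does not list, masks the session identifier, and then counts pairs of pages seen within the same masked session.

```python
import hashlib
import hmac
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical data contract: fields to pass through and fields to mask.
# Anything not listed (for example "email") is dropped from the output.
CONTRACT = {
    "keep": ["page"],
    "mask": ["session_id"],
}


def apply_contract(event, contract, key):
    """Return a new event holding only contract fields, with masked fields hashed."""
    out = {field: event[field] for field in contract["keep"]}
    for field in contract["mask"]:
        # Keyed hash as a stand-in for the real anonymization step (assumption).
        digest = hmac.new(key, str(event[field]).encode(), hashlib.sha256).hexdigest()
        out[field] = digest[:16]
    return out


def page_pairs(events):
    """Count unordered pairs of distinct pages seen within the same masked session."""
    pages_by_session = defaultdict(set)
    for event in events:
        pages_by_session[event["session_id"]].add(event["page"])
    pairs = Counter()
    for pages in pages_by_session.values():
        pairs.update(combinations(sorted(pages), 2))
    return pairs


def demo():
    """Run a tiny made-up clickstream through the contract and count page pairs."""
    raw = [
        {"session_id": "s1", "email": "a@example.com", "page": "/home"},
        {"session_id": "s1", "email": "a@example.com", "page": "/pricing"},
        {"session_id": "s2", "email": "b@example.com", "page": "/home"},
        {"session_id": "s2", "email": "b@example.com", "page": "/pricing"},
        {"session_id": "s2", "email": "b@example.com", "page": "/docs"},
    ]
    key = b"rotate-me"  # illustrative key only
    stream = [apply_contract(event, CONTRACT, key) for event in raw]
    return stream, page_pairs(stream)


if __name__ == "__main__":
    _, pairs = demo()
    for (page_a, page_b), count in pairs.most_common():
        print(page_a, page_b, count)
```

The resulting pair counts are the kind of input you could use as edge weights in a force-directed graph, with pages as nodes.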

Utility + privacy

The demo shows how you can retain data utility without sacrificing privacy. As we point out in the talk, this is the kind of pipeline that could feed feature computation for advanced personalization (like a next-best-action model) in sensitive domains. It also lets you offer this kind of advanced (and valuable!) customer experience to all your users, regardless of data usage consent. The stream is fully anonymous, after all.

The video has just been posted by Data Council. Microwave some popcorn and enjoy!

We hope you enjoy the talk. Reach out if you have any questions!

