Popcorn
Scalable and private media consumption with Popcorn
Trinabh Gupta / Natacha Crooks / Srinath Setty / Lorenzo Alvisi / Michael Walfish
UT Austin / MPI-SWS / Microsoft Research / NYU
Created by Keylor Chaves / @keylorch
MPI-SWS: Max Planck Institute for Software Systems
Scenario
1. Hides it from eavesdroppers and the content distributor itself.
2. Make it affordable, even at scale. They should result in no more than a small multiple of what customers pay to access content today
2. Respect current controls on content dissemination
Private Information Retrieval (PIR)
k servers
Protocols that allow clients (content consumers) to request content from one or more servers (content distributors) without them inferring which items the clients requested.
A PIR protocol is structured around three procedures: Query, Answer, and Decode.
Popcorn Strategy
Combines two types of PIR: ITPIR and CPIR
Requests batching
Offline Work
Information-theoretic PIR (ITPIR)
Multi-server PIR protocols. k > 1.
Each k needs to belong to different administrative domains
Chor-Goldreich-Kushilevitz-Sudan (CGKS) ITPIR Protocol
(CGKS) ITPIR Protocol
k: number of serversn: size of the library
L: Libraryl: size (bits) of each object in L
(CGKS) ITPIR Protocol
k: number of serversn: size of the library
L: Libraryl: size (bits) of each object in L
(CGKS) ITPIR Protocol
k: number of serversn: size of the library
L: Libraryl: size (bits) of each object in L
Computational PIR (CPIR)
Commonly constructed using additively homomorphic cryptosystems. k = 1
k: number of serversn: size of the library
L: Libraryl: size (bits) of each object in L
Computational PIR (CPIR)
k: number of serversn: size of the library
L: Libraryl: size (bits) of each object in L
Computational PIR (CPIR)
k: number of serversn: size of the library
L: Libraryl: size (bits) of each object in L
Challenges of using PIR
ITPIR
- Requires multiple non-colluding servers
- All servers combined must compute over n objects to serve one on them on average
CPIR
- Requires expensive cryptographic operations
- The server must load and process all n objects in L
Each query induces O(n) more work
The fastest known CPIR protocols [8] are approximately 10-100× slower than ITPIR protocols
Main issue: they are not able to perform at scale. ITPIR (§2.4) requires multiple non- colluding servers, and thus multiple administrative domains, which means library content would have to disseminate beyond its original distribution channel, in apparent conflict with the requirement of respecting existing controls on dissemination
Popcorn Architecture
each media file is split into segments, and the library is partitioned into columns
A segment is a variable-sized contiguous piece of a media
A column is the union of each of the corresponding segments for the n objects in the library (each is presumed to have the same decomposition into segments—which may require padding objects
Popcorn Architecture
A primary content distributor creates an encrypted version of the library, LEnc, using per-object keys, and replicates LEnc to secondary content distributors, each in separate administrative domains. The primary content distributor maintains a key server, and each secondary content distributor maintains an object server
The key server serves the per-object keys using CPIR, and the object servers deliver encrypted objects using IT- PIR.Content Retrieval Protocol
A client retrieves the object's decryption key from the key server using CPIR
Content Retrieval Protocol
Retrieves the encrypted object (via segments/chunks) from the object servers using ITPIR
Composing ITPIR and CPIR
- CPIR is not a performance bottleneck (small keys)
- Current controls on content protection are respected (m is stored only at the primary content distributor)
Pending: Overhead of ITPIR: uses XOR, but needs to read n/2 segments on average
Batching
- The server can amortize the cost of reading a column over a batch of queries
-
q . L can be replaced by Q . L where Q is a matrix whose rows are query vectors
Specializing batching for media delivery
- A client's initial delay is given only by the time it has to wait before it can download the first segment.
- Other segments are not needed until later and can afford higher processing times.
- We need to choose, for each column, the longest possible processing cycle, such that users perceive a low initial delay d and that the movie is played back smoothly
Popcorn Architecture
each media file is split into segments, and the library is partitioned into columns
A segment is a variable-sized contiguous piece of a media
A column is the union of each of the corresponding segments for the n objects in the library (each is presumed to have the same decomposition into segments—which may require padding objects (§8));
Moving work offline
Limitation: Given that d is small, the batch sizes for these columns are also small, and the per-request I/O is high
- Only qk depends on b
- We can compute the reply to the first k - 1 query vectors offline
Moving work offline
First time Popcorn users: Generate a configurable number m of random query vectors, which are sent to the offline server.
The server precomputes (and stores locally) replies for these query vectors.
Evaluation results
-
The per-request dollar cost in Popcorn is less than three times the per-request dollar cost in a system without privacy.
- 80% of the per-request cost in Popcorn is the cost of transferring data over the network
- Popcorn requires large objects and many concurrent clients to effectively reduce costs.
Popcorn
Scalable and private media consumption with Popcorn
Trinabh Gupta / Natacha Crooks / Srinath Setty / Lorenzo Alvisi / Michael Walfish
UT Austin / MPI-SWS / Microsoft Research / NYU
Created by Keylor Chaves / @keylorch
MPI-SWS: Max Planck Institute for Software Systems
- Updates to the library
- Variable size objects
- Variable object quality and client bandwidth
- More complex pricing models
- Targeted ads and recommendations
Popcorn
Scalable and private media consumption with Popcorn
Trinabh Gupta / Natacha Crooks / Srinath Setty / Lorenzo Alvisi / Michael Walfish
UT Austin / MPI-SWS / Microsoft Research / NYU
Created by Keylor Chaves / @keylorch
MPI-SWS: Max Planck Institute for Software Systems