Notes from the 8/24/2022 meeting of TFF collaborators

  • Sparse tensor support in TFF:
    • EW - We have Keras models that we want to port to TFF, and they contain sparse tensors
      • Simply mapping to dense tensors results in unacceptable memory cost and slowness in our use case, so we’re looking to avoid that
    • ZG on existing sparse tensor support in TFF
      • Issues mentioned on GitHub mostly related to tf.data.Dataset
      • Mostly works otherwise, but requires some DIY, particularly w.r.t. aggregations: we can’t just naively do a sparse sum over the triple of constituent tensors, as that wouldn’t have the desired outcome (see the first sketch at the end of this section)
    • (question about relative importance)
    • EW - this is not blocking for us, but a good performance/resource-footprint optimization
    • ZG - with respect to the GitHub issues, might work around by hiding the dataset inside the TFF computation, so it’s not part of the input-output boundary (see the dataset sketch at the end of this section)
    • KO - clarifying that our “it mostly works” comment refers to the common practice of representing/handling sparse tensors as tuples of dense tensors. Have you tried representing sparse tensors as tuples of dense tensors for dataset usage as well?
      • EW - haven’t tried yet
    • KO - sparse in this conversation has come up in two places - for model parameters, but also for sparse input data - are both equally important?
      • EW - would ideally have both
    • KO - one action item for Ewan: try working with tuples of dense tensors that represent the constituent parts.
    • KO - this still leaves a question about better APIs/helpers for sparse tensor handling, but can unblock this particular use case. Thoughts on the API?
    • EW - ideally this could just be transparent (the customer using TFF wouldn’t need to do anything special for sparse, and it would just work)
      • KO, ZG - in some cases, it’s not obvious, e.g., for aggregation - there’s potentially more than one way to aggregate the constituent parts of sparse tensors, a choice ideally to be made by the customer
      • KR - probably having a small family of dedicated “sparse sum” symbols is most actionable
      • KO - perhaps we can start by prototyping the version of sparse sum needed by EW and upstreaming it to TFF as a generic sparse sum operator to seed this, and build on that (to follow up offline - maybe on Discord; see the aggregation sketch at the end of this section)
      • EW +1
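    • For reference, a minimal sketch of the “tuples of dense tensors” representation discussed above, and of why a naive element-wise sum over the triple goes wrong - plain TensorFlow only, with made-up example tensors:
      ```python
      import tensorflow as tf

      # A tf.SparseTensor decomposes into three dense constituent tensors:
      # indices [N, ndims], values [N], and dense_shape [ndims]. It is this
      # triple that can be carried across the TFF boundary as dense tensors.
      st = tf.sparse.SparseTensor(
          indices=[[0, 1], [2, 3]], values=[1.0, 2.0], dense_shape=[4, 4])
      parts = (st.indices, st.values, st.dense_shape)

      # Reconstruction on the other side of the boundary:
      rebuilt = tf.sparse.SparseTensor(*parts)

      # Why a naive sum over the triple is wrong: element-wise addition of
      # two `indices` tensors yields meaningless coordinates. A real sparse
      # sum has to merge coordinates, e.g. via tf.sparse.add:
      a = tf.sparse.SparseTensor([[0, 0]], [1.0], [2, 2])
      b = tf.sparse.SparseTensor([[1, 1]], [3.0], [2, 2])
      summed = tf.sparse.add(a, b)  # 2 nonzeros, not a garbled triple
      ```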
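    • A minimal sketch of ZG’s dataset workaround, assuming the classic tff.tf_computation decorator; the function name and type specs are illustrative:
      ```python
      import tensorflow as tf
      import tensorflow_federated as tff

      # Only plain dense tensors cross the input-output boundary; the
      # tf.data.Dataset is constructed inside the TF computation, so no
      # dataset (or sparse tensor) appears in the type signature.
      @tff.tf_computation(tf.TensorSpec([None], tf.float32))
      def sum_values(values):
        ds = tf.data.Dataset.from_tensor_slices(values)
        return ds.reduce(0.0, lambda acc, v: acc + v)
      ```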
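    • And a rough sketch of what a prototyped sparse sum aggregation could look like - entirely illustrative (DENSE_SHAPE, densify, and sparse_sum are invented names), and densifying per client is just one of the possible aggregation semantics mentioned above; a production version would presumably avoid tf.sparse.to_dense given EW’s memory constraints:
      ```python
      import tensorflow as tf
      import tensorflow_federated as tff

      DENSE_SHAPE = [4, 4]  # assumed fixed up front so the sum is well-typed

      @tff.tf_computation(tf.TensorSpec([None, 2], tf.int64),
                          tf.TensorSpec([None], tf.float32))
      def densify(indices, values):
        st = tf.sparse.SparseTensor(indices, values, DENSE_SHAPE)
        return tf.sparse.to_dense(tf.sparse.reorder(st))

      @tff.federated_computation(
          tff.type_at_clients((tf.TensorSpec([None, 2], tf.int64),
                               tf.TensorSpec([None], tf.float32))))
      def sparse_sum(client_parts):
        # Densify each client's (indices, values) pair, then sum elementwise.
        return tff.federated_sum(tff.federated_map(densify, client_parts))
      ```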
  • Jeremy’s proposal, continuing from 2 weeks ago:
    • TFF Tech Note: Client initiated connections
    • (todo for all to review it later as it was just shared shortly before the meeting)
    • (Jeremy is presenting)
    • JL - proposing the “task store” abstraction for exchanging requests between a “Cloud” and the per-client executors (e.g., in browsers), with the latter pulling tasks from a centralized “task store”. Has something like this been considered in any other context? (a rough interface sketch appears at the end of these notes)
    • KR - yes, in failure handling scenarios
      • Hairier problems there, though - state transfer across executors is difficult; not sure how much carries over to the scenario presented by Jeremy
    • HV - can the executors in the leaves be stateless?
      • JL - this would make it more like the SysML paper on cross-device
    • (question about performance in this scenario, compared to bi-directional streaming in a way that more closely resembles the native TFF protocol)
    • JL - ack that there are latency considerations
    • also, bi-directional streaming isn’t supported by some transports, so it’s not always a viable option
    • (ran out of time)
    • (to be continued in 2 weeks - first point of agenda for the next meeting, Jeremy will join)
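    • For reference, a minimal sketch of the pull-based “task store” idea as presented - entirely hypothetical (TaskStore and its methods are invented names, not taken from the tech note):
      ```python
      import queue
      import uuid

      class TaskStore:
        """Hypothetical sketch of a pull-based task store: the Cloud
        enqueues work, per-client executors (e.g., in browsers) poll for
        it and post results, so no server-initiated connection is needed."""

        def __init__(self):
          self._pending = queue.Queue()
          self._results = {}

        def submit(self, payload):
          # Called by the Cloud side to enqueue a task for some client.
          task_id = str(uuid.uuid4())
          self._pending.put((task_id, payload))
          return task_id

        def claim(self, timeout=None):
          # Called by a per-client executor to pull the next task.
          return self._pending.get(timeout=timeout)

        def post_result(self, task_id, result):
          # Called by the per-client executor once the task is done.
          self._results[task_id] = result

        def result(self, task_id):
          # Polled by the Cloud side to retrieve a finished result.
          return self._results.get(task_id)
      ```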