Notes from the 8/24/2022 meeting of TFF collaborators

  • Sparse tensor support in TFF:
    • EW - We have Keras models that we want to port to TFF, and they contain sparse tensors
      • Simply mapping to dense tensors results in unacceptable memory cost and slowness in our use case, so we’re looking to avoid that
    • ZG on existing sparse tensor support in TFF
      • Issues mentioned on GitHub mostly related to tf.data.Dataset
      • Mostly works otherwise, but requires some DIY, particularly w.r.t. aggregations: we can’t just naively do a sparse sum over the triple of constituent tensors, as that wouldn’t have the desired outcome (see the first sketch at the end of this section)
    • (question about relative importance)
    • EW - this is not blocking for us, but a good performance/resource-footprint optimization
    • ZG - with respect to the GitHub issues, might work around by hiding the dataset inside the TFF computation, so it’s not part of the input-output boundary (see the dataset sketch at the end of this section)
    • KO - clarifying that our “it mostly works” comment refers to the common practice of representing/handling sparse tensors as tuples of dense tensors. Have you tried representing sparse tensors as tuples of dense tensors for dataset usage as well?
      • EW - haven’t tried yet
    • KO - sparse in this conversation has come up in two places - for model parameters, but also for sparse input data - are both equally important?
      • EW - would ideally have both
    • KO - one action item for Ewan: try working with tuples of dense tensors that represent the constituent parts.
    • KO - this still leaves a question about better APIs/helpers for sparse tensor handling, but can unblock this particular use case. Thoughts on the API?
    • EW - ideally this could just be transparent (the customer using TFF wouldn’t need to do anything special for sparse, and it would just work)
      • KO, ZG - in some cases, it’s not obvious, e.g., for aggregation - there’s potentially more than one way to aggregate the constituent parts of sparse tensors, a choice ideally to be made by the customer
      • KR - probably having a small family of dedicated “sparse sum” symbols is most actionable
      • KO - perhaps we can start by prototyping the version of sparse sum needed by EW and upstreaming it to TFF as a generic sparse sum operator to seed this, and build on that (to follow up offline - maybe on Discord; see the aggregation sketch at the end of this section)
      • EW +1
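    • For reference, a minimal sketch of the “tuples of dense tensors” representation discussed above, and of why a naive element-wise sum over the triple goes wrong - plain TensorFlow only, with made-up example tensors:
      ```python
      import tensorflow as tf

      # A tf.SparseTensor decomposes into three dense constituent tensors:
      # indices [N, ndims], values [N], and dense_shape [ndims]. It is this
      # triple that can be carried across the TFF boundary as dense tensors.
      st = tf.sparse.SparseTensor(
          indices=[[0, 1], [2, 3]], values=[1.0, 2.0], dense_shape=[4, 4])
      parts = (st.indices, st.values, st.dense_shape)

      # Reconstruction on the other side of the boundary:
      rebuilt = tf.sparse.SparseTensor(*parts)

      # Why a naive sum over the triple is wrong: element-wise addition of
      # two `indices` tensors yields meaningless coordinates. A real sparse
      # sum has to merge coordinates, e.g. via tf.sparse.add:
      a = tf.sparse.SparseTensor([[0, 0]], [1.0], [2, 2])
      b = tf.sparse.SparseTensor([[1, 1]], [3.0], [2, 2])
      summed = tf.sparse.add(a, b)  # 2 nonzeros, not a garbled triple
      ```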
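    • A minimal sketch of ZG’s dataset workaround, assuming the classic tff.tf_computation decorator; the function name and type specs are illustrative:
      ```python
      import tensorflow as tf
      import tensorflow_federated as tff

      # Only plain dense tensors cross the input-output boundary; the
      # tf.data.Dataset is constructed inside the TF computation, so no
      # dataset (or sparse tensor) appears in the type signature.
      @tff.tf_computation(tf.TensorSpec([None], tf.float32))
      def sum_values(values):
        ds = tf.data.Dataset.from_tensor_slices(values)
        return ds.reduce(0.0, lambda acc, v: acc + v)
      ```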
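    • And a rough sketch of what a prototyped sparse sum aggregation could look like - entirely illustrative (DENSE_SHAPE, densify, and sparse_sum are invented names), and densifying per client is just one of the possible aggregation semantics mentioned above; a production version would presumably avoid tf.sparse.to_dense given EW’s memory constraints:
      ```python
      import tensorflow as tf
      import tensorflow_federated as tff

      DENSE_SHAPE = [4, 4]  # assumed fixed up front so the sum is well-typed

      @tff.tf_computation(tf.TensorSpec([None, 2], tf.int64),
                          tf.TensorSpec([None], tf.float32))
      def densify(indices, values):
        st = tf.sparse.SparseTensor(indices, values, DENSE_SHAPE)
        return tf.sparse.to_dense(tf.sparse.reorder(st))

      @tff.federated_computation(
          tff.type_at_clients((tf.TensorSpec([None, 2], tf.int64),
                               tf.TensorSpec([None], tf.float32))))
      def sparse_sum(client_parts):
        # Densify each client's (indices, values) pair, then sum elementwise.
        return tff.federated_sum(tff.federated_map(densify, client_parts))
      ```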
  • Jeremy’s proposal, continuing from 2 weeks ago:
    • TFF Tech Note: Client initiated connections
    • (todo for all to review it later as it was just shared shortly before the meeting)
    • (Jeremy is presenting)
    • JL - proposing the “task store” abstraction for exchanging requests between a “Cloud” and the per-client executors (e.g., in browsers), with the latter pulling tasks from a centralized “task store”. Has something like this been considered in any other context? (a rough interface sketch appears at the end of these notes)
    • KR - yes, in failure handling scenarios
      • Hairier problems there, though - state transfer across executors is difficult; not sure how much carries over to the scenario presented by Jeremy
    • HV - can the executors in the leaves be stateless?
      • JL - this would make it more like the SysML paper on cross-device
    • (question about performance in this scenario, compared to bi-directional streaming in a way that more closely resembles the native TFF protocol)
    • JL - ack that there are latency considerations
    • also, bi-directional streaming isn’t supported by some transports, so it’s not always a viable option
    • (ran out of time)
    • (to be continued in 2 weeks - first point of agenda for the next meeting, Jeremy will join)
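    • For reference, a minimal sketch of the pull-based “task store” idea as presented - entirely hypothetical (TaskStore and its methods are invented names, not taken from the tech note):
      ```python
      import queue
      import uuid

      class TaskStore:
        """Hypothetical sketch of a pull-based task store: the Cloud
        enqueues work, per-client executors (e.g., in browsers) poll for
        it and post results, so no server-initiated connection is needed."""

        def __init__(self):
          self._pending = queue.Queue()
          self._results = {}

        def submit(self, payload):
          # Called by the Cloud side to enqueue a task for some client.
          task_id = str(uuid.uuid4())
          self._pending.put((task_id, payload))
          return task_id

        def claim(self, timeout=None):
          # Called by a per-client executor to pull the next task.
          return self._pending.get(timeout=timeout)

        def post_result(self, task_id, result):
          # Called by the per-client executor once the task is done.
          self._results[task_id] = result

        def result(self, task_id):
          # Polled by the Cloud side to retrieve a finished result.
          return self._results.get(task_id)
      ```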