Notes form the 2/16/2022 meeting of TFF collaborators

  • Participants:

    • Krzysztof Ostrowski (Google)
    • Alex Ingerman (Google)
    • DeWitt Clinton (Google)
    • Boyi Chen (LinkedIn)
    • Souvik Ghosh (LinkedIn)
    • Zheng Li (LinkedIn)
  • [chen] Our current usage, areas of interest for contributions, processes on how to contribute; future development plan

  • [boyi] How we are using FL today

    • Two parts - one is cross-silo
      • Data of our users
      • Legal requirements constrain access to data
      • FL comes handy with 3P data
      • Can leverage data while remaining compliant with regulation
    • On-device FL - interesting, but mostly working on cross-silo
    • A few projects that we could pursue
      • Have been building prototypes
      • TFF comes handy
      • Benchmark FL vs. personalized transfer learning
        • Using clients’ data to train a personalized model for each client vs. transfer learning f, compare
        • Challenges with how FL works
          • Some clients larger than others -> bias
          • Clients contribyting the most are worried about free-riders; clients with least data are worried about not influencing the model enough
        • Scalability challenges
          • Right now for inference (hundreds of M)
          • Training data not that large currently (10s-100sK/silos)
          • Running inference in batch over O(hundreds of M) clients
          • Total data volume as the main challenge
            • Records across all clients
          • Cluster size is limited now, limiting the rate of inference
        • Client = silo that needs to not have the data mingled with other silos. What is the cardinality?
          • Doing experiments now, want to scale to 100s of thousands of silos in the future
        • What is the number you’ve seen for # of TFF clients?
          • On-device: large number of small data silos; x-silo is small number of large datasets
        • How similar are the silos?
          • Schemas are same, but the distribution of data differs a lot across the silos. Unequal participation
      • [K] Are you thinking of TFF for inference as well as training?
        • [B] Right now, use TFF for training; would prefer to train and inference on the same framework.
        • [K] Same infra or same models?
        • [b} right now, same model and same cluster
      • [B] Want to understand how to train models and deploy to devices.
      • [S] The need to train models in one environment, take out and use in another environment is important. Just not with first application.
  • [B] What we want to build:

    • One idea for contribution, once we do benchmarks on fairness, we can add tools and benchmarks into TFF
      • How the model does across silos (unequal performance and bias)
    • [K] Do you see it as problem in practice? [B] We believe it will be a problem in practice.
    • [B] Think about this from an adversarial perspective. People will be concerned about putting data into the box. Its a general concern but we dont have a particular metric.
    • [K] Which thing are we addressing? Are you talking about situation where there are silos + reguialtions about how to process it - but its not adversarial, you just dont want to create bias. Vs. another situation where there are multiple institutions, mutually distrusting parties. Are we thinking about one or both of these?
    • [B] We want to look at both; right now only think about the latter.
    • [D] e.g. silo here are companies, and datasets are data uploaded by each
    • [K] You are highlighting concerns about freeloading. But there are also mutually distrusting parties. Do the parties want to prevent others/youy from seeing the data? These concerns are in tension. On one hand want to verify contribution to prevent attacks, on the other dont want to see contents, for privacy
    • [B] Look at it in 2 ways. One is privacy preserving - through DP etc. Other part, from model performance perspective, when trained from data of many silos, there is a concern that different silos benefit differently. We think there is a standard way to approach the former; the latter is more tricky.
    • [K] Fairness in the sense that model performs well; other one can be freeloading. Its the latter that is more at tension with privacy. Are you concerned about it?
    • [B] Both are equally important. Want to both protect data privacy and have a fair way to distribute the benefits.
    • [S] We dont have good answers yet. [K] Same.
    • [D] How much do these companies trust linkedin to operate this?
    • [S] Trust hasnt been an issue so far, at least in examples I am aware of. We’ve had some constraint requests, but no flat out refusals. People are willing to share the data for us to build common value.
    • [A] Concern about privacy of just silos, or individuals within silos?
    • [S] The latter
  • [D] Is this being built on Azure? Other deployment things we need to think about?

    • [S] Eventually GPUs will come in; initial models will be smaller and have fewer needs. Eventually, this will involve large number of members and enterprises → models will grow fairly large.
    • [D] Is this the same azure that is publicly available? Or some internal infra to target, that isnt visible outside.
    • [S] Pretty standard stuff.
    • [D] Makes it easier to collaborate, makes the OSS code more valuable since everyone can run it on public azure.
  • [K] Let’s make things! What should these be? We mentioned benchmark suite, and cross-silo platform. WDYT about fleshing out a PRD in the public, talk about features and use cases?

    • [Z] What does the product spec look like? Small components in TFF?
    • [k] we could be talking about components, or a product that can be built on top of tff and be available to others.
    • [Z] I want to understand - is this the contribution process? Start with product?
    • [k] we are making the process here. Depends on where you feel comfortable.
    • [Z] Do you have examples of such products, maybe outside TFF but in TF.
    • [K] TF has a process for design docs. We can begin transforming these notes into something like that. E.g. silos, mutually distrusting, want to use techniques like DP, needs to work on Azure
    • [D] Having a directory of use cases is helpful, without revealing information
    • [K] We want to develop a roadmap, docs, examples of use cases that will exist in TFF anyway, we can begin together. If starting small is easier, by all means lets do this.
    • [B] I see a lot of research about challenges in FL. Maybe we can take a few tools to address these chal;lenges and start there. E.g. similar to free-riding, data heterogeneity - seems common challenge in federated settings. Tools will be useful universally.
      • [K] Tools to evaluate challenges? Or components of system.
      • [B] Functionality that TFF can provide
      • [K] +1. Starting with PRD gives context for talking about features, but we can also talk about features in isolation. Maybe we can start with doc that describes freeloading challenge and works towards tools to deal with.
      • [D] We also work with researchers. Is LinkedIn aiming toi generate research outputs in addition to product?
      • [Z] In the short term, not yet for research.
  • [K] Sounds like we can start with a few shared docs, begin to describe some features or components? Either party can initiate. We can use google docs and email. Lets default to in-public.

  • [ostrowski] What we’d like to build, and what concrete first steps we can take

    • Aiming for more than another meeting - AIs for ourselves?
    • We have started describing a few specific products / projects
      • Benchmark suite
      • Cross-silo platform with DP, fairness, free-loading protecitons
    • Possible next steps
      • Start a product requirements doc and flesh it out openly together for each of the above?
      • Start exchanging design-level ideas?
      • Potential plans for actual development contributions?
        • Specific components / features that you’d like to develop?
    • Specific artifacts to create:
      • Shared doc that describes freeloading problem and requirements of a tool or feature in TFF that could address it
      • Shared doc that describes the benchmarks for bias across silos with unequal amounts of data, what we would like the benchmark to measure
      • Shared doc that defines a new component that would enable TFF to function in Azure-based environment (TBD which layer it would need to integrate with)
  • [ostrowski] Communicating openly

    • What to make publicly available (on the GitHub landing page)
    • Summary of discussions and decisions from this and follow-up meetings to be made available within a few days after each meeting on th GitHub page
    • Links to artifacts (any plans, roadmaps, design docs, etc. to be created) likewise to be published on GitHub
    • Conversations (chat?)
      • Slack
    • Shared goals:
      • Specific products / components in scope?
      • Charter for a more specific / narrowly scoped working group to support the development of these?
  • [B] What to do for small, operational issues?

    • [K] Slack or GitHub issues could work. What would be productive for you?
  • [ostrowski] Recurrent meeting schedule we can jointly commit to?

    • Montlhy