Federated Topic-Preference Learning for Knowledge-Grounded Chat with Differential Privacy

Authors

  • Meng-Ju Kuo, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
  • Daren Zheng, Information Technology, Carnegie Mellon University, Pittsburgh, USA
  • Julie Hires, Computer Science, Dartmouth College, NH, USA

DOI:

https://doi.org/10.51903/jtie.v4i2.502

Keywords:

Federated Learning, Differential Privacy, Knowledge-Grounded Dialogue, Topical Preference

Abstract

Retrieval-augmented approaches have become central to knowledge-grounded dialogue systems, yet incorporating users' topical preferences remains difficult because privacy constraints limit access to interaction data. This study introduces a lightweight federated topic-preference (FedTP) mechanism that models session-level preferences without centralizing raw data and applies client-level differential privacy (DP). Using the Topical-Chat dataset (8,628 conversations), each conversation is treated as a client, and evidence routing is framed as selecting relevant knowledge snippets given the dialogue context. The proposed method augments a TF-IDF relevance score with a small preference term derived from both local session distributions and a DP-aggregated global prior. On 9,553 grounded test turns, the evidence hit rate improves consistently but only slightly, from 0.6167 to 0.6194 (+0.0027 absolute). The small optimal preference weight (λ = 0.005) indicates that the preference signal mainly acts as a tie-breaker when competing candidates have similar relevance scores, rather than substantially altering routing behavior. A privacy–utility analysis under Gaussian DP (ε ranging from 9.69 down to 0.606, δ = 1e−5) shows negligible changes in performance, which is expected given the large number of clients in a one-shot aggregation setting. Additional metrics remain largely stable, indicating that the method affects selection margins rather than overall alignment. Overall, federated preference aggregation can provide a modest, privacy-preserving bias for evidence routing, but its practical impact remains incremental and context-dependent.
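
The scoring rule summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Only the general shape comes from the abstract: a relevance score plus a small preference term weighted by λ = 0.005, with the preference drawn from a local session distribution and a Gaussian-noised, one-shot aggregate over clients. The function names, the equal mixing weight `alpha`, the clipping norm, and the noise scale `sigma` are assumptions made for illustration, and the sketch does not compute the (ε, δ) guarantee that a given `sigma` would provide. Relevance scores are passed in directly rather than computed with TF-IDF.

```python
import math
import random


def clip_l2(vec, max_norm=1.0):
    """Scale vec so its L2 norm is at most max_norm (bounds one client's influence)."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def dp_global_prior(client_dists, sigma, max_norm=1.0, rng=None):
    """One-shot aggregation (assumed form): sum clipped client topic
    distributions, add Gaussian noise per coordinate, average, renormalize."""
    rng = rng or random.Random(0)
    n_topics = len(client_dists[0])
    total = [0.0] * n_topics
    for dist in client_dists:
        for k, v in enumerate(clip_l2(dist, max_norm)):
            total[k] += v
    n = len(client_dists)
    noisy = [(t + rng.gauss(0.0, sigma * max_norm)) / n for t in total]
    noisy = [max(x, 0.0) for x in noisy]  # post-processing: drop negative mass
    z = sum(noisy)
    # Fall back to a uniform prior if noise wiped out all mass.
    return [x / z for x in noisy] if z > 0 else [1.0 / n_topics] * n_topics


def route(relevance, cand_topics, local_pref, global_prior, lam=0.005, alpha=0.5):
    """Return the index of the candidate maximizing relevance + lam * preference.

    alpha mixes local and global preference; its value here is hypothetical.
    """
    def score(i):
        t = cand_topics[i]
        pref = alpha * local_pref[t] + (1.0 - alpha) * global_prior[t]
        return relevance[i] + lam * pref
    return max(range(len(relevance)), key=score)


# Toy data: three clients (conversations), three topics.
clients = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.1, 0.8]]
prior = dp_global_prior(clients, sigma=0.5)
local = [0.9, 0.1, 0.0]

# Near-tie in relevance: the preference term can decide the outcome.
near_tie = route([0.500, 0.501], [0, 1], local, prior)
# Clear relevance gap: the small lambda cannot override it.
clear_gap = route([0.500, 0.600], [0, 1], local, prior)
print(near_tie, clear_gap)
```

With a weight as small as 0.005, the preference term can change the ranking only when two candidates' relevance scores differ by less than 0.005, which matches the abstract's observation that the signal matters mainly for near-ties.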

Published

2025-08-25

How to Cite

Federated Topic-Preference Learning for Knowledge-Grounded Chat with Differential Privacy. (2025). Journal of Technology Informatics and Engineering, 4(2), 385–401. https://doi.org/10.51903/jtie.v4i2.502