Thursday 17 March 2022

Developing conversational agents for use in criminal investigations
SAM HEPENSTAL, Defence Science and Technology Laboratory, UK 

LEISHI ZHANG, Middlesex University London, UK (now at Canterbury Christ Church University, Kent, UK)

NEESHA KODAGODA, Middlesex University London, UK 

B.L. WILLIAM WONG, Middlesex University London, UK 

Year: 2021
Journal citation: ACM Transactions on Interactive Intelligent Systems 11 (3-4), pp. 1-35
ISSN: 2160-6455


The adoption of artificial intelligence (AI) systems in environments that involve high-risk, high-consequence decision making is severely hampered by critical design issues, including system transparency and brittleness. Transparency relates to (i) the explainability of results and (ii) the ability of a user to inspect and verify system goals and constraints; brittleness relates to (iii) the ability of a system to adapt to new user demands. Transparency is a particular concern for criminal intelligence analysis, where significant ethical and trust issues arise when algorithmic and system processes are not adequately understood by a user. This prevents the adoption of potentially useful technologies in policing environments. In this paper, we present a novel approach to designing a conversational agent (CA) AI system for intelligence analysis that tackles these issues. We discuss the results and implications of three studies: a Cognitive Task Analysis to understand analyst thinking when retrieving information in an investigation, an Emergent Themes Analysis to understand the explanation needs of different system components, and an interactive experiment with a prototype conversational agent. Our prototype CA, named Pan, demonstrates transparency provision and mitigates brittleness by evolving new CA intentions. We encode interactions with the CA using human factors principles for situation recognition and use interactive visual analytics to support analyst reasoning. Our approach enables complex AI systems, such as Pan, to be used in sensitive environments, and our research has broader application than the use case discussed.


Also available from https://researchspace.canterbury.ac.uk/8z653/developing-conversational-agents 
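The two design properties the abstract emphasises, inspectable intention matching (transparency) and run-time evolution of new intentions (brittleness mitigation), can be illustrated with a minimal toy sketch. This is hypothetical code, not the authors' Pan implementation; the `ToyAgent` class, its keyword-overlap scoring, and the intention names are all assumptions made for illustration only:

```python
# Illustrative sketch only (not the authors' Pan system): a toy conversational
# agent that (i) explains which "intention" it matched and why, and
# (ii) mitigates brittleness by registering new intentions at run time.

class ToyAgent:
    def __init__(self):
        # Each intention maps a name to trigger keywords and an action label.
        self.intentions = {
            "find_person": {"keywords": {"who", "person"},
                            "action": "search person records"},
            "find_location": {"keywords": {"where", "location"},
                              "action": "search location records"},
        }

    def respond(self, utterance):
        tokens = set(utterance.lower().split())
        # Score intentions by keyword overlap so the match is inspectable.
        scores = {name: len(spec["keywords"] & tokens)
                  for name, spec in self.intentions.items()}
        best = max(scores, key=scores.get)
        if scores[best] == 0:
            return {"intention": None,
                    "explanation": "no known intention matched"}
        spec = self.intentions[best]
        return {
            "intention": best,
            "action": spec["action"],
            # Transparency: report exactly which keywords triggered the match.
            "explanation": f"matched keywords {sorted(spec['keywords'] & tokens)}",
        }

    def add_intention(self, name, keywords, action):
        # Brittleness mitigation: the intention repertoire can grow in use.
        self.intentions[name] = {"keywords": set(keywords), "action": action}


agent = ToyAgent()
print(agent.respond("who is John Smith"))       # matches find_person
agent.add_intention("find_vehicle", ["vehicle", "car"],
                    "search vehicle records")
print(agent.respond("which car was seen"))      # matches the new intention
```

A real system such as Pan would of course use far richer situation-recognition models than keyword overlap; the point of the sketch is only that every response carries an explanation of how it was produced, and that unmatched queries can seed new intentions rather than fail silently.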




