= !OfficeBot: joint work of NLPC MU and Konica Minolta =

student R. Sabol

a Slack bot that supports teamwork

Project web: https://nlp.fi.muni.cz/projects/officebot/ [[BR]]
Project repository: [source: /nlp/projekty/officebot/trac-git]

tasks:
 * detect channel activity
   * active/non-active (displayed as bold/normal typeface)
   * type of activity (conversation = several mutually related posts, individual posts)
   * language of the activity (!Czech/English, short messages/long posts ...)
   * sentiment of the activity (possible flamewar, emoji-based classification?)
 * summarize activity for a particular channel
   * track all messages from a particular moment
   * cluster conversations (based on time span and/or topic)
   * detect keywords
   * highlight the most important messages or parts of messages
 * summarize activity for all channels
 * suggest joining a conversation based on user preferences (favorite topics, languages, sentiment ...)
 * suggest moving part of a conversation into external applications
   * transfer to calendar, Trello, private channel (talk to myself)

data:
 * publicly available dataset https://github.com/houstondatavis/slack-export
 * KM data (to be provided after filtering out sensitive information + solving legal issues)

language:
 * English (needed to filter out non-English texts)
 * tools for "internet language" will probably be needed (e.g.
   abbreviation expansion, emojis)

== Plan ==

|| Time period || Student Activity || KM Activity ||
|| M1 || \
{{{#!td
study the data
study Slack functionalities
[[ChatbotFunctions|bring ideas about what can be done]]
}}}
{{{#!td
resolve legal issues concerning KM conversations data
}}}
|----------------
|| M2 || \
{{{#!td
create the basic chatbot:
 * listen on different channels
 * detect channel activity
 * quantify channel activity
}}}
{{{#!td
run the chatbot on KMLE Slack
start collecting feedback
}}}
|----------------
|| M3 || \
{{{#!td
study NLP techniques/implementations that can support smart features for the chatbot
extend the chatbot with language detection
propose an evaluation for the new feature
cluster conversations based on time span
}}}
{{{#!td
}}}
|----------------
|| M4-M5 || \
{{{#!td
extend the chatbot with keyword detection using TF-IDF or another "simple" technique
study the specifics of the Slack conversations
cluster conversations based on keywords
}}}
{{{#!td
}}}
|----------------
|| M5-M6 || \
{{{#!td
study techniques for text summarization
extend the chatbot with summarization
}}}
{{{#!td
provide feedback for keyword detection
}}}
|----------------
|| M7 || \
{{{#!td
extend the chatbot with sentiment analysis based on words/emojis
suggest joining a conversation based on keywords/topics/participants/sentiment...
}}}
{{{#!td
provide feedback for text summarization
}}}
|----------------
|| M4-M8 || \
{{{#!td
extend the chatbot with connectors to applications such as Trello, calendar etc.
}}}
{{{#!td
provide feedback for sentiment analysis
}}}
|----------------
|| M9 || \
{{{#!td
evaluate and deliver the application
}}}
{{{#!td
}}}

**Documentation**

Scopes for .history API call:
 * channels.history: collects history only from Slack public channels
 * groups.history: same as channels.history, but works only for private channels
 * im.history: collects chat history from private conversations
 * mpim.history: collects chat history from multi-party conversations
 * conversations.history: universal scope covering all 4 API calls mentioned above; its functionality depends on which of those 4 scopes the chatbot is allowed to use. For example: if the chatbot is allowed to view groups and channels but not private messages and multi-party conversations, the calls for private messages and multi-party conversations are rejected automatically, while the allowed scopes keep working.

List of potentially useful information that can be obtained with Slack API calls:
 * channels.history:
   * list of all messages, events
   * timestamps of these messages
 * users.info:
   * name, real name, team name, e-mail
   * info about user being admin, owner, bot, restricted/ultra-restricted
 * channels.info:
   * all members
   * scope of the channel (private, public, mpim..)
   * latest message
   * number of unread messages
   * topic
   * previous names

Potential ideas:

Hard statistics:
 * using timestamps to compute the ratio of messages in a certain time interval
 * using join time/leave time to compute a user's time online over a certain period
 * limit channel history checking to a certain time interval
 * limit channel history to a certain group of users (people from the same team, admins...)
 * finding the moment where a conversation had the most messages within a certain time interval

Soft statistics:
 * identify dialogs
 * statistics for dialogs (length, participants..)
 * analysis of dialogue - keywords, named entities, topics
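
Sketch for the hard statistics above (ratio of messages in a time interval, busiest moment). It assumes each message is a dict with a `ts` field holding epoch seconds as a string, as in Slack history exports; the function names `bucket_counts`, `busiest_interval` and `share_in_range` and the one-hour default are illustrative choices, not part of the bot.

```python
from collections import Counter


def bucket_counts(messages, interval_s=3600):
    """Count messages per fixed-length interval, keyed by interval start (epoch seconds)."""
    counts = Counter()
    for msg in messages:
        ts = float(msg["ts"])  # assumed format: "1512085950.000216"
        bucket_start = int(ts // interval_s) * interval_s
        counts[bucket_start] += 1
    return counts


def busiest_interval(messages, interval_s=3600):
    """Return (interval_start, message_count) for the busiest interval, or None."""
    counts = bucket_counts(messages, interval_s)
    if not counts:
        return None
    # Ties are resolved in favour of the earliest interval.
    return max(counts.items(), key=lambda kv: (kv[1], -kv[0]))


def share_in_range(messages, start, end):
    """Fraction of all messages whose timestamp falls into [start, end)."""
    if not messages:
        return 0.0
    inside = sum(1 for m in messages if start <= float(m["ts"]) < end)
    return inside / len(messages)
```

Fixed buckets are the simplest option; a sliding window would locate the peak more precisely at the cost of more computation.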
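
Sketch for "identify dialogs" and "statistics for dialogs": one simple heuristic is to start a new dialog whenever the pause between two consecutive messages exceeds a threshold. The 300-second threshold, the `user` field and the names `split_into_dialogs` and `dialog_stats` are assumptions for illustration only; the project may settle on a different criterion (e.g. topic-based).

```python
def split_into_dialogs(messages, max_gap_s=300):
    """Group messages into dialogs; a pause longer than max_gap_s starts a new dialog."""
    ordered = sorted(messages, key=lambda m: float(m["ts"]))
    dialogs = []
    for msg in ordered:
        if dialogs and float(msg["ts"]) - float(dialogs[-1][-1]["ts"]) <= max_gap_s:
            dialogs[-1].append(msg)  # close enough in time: same dialog
        else:
            dialogs.append([msg])    # first message, or the gap was too long
    return dialogs


def dialog_stats(dialog):
    """Length, participants and duration of one dialog (a non-empty list of messages)."""
    first = float(dialog[0]["ts"])
    last = float(dialog[-1]["ts"])
    return {
        "messages": len(dialog),
        "participants": sorted({m.get("user", "unknown") for m in dialog}),
        "duration_s": last - first,
    }
```

Usage: `[dialog_stats(d) for d in split_into_dialogs(history)]`, where `history` is the list of messages of one channel.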
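
Sketch for the keyword part of dialogue analysis, using the TF-IDF weighting mentioned in the plan. Each dialog is treated as one document (its messages joined into a single string); the tokenizer, the unsmoothed `log(N/df)` weighting and the name `tfidf_keywords` are illustrative assumptions, and a real version would need stop words and the "internet language" handling noted above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_keywords(documents, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    keywords = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; alphabetical order breaks ties.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords.append(ranked[:top_k])
    return keywords
```

Terms that occur in every dialog get a score of zero, so very common words drop to the bottom of the ranking without an explicit stop list.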