ImpChat, Our Story

08/08/2021 ─ UNIHACK Group 7

Inspiration

Have you gone through the ordeal of 14 days of hotel quarantine? I have. In that brief time frame, I downloaded Tinder, TikTok, Soul, Tantan, and 全民K歌. As you can tell, I was rather desperate to communicate with other people, in whatever shape or form. After reconnecting with my friends after this 14-day crucible, I found that I was not alone in this need.

However, the apps I downloaded proved to be nothing more than platforms. Taking Tantan as an example, I was very much on my own after matching with people who, by swiping right on my profile, had obviously declared their interest. I was interested too, but some conversations were harder to start and to keep going than others. “Hi!” An hour later, “Hello.” “What are you doing?” A day later, “Watching TV dramas.” I still haven’t figured out the psychology of the person on the other side, but trust me, this type of dry conversation can be stressful. Ultimately, Tantan turned into a game of “Jewel of Atlantis” for me. I would swipe for fun, but if you don’t talk to me, I won’t talk to you.

This can happen between friends, too. When you try to reconnect with a best friend from middle school, it sometimes gets tricky. The thing is, you are not sure whether you still know them as well as you did back then.

So, good news for everyone who gets socially anxious like me (especially in the post-COVID era): we have developed a new conversation starter. It is a game that helps you probe how well you really know the other person: ImpChat, AKA Impression Chat.

What it does

At heart a game, it tests your knowledge of your conversation partner by asking them open, deep questions (unlike the trivial ones in the “Camaraderie Tests” you commonly see in your WeChat Moments) and asking you to guess their responses. At the end, you get a score based on how close your guesses are to the actual answers.

If you get a high score, congrats: you just “proved” that you know the other person well. If not, congrats: you just got some great conversation starters. Just chat about the questions, right? In a sense, you are reconnecting by predicting.

How we built it

The project is written in Python, mainly with TensorFlow and Flask.

BACKEND Put plainly, the main thing this project does is text comparison. Natural Language Processing (NLP) is currently a popular area in machine learning, and neural-network models have driven much of the recent progress. Among these models, our project builds on a pretrained BERT (Bidirectional Encoder Representations from Transformers) model.

Using BERT’s pretrained language model (based on word correlations), we were able to turn words into numbers. The encoder we used converts each sentence into a vector of a fixed size. Because the vectors always have the same size regardless of sentence length, we can compare any two strings: a modified hyperbolic tangent function maps their closeness onto a scale of 0 to 100, with 100 meaning the closest match.
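The exact scoring formula is not given in this writeup, so the following is only a minimal sketch of one way such a score could be computed. It assumes the two sentences have already been encoded into fixed-size vectors, that closeness starts from cosine similarity, and that a rescaled tanh stretches it onto 0 to 100. The function names and the `steepness` parameter are hypothetical, not taken from the project.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector carries no direction
    return dot / (norm_u * norm_v)


def closeness_score(u, v, steepness=3.0):
    """Map similarity onto 0..100 with a rescaled tanh.

    `steepness` is an assumed tuning knob: larger values reward
    moderately similar answers more generously.
    """
    if len(u) != len(v):
        raise ValueError("sentence vectors must have the same size")
    # Clamp to [0, 1]: negative similarity is treated as "unrelated",
    # and the upper clamp guards against floating-point overshoot.
    sim = min(1.0, max(0.0, cosine_similarity(u, v)))
    # Dividing by tanh(steepness) makes a perfect match score exactly 100.
    return 100.0 * math.tanh(steepness * sim) / math.tanh(steepness)


# Toy 3-dimensional "sentence vectors"; real encoder output is much larger.
guess = [0.2, 0.7, 0.1]
answer = [0.25, 0.65, 0.05]
print(round(closeness_score(guess, answer), 1))
```

Because every sentence maps to a vector of the same size, the same function works whether the guess is one word and the answer is a paragraph, or the other way around.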

FRONTEND We built the web pages with HTML and CSS, using the Bootstrap framework for styling, and used Flask to handle dynamic web content and data transmission. All of the logic behind the pages is written in Python. The answers users enter are stored in a database accessed through SQLAlchemy.
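As a rough illustration of how Flask can carry answers and guesses between the pages and the scoring code, here is a minimal sketch. The route names, the form field `text`, and the in-memory `answers` dictionary are all hypothetical: the dictionary stands in for the project's SQLAlchemy-backed storage, and the word-overlap `closeness` function is only a placeholder for the BERT-based score.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical in-memory store standing in for the real database layer.
answers = {}  # question_id -> the partner's actual answer


def closeness(guess, answer):
    """Placeholder scorer (word overlap), NOT the BERT-based score."""
    g = set(guess.lower().split())
    a = set(answer.lower().split())
    if not g or not a:
        return 0
    return round(100 * len(g & a) / len(g | a))


@app.route("/answer/<int:question_id>", methods=["POST"])
def save_answer(question_id):
    # The partner submits their real answer to a question.
    answers[question_id] = request.form["text"]
    return jsonify(saved=True)


@app.route("/guess/<int:question_id>", methods=["POST"])
def score_guess(question_id):
    # The player submits a guess and gets a 0..100 score back.
    if question_id not in answers:
        return jsonify(error="no answer recorded yet"), 404
    score = closeness(request.form["text"], answers[question_id])
    return jsonify(score=score)


if __name__ == "__main__":
    app.run(debug=True)
```

Keeping the scorer behind a single function call means the placeholder can later be swapped for the real closeness computation without touching the routes.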

Challenges we ran into & Accomplishments that we’re proud of

As a group of students, some of whom were meeting each other for the first time, we were like every other group in that we were not sure what we wanted to accomplish with this project. When ice-breaking and brainstorming have to happen at the same time, things can get messy. Finally, after a lengthy Zoom meeting going through all the topics we could possibly focus on, we settled on making a gadget that might help out when conversations go dry or when people run out of topics.

GWH My task in this project was to write the front-end code. The main challenge was that I had never written front-end code in Python before, nor had I ever used a Python framework. We chose the relatively simple Flask framework rather than Django, but it was still a challenge to learn from scratch in such a short time. After reading various tutorial docs on the official website, checking out a bunch of YouTube videos about Flask, and fixing countless bugs, I was finally able to build a barely decent web application.

LYQ In this project, my duty was to develop the closeness-computing function. I had just graduated from high school, and the main challenge I ran into was that the field of NLP was so completely new to me that I was dizzied by the variety of open-source models out there. After reading papers and the TensorFlow documentation, I was finally able to get a firm grasp of how things work. For my part of the story, I am most proud of what I was able to learn in such a short time frame.

What we learned

GWH This was my first hackathon, and my first experience with such an intense schedule and such an open-ended project. This experience taught me about group work, learning a new skill from scratch, and how to adjust my mindset when facing bugs.

LYQ As elaborated in the last section, on the technical side I learned the most about NLP algorithms. However, I also want to stress what I learned aside from coding. This was my first hackathon experience, and I had never worked within such a short time frame. Nor had I ever had the pleasure of working with such a committed group of teammates. Above all, I learned to trust my teammates' ideas, ingenuity, and motivation.

TXC This was my first hackathon as well. The most important thing I learned is how to collaborate in a team. Unlike programming assignments in university courses, the topic of a hackathon is completely open-ended, and I learned a lot by brainstorming together with my teammates. Also, the project we finished is not something one person could have made alone in 36 hours. I came to realize the great potential of team collaboration: members take on different roles, work efficiently, and achieve goals that a single person could not.

What’s next

The next steps of our project focus on the following directions.

**Closeness computing.** Because of the limitations imposed by the time frame and our devices, we were only able to use the most basic way of computing text similarity, and thus our results can be pretty inaccurate. The first thing we will do to improve our algorithm is to actually collect data and further train the pretrained model, fine-tuning it to serve our purposes better.

**User functions.** For now our web application doesn’t have any “user” features like registration and login, but since the main function is to let one user guess another user’s answers, it makes sense to add them. Another possible feature is letting users design their own question forms and answer the questions they want. They could also send their questions to another user so that the other user can guess on the same question form.

Summary and outlook

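The scoring algorithm we currently use is derived from an official pretrained BERT model, and works by separating the text semantically and computing distances. To obtain more accurate and rigorous semantic analysis, for example covering not just synonymous words but also sentences that follow on from one another, the model can be fine-tuned through machine learning. Because of time constraints, we can only release a demo for now; the fine-tuning work will follow later.

In fact, our research and testing showed that a transformer implemented in PyTorch can put pretrained BERT results to good use, for example in sentence expansion, automatic question answering, and relatedness scoring. Natural language is split up and converted, then passed to the transformer model as input vectors. Based on what it learned in training, the model looks for related vectors in the context before and after those vectors and outputs target vectors in the mode the model specifies, such as generate (used for expansion). Applying the inverse transformation to turn the target vectors back into natural language completes the conversion.

Because AI analysis of sentence relatedness and sentiment is not yet live, we use a questionnaire-style question-and-answer interaction to make sure that multiple players are talking about the same topic. Imagine: with a more complete sentence analyzer, chat platforms such as QQ could rate how “engaging” private and group chats are (at present there are only simple activity and intimacy metrics, based solely on message frequency). This would encourage civil communication and harmonious discussion, and could also bring people closer together online.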