DeePsy, a new application built on automatic speech processing, aims to give psychotherapists systematic feedback on individual therapies and to improve the overall quality of psychotherapeutic care in the Czech Republic. Researchers from the BUT Speech@FIT group and their colleagues from Masaryk University are currently developing it. The application should be completed in June next year.
Psychotherapists often lack feedback that would allow them to continuously evaluate their work. "Psychotherapy is a demanding activity in which therapists process a significant amount of information. While they consciously analyse some of it, they process much more unconsciously and intuitively. This can lead them to overlook subtle signals of the client's distress or deterioration. Clients usually concentrate on solving their own problems rather than on evaluating the professionalism of the therapist's performance. Some studies have also revealed a decline or stagnation in psychotherapeutic skills over time," explains Pavel Matějka from the BUT Speech@FIT group.
Manual transcription of individual sessions and their subsequent analysis are time-consuming. To address this, experts from Masaryk University turned to researchers from FIT BUT, who specialise in automatic speech processing and in extracting information from speech. The test version of the DeePsy app, which is built on deep learning, offers psychotherapists automatic transcriptions of their sessions and analyses of their content.
Graphs comparing client and therapist speech show who spoke more during the session and the average number of words each of them spoke per minute. The keyword analysis can reveal the prevailing emotions in the speech and the proportion of verbs used in the past, present, or future tense. The app can also report the most frequently used words.
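The speaking statistics described above can be illustrated with a minimal Python sketch. It is not DeePsy's implementation: the `Segment` structure, the function names, and the assumption that the transcript is already split into timed segments labelled by speaker are all hypothetical and serve only to show how such figures could be derived.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

PUNCTUATION = ".,!?;:"


@dataclass
class Segment:
    """One uninterrupted stretch of speech (hypothetical structure)."""
    speaker: str   # e.g. "client" or "therapist"
    start: float   # start time in seconds
    end: float     # end time in seconds
    text: str      # transcribed words


def speaker_stats(segments):
    """Return each speaker's share of all words and words per minute."""
    words = defaultdict(int)
    seconds = defaultdict(float)
    for seg in segments:
        words[seg.speaker] += len(seg.text.split())
        seconds[seg.speaker] += seg.end - seg.start

    total_words = sum(words.values())
    stats = {}
    for speaker, count in words.items():
        minutes = seconds[speaker] / 60
        stats[speaker] = {
            "word_share": count / total_words if total_words else 0.0,
            "words_per_minute": count / minutes if minutes > 0 else 0.0,
        }
    return stats


def top_words(segments, speaker, n=3):
    """Return the n most frequent words of one speaker, ignoring case."""
    counts = Counter()
    for seg in segments:
        if seg.speaker != speaker:
            continue
        for token in seg.text.split():
            word = token.strip(PUNCTUATION).lower()
            if word:
                counts[word] += 1
    return counts.most_common(n)


session = [
    Segment("client", 0, 30, "I feel tired tired today"),
    Segment("therapist", 30, 40, "Tell me more"),
]
print(speaker_stats(session))
print(top_words(session, "client", n=1))
```

In this sketch, words per minute is measured over each speaker's own talking time, not over the whole session. A real system would also need to handle stop words and Czech inflection before word frequencies become informative.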
"Research studies indicate that when the language of the client and therapist significantly differs, whether in content or form, it can indicate problems in the therapeutic relationship. DeePsy alerts the therapist to such discrepancies. How the therapist handles this information, however, depends on the individual therapist. We are merely providing information," adds Matějka.
To obtain information from speech, FIT researchers use technologies such as automatic speech recognition, natural language processing and machine learning. The neural network was trained on thousands of hours of audio recordings, ranging from interviews to phone calls to spoken monologues. However, the researchers immediately encountered a challenge: "We found that during therapy sessions, speech is very different from ordinary speech. Clients are usually emotionally distraught, so they often repeat words – perhaps three to five times – before moving on. Creating a meaningful transcript of conversations initially took us much more time," Matějka says.
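To make the repetition problem concrete, here is a small Python sketch of one simple way a raw transcript could be made more readable: collapsing runs of the same word into a single occurrence. This is only an illustrative assumption. The article does not say how the researchers handle repetitions, and the function name `collapse_repeats` is hypothetical.

```python
PUNCTUATION = ".,!?;:"


def _normalise(token):
    """Lower-case a token and strip surrounding punctuation for comparison."""
    return token.strip(PUNCTUATION).lower()


def collapse_repeats(text):
    """Collapse immediately repeated words into a single occurrence.

    The last token of each run is kept, because it is the one most
    likely to carry the punctuation that ends the phrase.
    """
    result = []
    for token in text.split():
        if result and _normalise(token) == _normalise(result[-1]):
            result[-1] = token  # replace the earlier copy with the later one
        else:
            result.append(token)
    return " ".join(result)


print(collapse_repeats("I I I just just can't can't go on"))
```

A rule this simple also removes deliberate repetitions such as "very very", so it could not be applied blindly. In a therapeutic setting, the repetitions themselves may be worth counting before they are removed.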
The DeePsy application also includes a system of client questionnaires, which provides systematic feedback on the work with clients alongside the audio recordings. "Next, we will work on evaluating the therapist's interventions. The algorithm should be able to determine whether therapists frequently ask questions, interpret, provide information or give recommendations," says Matějka.
The web application, which is being developed as part of a project funded by the Technology Agency of the Czech Republic, is currently being tested by researchers together with therapists at the Psychosomatic Clinic and the Therapeutic Harbour. It is expected to be ready for use in June next year. "We hope that it will provide psychotherapists with user-friendly and valuable feedback that will contribute to the improvement of psychotherapeutic care in the Czech Republic," Matějka concludes.