A project exploring the possibilities of AI in journalism is off to a promising start. Anders Thoresson, the project manager, points to increased use of open data as a good way to improve access to data for the solutions being developed.

The use of AI in journalism is often seen as a menace. The fear is that experienced and talented journalists will be replaced by algorithms that churn out texts around the clock. Certainly, there would be a danger of standardisation if journalism were to be fully automated, but with the right approach and good solutions, this can be prevented.

On the other hand, AI solutions of various kinds can make journalists’ work easier and more effective. This is especially true for investigative journalism. The tedious work of compiling data can be streamlined with the help of AI techniques such as machine learning, which can detect anomalies and connections that humans would struggle to recognise in large data sets.

This is what the Media & Democracy research and innovation programme is all about. It is described as “a national collaboration platform for media innovation and social research” and is led by Lindholmen Science Park in Gothenburg. Part of this is the Media Industry and AI initiative, which is run alongside AI Sweden and several media groups that promote the use of AI within Sweden. The media groups involved are Sveriges Television, Bonnier News Local and Stampen.

Analysis of invoices

Anders Thoresson, project manager for the Media Industry and AI initiative, has a long background as a journalist, technology reporter and editor, including at Ny Teknik and the podcast Digitalsamtal. He explains that the work is basically about identifying areas where AI can be useful in the media industry. This could include new tools for reporters and editorial workflows, as well as advertising solutions. A pilot has been launched to explore this:

– The pilot is about investigating supplier invoices for municipalities. We want to explore how different types of machine learning can be used for finding things that stand out, such as fraud, says Anders Thoresson.

Analysing municipalities’ supplier ledger information is a prime example of where AI can help. The need for such analysis stems not least from criminal activity such as fraudulent invoicing. To illustrate the scale of the challenge, Anders Thoresson gives the following example:

– I tried to scroll through one month’s worth of a municipality’s supplier ledger information on my computer screen. It took 14 minutes to scroll from the first line of that Excel document to the last.

It goes without saying that it would be difficult for a human being to see connections and deviations in so large a data set. So far, reporters have usually dealt with the challenge using spreadsheet software such as Excel. That approach is no longer effective enough, especially since the quantity of data and information is growing dramatically while the number of investigative journalists is shrinking.
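To make the idea of "finding things that stand out" concrete, here is a minimal sketch in Python. It is not the pilot's method and not machine learning: it is a simple statistical baseline (a modified z-score built on the median) that flags invoices whose amount is far from what the same supplier usually charges. The sample rows, the field names `supplier` and `amount`, and the threshold of 3.5 are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median

# Invented sample rows; a real supplier ledger has many thousands of lines.
# The field names "supplier" and "amount" are assumptions for this sketch.
INVOICES = [
    {"supplier": "Acme Bygg", "amount": 10_000},
    {"supplier": "Acme Bygg", "amount": 10_500},
    {"supplier": "Acme Bygg", "amount": 9_800},
    {"supplier": "Acme Bygg", "amount": 10_200},
    {"supplier": "Acme Bygg", "amount": 98_000},
    {"supplier": "Nordic Kontor", "amount": 2_000},
    {"supplier": "Nordic Kontor", "amount": 2_100},
    {"supplier": "Nordic Kontor", "amount": 1_900},
    {"supplier": "Nordic Kontor", "amount": 2_050},
]


def flag_outliers(invoices, threshold=3.5):
    """Return invoices whose amount is unusual for their supplier.

    Uses the modified z-score 0.6745 * (x - median) / MAD, where MAD is
    the median absolute deviation of that supplier's invoice amounts.
    """
    by_supplier = defaultdict(list)
    for invoice in invoices:
        by_supplier[invoice["supplier"]].append(invoice)

    flagged = []
    for rows in by_supplier.values():
        amounts = [row["amount"] for row in rows]
        med = median(amounts)
        mad = median([abs(amount - med) for amount in amounts])
        if mad == 0:
            continue  # at least half the amounts equal the median; score undefined
        for row in rows:
            score = 0.6745 * (row["amount"] - med) / mad
            if abs(score) > threshold:
                flagged.append(row)
    return flagged


for invoice in flag_outliers(INVOICES):
    print(invoice["supplier"], invoice["amount"])
```

A flagged invoice is only a lead for a reporter to follow up, not evidence of fraud. An unusually large amount can have a perfectly ordinary explanation.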

Open data is valuable

There is no doubt that AI can contribute to better investigative journalism. However, as usual with data analytics, there is one problem: access to data in a uniform format. This is particularly evident in the municipalities, where the data comes from. It is often time-consuming and cumbersome for municipalities, which usually have limited resources, to compile and distribute data requested by journalists.

This is not just a practical problem. Ultimately, it is a problem for democracy if the work required for transparency becomes a burden on the public sector itself.

– I wish there was more access to data. Working with open data leads to more access. Open data is very valuable in many ways, including giving journalists more options to work with. That is why this is a trend that we have reason to be hopeful about, says Anders Thoresson.

Open data does more than increase the efficiency of organisations such as municipalities and local administrations that are obliged to release data to journalists and others. Standardised open data allows everyone to publish their data uniformly, providing a better and more consistent data structure. This enables many people to manage and analyse large amounts of data. Or to put it simply: standardised data makes it possible to use tools built for data, rather than having to work from PDFs.

The popular PDF format is poorly suited to many types of analysis. More structured data opens up opportunities for new kinds of analysis. Thoresson cites text analytics as an example of an interesting analytical method, using AI to structure and find patterns in otherwise unstructured text documents. Text analytics, too, becomes easier with a consistent and easy-to-use data format.

– There is a growing appreciation of the value of open data, although practical solutions are not yet in place. More people understand that the use of open data allows better monitoring of the data they have and its quality. It increases transparency, says Anders Thoresson.

Better analyses

So far, the pilot is in its early stages:

– We have started to develop relatively simple visualisations, for example to examine who the largest suppliers to a municipality are and what invoices a particular supplier sends.
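The summaries behind such visualisations can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not the pilot's code: the ledger rows are invented, and the field names `supplier`, `invoice_id` and `amount` are hypothetical rather than taken from any actual specification.

```python
from collections import defaultdict

# Invented ledger rows for illustration only. The field names are
# assumptions and do not come from any real municipal data set.
LEDGER = [
    {"supplier": "Acme Bygg", "invoice_id": "A-1", "amount": 120_000},
    {"supplier": "Nordic Kontor", "invoice_id": "N-1", "amount": 15_000},
    {"supplier": "Acme Bygg", "invoice_id": "A-2", "amount": 80_000},
    {"supplier": "Stadbolaget", "invoice_id": "S-1", "amount": 45_000},
    {"supplier": "Nordic Kontor", "invoice_id": "N-2", "amount": 5_000},
]


def largest_suppliers(ledger, top_n=3):
    """Return (supplier, total amount) pairs, largest total first."""
    totals = defaultdict(int)
    for row in ledger:
        totals[row["supplier"]] += row["amount"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


def invoices_for(ledger, supplier):
    """Return every ledger row sent by the given supplier, in ledger order."""
    return [row for row in ledger if row["supplier"] == supplier]


for name, total in largest_suppliers(LEDGER):
    print(f"{name:<15} {total:>10,}")

for row in invoices_for(LEDGER, "Acme Bygg"):
    print(row["invoice_id"], row["amount"])
```

The ranked totals could feed a bar chart, and the per-supplier list a detail view. Both steps rely on every row carrying the same fields, which is the point of a shared data format.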

Today, there is a specification written by MetaSolutions specifically for supplier ledger information, which makes it much easier for municipalities and local authorities to manage open data uniformly. In the long run, the sky is the limit. If similarly structured data from other municipalities is available, cross-municipal comparisons will be possible. Such comparisons, like other more sophisticated analyses, depend on access to data.

More broadly, the use of open data will allow journalists to examine Swedish society in a deeper and more comprehensive way than before, despite the fact that the number of journalists is decreasing. Democracy and journalism will jointly benefit.