How does News Mash work?

News Mash is software written in Python that runs inside a Docker container. The stories it generates are published to a WordPress website via REST API calls. Every thirty minutes or so, it runs through a loop to collect stories from various news sources. It then analyzes each story using natural language processing (NLP) to figure out what the story is about, and compares each story to all the other stories published around the same time. When it finds two stories similar enough to conclude that they cover the same subjects and events, the software combines the content of the two stories and then, using additional NLP strategies, extracts only the most important parts of the combined text into a new summary.
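To make the compare-and-summarize step more concrete, here is a minimal sketch using only the Python standard library. It is not News Mash's actual code: the bag-of-words cosine similarity, the frequency-based sentence scoring, the stop-word list, the 0.5 threshold, and every function name are assumptions chosen for illustration, since the real NLP techniques are not described here.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list and threshold, chosen only for this sketch.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "on", "for",
              "is", "was", "at", "by", "with", "that", "it", "as"}
SIMILARITY_THRESHOLD = 0.5


def term_counts(text):
    """Lowercase the text and count the words that are not stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def find_similar_pairs(stories, threshold=SIMILARITY_THRESHOLD):
    """Return index pairs (i, j) of stories whose similarity meets the threshold."""
    vectors = [term_counts(s) for s in stories]
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_similarity(vectors[i], vectors[j]) >= threshold:
                pairs.append((i, j))
    return pairs


def extractive_summary(text, max_sentences=3):
    """Keep the highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freqs = term_counts(text)

    def score(sentence):
        # Average document-wide frequency of the sentence's non-stop words.
        counts = term_counts(sentence)
        total = sum(counts.values())
        return sum(freqs[w] * c for w, c in counts.items()) / total if total else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

For a matched pair, the two story texts could be concatenated and passed to `extractive_summary`. A production system would more likely use TF-IDF weighting or sentence embeddings than raw word counts.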

This summary is then rewritten into a new article. This is an important point: the stories it generates are not copies or “spun” versions of any single source; they are new, novel content. With a little more processing, we get named entities out of the new story to use as WordPress tags, and a headline is generated automatically.
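As a rough illustration of the tagging step, the sketch below pulls candidate tags out of an article with a simple capitalization heuristic. This is a stand-in, not real named-entity recognition and not News Mash's method: a trained NER model would be the usual choice, and the function name `candidate_tags` and its parameters are hypothetical.

```python
import re
from collections import Counter


def candidate_tags(text, max_tags=5):
    """Return the most frequent capitalized phrases that do not start a sentence."""
    counts = Counter()
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Match runs of capitalized words, e.g. "New York" or "Reuters".
        for match in re.finditer(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b", sentence):
            if match.start() == 0:
                # Sentence-initial capitals are usually ordinary words, so skip them.
                # This also drops real names that happen to open a sentence.
                continue
            counts[match.group()] += 1
    return [phrase for phrase, _ in counts.most_common(max_tags)]
```

The heuristic misses entities that open a sentence and names with unusual casing, which is the kind of gap a proper NER model is meant to close.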

Once everything is ready, News Mash posts the article to the website using the WordPress REST API. A tweet is sent out automatically by automation software in WordPress. The Python application then goes to sleep for half an hour, wakes up, and does it all again.
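The publish-and-sleep cycle might look roughly like the following sketch. It assumes the standard WordPress `/wp-json/wp/v2/posts` endpoint and HTTP Basic authentication with a WordPress application password; how News Mash actually authenticates is not stated here. The function names, the `site_url` and credential parameters, and the `max_cycles` argument are all hypothetical.

```python
import base64
import json
import time
import urllib.request


def build_post_request(site_url, username, app_password, title, content, tag_ids=()):
    """Build (but do not send) a request that creates a published WordPress post."""
    payload = {
        "title": title,
        "content": content,
        "status": "publish",
        "tags": list(tag_ids),  # the posts endpoint expects numeric tag IDs, not names
    }
    credentials = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    return urllib.request.Request(
        url=site_url.rstrip("/") + "/wp-json/wp/v2/posts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Basic " + credentials,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def publish(request):
    """Send the request and return the decoded JSON response from WordPress."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def run_loop(cycle, interval_seconds=30 * 60, sleep=time.sleep, max_cycles=None):
    """Call cycle(), sleep for the interval, and repeat (forever if max_cycles is None)."""
    completed = 0
    while max_cycles is None or completed < max_cycles:
        cycle()
        completed += 1
        sleep(interval_seconds)
```

Here `cycle` would be a function that collects stories, builds the article, and calls `publish(build_post_request(...))`. Passing `sleep` and `max_cycles` as parameters is only a convenience for testing the loop without waiting half an hour.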

That is something of an oversimplification of the actual algorithm, but it’s more or less correct. 
