GOODABLE SUCCESS STORY

Bringing good news to thousands of readers around the world using AI

The story

When Goodable approached us, they had developed a mobile news app that spread positive news to thousands of readers around the world. We helped develop a real-time news classification engine that scans daily RSS feeds, news sites, and search crawlers to identify positive news and push it to readers on their platform.

About Goodable

Goodable supports mental health by providing readers around the world with positive news. Download the iOS or Android app from your app store today.

The challenge

Identifying positive news amid a deluge of negative articles is a significant challenge, particularly with a ratio of 21 negative articles to every positive one. Context compounds the problem: an article's apparent sentiment can reverse on closer examination, so a story that reads as positive at first may turn out to be negative, and vice versa.

To address this, advanced language processing technologies such as XLNet can be employed. XLNet models excel at deciphering intricate text relationships and discerning subtle sentiment cues. This enables more accurate categorization, especially in environments saturated with negative news.
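A quick calculation shows why the 21:1 ratio makes plain accuracy a misleading target and why recall on the positive class matters. The sketch below assumes a hypothetical day of 2,200 articles split at that ratio; the volume is illustrative and is not Goodable data.

```python
# Illustrative only: a hypothetical day of articles at the 21:1 ratio.
NEGATIVE_PER_POSITIVE = 21

total_articles = 2200  # assumed volume, chosen so the split is exact
positives = total_articles // (NEGATIVE_PER_POSITIVE + 1)
negatives = total_articles - positives

# A trivial classifier that labels every article "negative".
true_positives = 0
false_negatives = positives
true_negatives = negatives

accuracy = (true_positives + true_negatives) / total_articles
recall = true_positives / (true_positives + false_negatives)

print(f"positive share: {positives / total_articles:.1%}")
print(f"always-negative accuracy: {accuracy:.1%}")
print(f"always-negative recall: {recall:.1%}")
```

Under these assumptions a model that never finds a single positive article still scores above 95% accuracy, which is why a recall-oriented metric is the more informative one for this task.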

What are XLNet models?

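XLNet models are language models built on the transformer architecture. They are pretrained with a permutation-based autoregressive objective, which lets them draw on context from both sides of a word. This makes them well suited to understanding complex relationships within text and detecting subtle indicators of sentiment.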

How often does the algorithm run?

The inference pipeline runs once a day. A tracking mechanism monitors the precision, recall, and F1-score of the model's predictions over time. If these metrics fall below a predefined threshold (currently 85% for recall), the system automatically retrains the model to restore performance.
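As a minimal sketch of this check, the snippet below computes the three metrics from confusion-matrix counts and compares recall with the 85% threshold. The function names and the example counts are hypothetical; the production tracking code is not shown in this story.

```python
from dataclasses import dataclass

RECALL_THRESHOLD = 0.85  # from the text: retrain when recall drops below 85%


@dataclass
class Metrics:
    precision: float
    recall: float
    f1: float


def compute_metrics(tp: int, fp: int, fn: int) -> Metrics:
    """Precision, recall, and F1 for the positive class from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return Metrics(precision, recall, f1)


def needs_retraining(metrics: Metrics, threshold: float = RECALL_THRESHOLD) -> bool:
    """True when recall has fallen below the threshold."""
    return metrics.recall < threshold


# Hypothetical counts for two monitoring windows.
healthy = compute_metrics(tp=89, fp=10, fn=11)  # recall = 0.89
decayed = compute_metrics(tp=80, fp=10, fn=20)  # recall = 0.80

print(needs_retraining(healthy))
print(needs_retraining(decayed))
```

Only recall is compared with a threshold here because it is the only metric for which the text gives a value; thresholds for precision or F1 could be added in the same way.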

What is the high level architecture proposed?

The entire workflow is automated using Apache Airflow, consisting of two pipelines:

Inference Pipeline (scheduled daily)

This pipeline combines the data engineering and machine learning tasks in a single Airflow workflow. It runs once a day and performs the following steps:

Aggregates news articles from various online platforms and stores them in a PostgreSQL database.

Retrieves the latest model artifacts from an S3 bucket to use for predictions.

Utilizes Apache Spark for preprocessing tasks such as data cleaning and transformation.

Performs predictions using the model and saves the results to the backend database for further analysis.
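The four steps above can be sketched as plain Python functions run in order. This is an illustration of the sequencing only: in Airflow each function would be a separate task, and every body below is a stand-in (an in-memory list instead of PostgreSQL, a keyword rule instead of the model pulled from S3, string cleanup instead of Spark). All names and sample articles are hypothetical.

```python
from typing import Callable, Dict, List

Article = Dict[str, str]
Model = Callable[[str], str]


def fetch_articles() -> List[Article]:
    """Stand-in for aggregating articles and storing them in PostgreSQL."""
    return [
        {"id": "a1", "text": "  Volunteers rebuild a local school  "},
        {"id": "a2", "text": "Storm damage closes the highway"},
    ]


def load_model() -> Model:
    """Stand-in for pulling the latest model artifact from S3."""
    def toy_model(text: str) -> str:
        # Placeholder keyword rule, NOT the XLNet classifier.
        return "positive" if "rebuild" in text else "negative"
    return toy_model


def preprocess(articles: List[Article]) -> List[Article]:
    """Stand-in for the Spark cleaning and transformation step."""
    return [{**a, "text": a["text"].strip().lower()} for a in articles]


def predict_and_store(model: Model, articles: List[Article],
                      store: Dict[str, str]) -> None:
    """Score each article and write the label to the results store."""
    for article in articles:
        store[article["id"]] = model(article["text"])


def run_daily_inference(store: Dict[str, str]) -> None:
    """Run the four steps in the order listed above."""
    raw = fetch_articles()
    model = load_model()
    clean = preprocess(raw)
    predict_and_store(model, clean, store)


results: Dict[str, str] = {}
run_daily_inference(results)
print(results)
```

Keeping each step behind its own function mirrors how a workflow scheduler treats them: any one step can be retried or swapped out without touching the others.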

Model Training Pipeline (triggers based on performance decay)

This pipeline retrains the model when performance declines. Retraining is triggered when recall falls below 85%.

Batch vs Stream system

The system runs as a daily batch job rather than a stream. Articles are scraped throughout the day, and the model makes predictions on that day's dataset at the end of the day. This matches the business need, since news accumulates over the course of the day and does not have to be classified the moment it is collected.

How do we monitor the AI after launch to ensure we did a “good” job?

Experts review the model's predictions daily, with a particular focus on articles it labels "positive", since these are the highest-risk predictions. Comparing the model's predictions with the expert-reviewed classifications lets us monitor recall, the metric that measures how many of the truly positive articles the model identifies.

This recall score, which reflects how few positive articles the model misses, is reported daily.

The model achieved a recall of 89% on production data.
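A daily recall report of the kind described above could be assembled from pairs of model and expert labels, as in the sketch below. The record layout, dates, and labels are hypothetical and chosen only to make the grouping visible.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (review_date, model_label, expert_label); hypothetical records.
Review = Tuple[str, str, str]


def daily_recall(reviews: Iterable[Review]) -> Dict[str, float]:
    """Recall of the "positive" class per day, using expert labels as truth."""
    tp = defaultdict(int)
    fn = defaultdict(int)
    for day, predicted, actual in reviews:
        if actual != "positive":
            continue  # recall only counts articles the experts marked positive
        if predicted == "positive":
            tp[day] += 1
        else:
            fn[day] += 1
    days = sorted(set(tp) | set(fn))
    return {day: tp[day] / (tp[day] + fn[day]) for day in days}


reviews = [
    ("2024-01-01", "positive", "positive"),
    ("2024-01-01", "negative", "positive"),
    ("2024-01-01", "positive", "negative"),
    ("2024-01-02", "positive", "positive"),
]
report = daily_recall(reviews)
print(report)
```

Note that recall depends on expert labels for articles the model rejected as well as those it accepted. A review limited to the model's "positive" picks would measure precision instead.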

Results

+70%

Reduction in fake news, verified by the Goodable Content team

+94%

Accuracy achieved, verified by the Goodable Content team

200,000+

News articles and posts used for training
