
Lead Data Scientist

Storyful
Full-time
On-site
Dublin, Ireland

Storyful is an equal opportunity employer

Job Description:

Job Title: Lead Data Scientist
Location: Dublin (Hybrid — 3 days/week in office)

Reporting to: CPTO (Storyful)


About Storyful:

Storyful (a News Corp company) helps organisations discover, verify, and act on information at speed — from breaking news to complex reputation and narrative risk.

We’re building RiskRadar: an AI-native, decision-ready Narrative & Reputational Risk platform that fuses signals from licensed content, social, and broadcast/video into explainable scoring and clear next actions for high-scrutiny brands — including financial services buyers with serious governance expectations.

This is a pivotal hire: the first Data Scientist at Storyful. You’ll set the bar for how we build trustworthy ML/AI that ships.

Role Mission

Build and productionise the core data science capabilities behind RiskRadar, specifically:

  • a robust, explainable Reputation / Narrative Risk scoring system, and

  • entity resolution across messy, multi-source datasets (brands, products, execs, organisations, locations, subsidiaries).

Bonus (strategic): help us explore vision + multimodal models to detect manipulated / misleading video for our Newswire verification workflows.

What You’ll Do (Hands-on Responsibilities)

1) Reputation & Narrative Risk Scoring (Explainable + Governed)

  • Design a score framework (overall + sub-scores) that is measurable over time, resistant to gaming, and useful in executive decision-making.
     

  • Build scoring that can answer: “What changed, why, and what evidence supports this?”
     

  • Implement calibration, confidence, and uncertainty so customers understand when to trust the model and when to escalate to humans.
     

  • Set up model governance patterns aligned to model risk management expectations (audit trails, versioning, monitoring, documentation, change control).
     

2) Entity Resolution (Truth Layer Across Sources)

  • Build entity resolution pipelines that unify entities across licensed news, social, and broadcast/video metadata.
     

  • Combine the right approaches: rules where they win, probabilistic matching where it matters, embeddings/LLM-assisted linking where it scales — always with measurable quality.
     

  • Establish golden datasets, error taxonomies, and repeatable evaluation so entity resolution improves continuously.
     

3) Ship ML Like a SaaS Operator (Fast Experiments → Production)

  • Run crisp experiments: hypotheses, baselines, metrics, iteration loops — and kill weak ideas early.
     

  • Partner with Product + Engineering to take ML/AI features to production quickly, safely, and cost-effectively.
     

  • Build evaluation harnesses for ML and LLM components (offline + human review + online measurement).
     

  • Implement production standards: monitoring, drift detection, cost/latency controls, incident playbooks, and quality dashboards.
     

4) Founding DS Leadership (Today: IC; Soon: Build the Team)

  • Establish best practices for experimentation, reproducibility, documentation, and responsible AI.
     

  • Lead cross-functional delivery with Product, Engineering, and AI teams across News Corp.
     

  • Within 12 months: help hire and mentor 1–2 additional DS/ML team members (scope dependent), while staying hands-on.
     

5) Bonus: Video Verification ML (Vision / Multimodal)

  • Prototype approaches for manipulated media detection and video authenticity signals.
     

  • Translate “model output” into a verification workflow with clear confidence and evidence, not black-box answers.

Tech & Working Environment

  • Cloud: AWS
     

  • LLM stack: LangChain (or equivalent patterns), Langfuse for tracing/observability, modern LLM APIs
     

  • Core: Python, SQL, data pipelines, model packaging + CI/CD
     

  • You’ll work closely with our AI Architect, Product, Engineering, and verification experts. 

What You’ll Bring (Requirements)

Must-have

  • 7+ years in applied Data Science / ML, with multiple production deployments in a commercial environment (SaaS strongly preferred).
     

  • Proven experience leading cross-functional teams to deliver production-ready ML/AI (even if you weren’t the people manager).
     

  • Strong grounding in: classification/scoring/ranking, NLP (and/or LLM applications), statistics, evaluation, and experimentation.
     

  • Demonstrated ability to build explainable systems: not just performance, but transparency, evidence, and user trust.
     

  • Experience designing evaluation strategies: labeled datasets, human-in-the-loop review, acceptance thresholds, monitoring and drift.
     

  • Comfortable operating in ambiguity with high ownership and high pace.
     

Strong advantage

  • Deep experience with entity resolution / record linkage at scale (probabilistic matching, embeddings, graph-based approaches).
     

  • Experience building for regulated / high-governance contexts (financial services is a plus): auditability, documentation, controls.
     

  • LLM evaluation and reliability methods (prompt eval, retrieval eval, hallucination mitigation, guardrails).
     

  • Computer vision / multimodal experience, especially around authenticity, manipulation detection, or media forensics.

What Success Looks Like

First 90 days

  • Clear score and entity resolution strategy, baselines, datasets, and metrics agreed with Product/Engineering.
     

  • First production-ready scoring and entity resolution increments shipped behind feature flags.
     

  • Evaluation + monitoring foundations in place (including Langfuse tracing standards for LLM workflows).
     

By 6 months

  • Explainable scoring system with evidence trails, confidence, and drift monitoring live for real users.
     

  • Entity resolution quality improving on a measurable cadence (golden set + error reduction plan).
     

  • Governance pack in place suitable for financial services buyers (documentation, audit trails, change controls).
     

By 12 months

  • Mature experimentation-to-production loop: faster iteration, lower incident rate, clearer model performance visibility.
     

  • You’ve begun mentoring/hiring to expand DS capability while remaining a hands-on technical leader.
     

  • Optional: early vision/multimodal verification prototypes validated with newsroom workflows.

Why This Role is Different

  • You’re not inheriting a mature DS org — you’re founding it.
     

  • You’ll build AI that executives will rely on under pressure — where explainability and governance aren’t “nice to have”, they’re the product.
     

  • You’ll ship. A lot. And you’ll help define what “good” looks like for Storyful’s AI future.
     

Job Category:

Storyful - Product & Technology