    How We Validate Our Matching Model

    Rigorous holdout testing on real-world funding data

    The Challenge

    Most foundation matching tools rely on keyword or semantic search to connect organizations with potential funders. These approaches match based on what foundations and nonprofits say about themselves. SciRise takes a fundamentally different approach: we analyze actual funding relationships to predict where money will flow next.

    But how do you know if a prediction model actually works? You test it on real data it has never seen before.

    How We Test

    We trained our model on historical IRS 990 and 990-PF filing data, then tested whether it could predict new funding relationships that actually appeared in a future year the model never saw. Critically, we only count genuinely new relationships — if a foundation already funded an organization in the training period, a repeat grant doesn't count. The model has to identify funders the organization has never received money from before.
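    The "new relationships only" rule can be sketched in a few lines. This is an illustrative reconstruction, not SciRise's actual pipeline: it assumes grants arrive as (foundation, nonprofit, year) tuples, and all identifiers are made up.

```python
# Sketch: temporal holdout keeping only genuinely new funding relationships.
# Assumes grants are (foundation_id, nonprofit_id, year) tuples; names illustrative.

def split_new_relationships(grants, test_year):
    """Return training grants plus the set of test-year (foundation, nonprofit)
    pairs that are new, i.e. the pair never appeared in any training year."""
    train = [g for g in grants if g[2] < test_year]
    seen_pairs = {(f, n) for f, n, _ in train}
    new_pairs = {(f, n) for f, n, y in grants
                 if y == test_year and (f, n) not in seen_pairs}
    return train, new_pairs

grants = [
    ("fdn_A", "org_1", 2021),
    ("fdn_A", "org_1", 2022),   # repeat grant: excluded from the test set
    ("fdn_B", "org_1", 2022),   # genuinely new relationship: kept
]
train, new_pairs = split_new_relationships(grants, test_year=2022)
# new_pairs contains only ("fdn_B", "org_1")
```

    Because the training set is cut strictly before the test year, a repeat grant can never count as a success, which is the point of the design.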

    Validation Design

    • Training data: Multiple years of IRS 990/990-PF filings covering 4.8M+ grant relationships
    • Test data: A future year, never seen during model development
    • Success metric: Did at least one of the model's top 10 recommendations appear as a new funder in the test year?
    • Robustness: Validated across multiple independent test periods with consistent results
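    The success metric above (hit rate in the top 10, reported over the full population) can be expressed as a short function. This is a sketch under the stated assumptions, with invented identifiers; nonprofits with zero new funders count as misses rather than being dropped.

```python
# Sketch: top-k hit rate over the full nonprofit population, including
# nonprofits that received no new funders (counted as misses). Names illustrative.

def hit_rate_at_k(recommendations, new_funders_by_org, k=10):
    """recommendations: org -> ranked list of foundation ids.
    new_funders_by_org: org -> set of foundations that newly funded it."""
    hits = 0
    for org, ranked in recommendations.items():
        actual = new_funders_by_org.get(org, set())
        if any(f in actual for f in ranked[:k]):
            hits += 1
    return hits / len(recommendations)

recs = {
    "org_1": ["fdn_B", "fdn_C"],
    "org_2": ["fdn_D"],          # received no new funders: a miss
}
new = {"org_1": {"fdn_B"}}
print(hit_rate_at_k(recs, new, k=10))  # → 0.5
```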

    Results

    Our network model achieved a 30% hit rate in the top 10 recommendations. That means for nearly 1 in 3 nonprofits, at least one of the model's top 10 suggested foundations actually became a new funder in the following year.

    • 30%: hit rate in the top 10
    • 12x: more predictive than AI matching
    • p < 0.0001: statistical significance

    By comparison, the best AI mission-matching and keyword models performed far worse on the same test. Our network model achieved a 12x higher hit rate, and the difference is statistically significant at p < 0.0001. The baseline uses state-of-the-art AI text embeddings to match foundation and nonprofit profiles, which is a stronger comparison than keyword search or category matching.
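    The page does not say which statistical test was used. For paired binary outcomes like per-nonprofit hit indicators scored head-to-head, one standard choice is an exact McNemar test on the discordant pairs; the sketch below assumes that setup, and the data is invented.

```python
# Sketch: exact McNemar test comparing two models' per-nonprofit hit
# indicators. An assumption about the test used, not SciRise's published method.
from math import comb

def mcnemar_exact(hits_a, hits_b):
    """hits_a, hits_b: parallel 0/1 hit indicators per nonprofit.
    Returns the two-sided exact p-value on the discordant pairs."""
    b = sum(1 for x, y in zip(hits_a, hits_b) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(hits_a, hits_b) if x == 0 and y == 1)
    n = b + c
    if n == 0:
        return 1.0  # no disagreements: no evidence either way
    k = min(b, c)
    # Double one tail of Binomial(n, 0.5), capped at 1.
    p = 2 * sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, p)

network = [1, 1, 0, 1, 0, 1, 1, 0]
baseline = [0, 0, 0, 1, 0, 0, 1, 0]
print(mcnemar_exact(network, baseline))  # → 0.25
```

    With thousands of nonprofits scored by both models on identical data, even a modest per-organization advantage compounds into a very small p-value.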

    Why This Matters

    A 30% hit rate means the model isn't guessing. When SciRise surfaces a foundation as a top prospect, there's a meaningful, validated probability that foundation will fund an organization like yours. This is the difference between a search tool and a prediction engine.

    What Makes This Rigorous

    • Strict separation: The model never sees the test year during training, preventing data leakage
    • New relationships only: Repeat grants are excluded, so the model must predict genuinely new funding
    • Head-to-head comparison: Every nonprofit is scored by both the network model and the search baseline on the same data, enabling direct comparison
    • Consistent results: Results replicate across multiple independent test periods
    • Full-population reporting: Metrics include nonprofits that received zero new funders, providing unbiased estimates

    Data Foundation

    Our model is built on over 1 million grant relationships extracted from IRS 990 and 990-PF filings, connecting approximately 10,000 foundations to over 180,000 nonprofits. This is not a sample or estimate. Every relationship represents an actual grant reported to the IRS.

    We map every private foundation distributing $1M or more annually, giving SciRise the most comprehensive view of foundation giving available.
