INDEX
    Explanations

    mentions of numbers at the beginning of phrases, potentially related to rankings or quantities

    expressions of strong opinions or sentiments

    Negative Logits
    "economic"        -0.63
    " Government"     -0.63
    " Welfare"        -0.62
    " welfare"        -0.62
    " administr"      -0.61
    " preventive"     -0.60
    " withdrawing"    -0.60
    " lawful"         -0.59
    " unlawfully"     -0.59
    " Employ"         -0.59

    Positive Logits
    " cinematic"       0.80
    " sequels"         0.76
    " Collider"        0.74
    " hilar"           0.73
    " cameo"           0.73
    " teased"          0.73
    " soundtrack"      0.72
    " laughs"          0.71
    " anthology"       0.70
    " premie"          0.70

    Activation Density: 3.160%

    No Known Activations