INDEX
    Explanations

    phrases indicating leading, guiding, or directing

    phrases indicating causation or influence

    Negative Logits
    Token            Logit
    "anooga"         -0.68
    "TB"             -0.65
    " roy"           -0.63
    "sle"            -0.62
    "checks"         -0.61
    "RP"             -0.59
    " spawned"       -0.59
    " multiplier"    -0.58
    "anni"           -0.58
    " squeezed"      -0.58
    Positive Logits
    Token            Logit
    " believe"       0.87
    " conclude"      0.85
    " conclusions"   0.81
    " Oliv"          0.74
    " realize"       0.73
    " realise"       0.73
    "ãĤ©"            0.73
    " pursue"        0.71
    " discover"      0.71
    "ixel"           0.70
    Activation Density: 0.164%

    No Known Activations