INDEX
    Explanations

    words related to manipulation or influence

    references to the concept of "spin" in various contexts

    Negative Logits
    Token          Logit
    "avis"         -0.72
    " Scores"      -0.68
    " Commodore"   -0.67
    " Admir"       -0.65
    "ecause"       -0.63
    "inez"         -0.63
    "ablish"       -0.63
    " Mellon"      -0.62
    "inances"      -0.62
    "enance"       -0.62
    Positive Logits
    Token          Logit
    "ners"         1.41
    " spin"        1.04
    "kered"        0.98
    "eless"        0.92
    "ned"          0.89
    " yarn"        0.89
    "spin"         0.88
    "wheel"        0.85
    "ingen"        0.84
    "ball"         0.83
    Activation Density: 0.017%
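
The page does not define the density figure above. A minimal sketch of one common reading, assuming it is the fraction of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means active tokens / total tokens sampled.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 17 of which activate the feature.
acts = [0.0] * 100_000
for i in range(17):
    acts[i * 5000] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.017%
```

Under this reading, a density of 0.017% corresponds to roughly 17 active tokens per 100,000 sampled, so the feature fires very rarely.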

    No Known Activations