INDEX
    Explanations

    instances of the word "stupid" in various contexts

    Negative Logits
    Token              Logit
    "garh"             -0.48
    "Projection"       -0.46
    " Bet"             -0.45
    "RegressionTest"   -0.44
    " cracks"          -0.44
    "راق"              -0.44
    "reggio"           -0.43
    "↵↵"               -0.42
    "Cleared"          -0.42
    "Crack"            -0.42
    Positive Logits
    Token              Logit
    " Stupid"          1.15
    " stupid"          1.14
    "Stupid"           1.05
    "stupid"           1.03
    " fools"           0.97
    " foolish"         0.96
    " fool"            0.94
    " dumb"            0.92
    " Accurate"        0.91
    " accurate"        0.91
    Act Density 0.057%

    No Known Activations