INDEX
    Explanations
    Negative Logits
    Token         Logit
    "athi"        -0.71
    "STE"         -0.70
    " VIDEOS"     -0.69
    "MENTS"       -0.69
    "Behind"      -0.68
    "Bas"         -0.63
    "nor"         -0.63
    "grim"        -0.60
    "BLE"         -0.59
    "BOX"         -0.58
    Positive Logits
    Token              Logit
    " translates"      1.06
    " resulted"        1.05
    " incidentally"    1.04
    " includes"        1.03
    " comprises"       1.02
    " consists"        1.00
    " culminated"      0.97
    " consisted"       0.95
    " brings"          0.94
    " prompts"         0.93
    Activation Density: 0.477%

    No Known Activations