    Explanations

    phrases related to impactful actions or events

    Negative Logits
     himo           -0.49
     tev            -0.47
     mikrofon       -0.47
                    -0.45
    WAL             -0.44
    encodeWith      -0.43
     RUS            -0.43
    DOUT            -0.43
     Craw           -0.42
    Craw            -0.42
    Positive Logits
     strike         1.29
     struck         1.17
     strikes        1.16
    strike          1.10
     Strike         1.03
     Strikes        1.03
     striking       1.02
    Strike          1.00
     Striking       0.90
    struck          0.83
    Act Density 0.071%

    No Known Activations