INDEX
    Explanations

    references to individual or group identities and their interactions in various contexts

    Negative Logits
    " becomes"    -0.22
    " begins"     -0.18
    " eaten"      -0.17
    " dies"       -0.17
    " начинает"   -0.17
    " discovers"  -0.17
    " emerges"    -0.16
    "start"       -0.16
    " blir"       -0.16
    " learns"     -0.16
    POSITIVE LOGITS
    " recently"     0.25
    "recent"        0.24
    " currently"    0.20
    " started"      0.20
    " began"        0.20
    " established"  0.19
    " runs"         0.18
    " Runs"         0.18
    "runs"          0.18
    " Recently"     0.18
    Act Density 0.947%

    No Known Activations