INDEX
    Explanations
    Negative Logits
    Token               Logit
    "kkelen"            -0.59
    "катерина"          -0.58
    " douceur"          -0.57
    " пунктов"          -0.54
    " <<<<<<<<<<<<<<"   -0.53
    "COGN"              -0.52
    " Cristóbal"        -0.51
    "theless"           -0.51
    " Antrieb"          -0.51
    " ogrodow"          -0.51
    Positive Logits
    Token               Logit
    " die"              4.45
    "Die"               3.67
    " Die"              3.65
    " DIE"              3.54
    "die"               3.47
    " dies"             3.09
    "DIE"               2.96
    " died"             2.53
    " Dies"             2.36
    " dying"            2.32
    Activation Density: 0.047%

    No Known Activations