    Negative Logits
    Token       Logit
    "{"         -3.31
    " Then"     -2.98
    " Easily"   -2.75
    "9"         -2.64
    "3"         -2.64
    " But"      -2.63
    "1"         -2.63
    (blank)     -2.41
    "Ideally"   -2.39
    " :"        -2.34
    Positive Logits
    Token       Logit
    " theres"   3.09
    (blank)     2.50
    (blank)     2.47
    (blank)     2.44
    "ing"       2.44
    "yelitis"   2.42
    " to"       2.41
    (blank)     2.31
    (blank)     2.30
    " декабря"  2.28
    Activation Density: 0.012%

    No Known Activations