    Negative Logits
    .Set          -0.09
    i             -0.08
     ID           -0.08
     lift         -0.07
     line         -0.07
    ise           -0.07
    -positive     -0.07
    DATE          -0.07
     ×            -0.07
     Rate         -0.07
    Positive Logits
     another       0.22
    another        0.18
     Another       0.15
    Another        0.13
     otro          0.08
                   0.07
     друг          0.07
    另外           0.07
    "a             0.07
                   0.07
    Act Density 0.029%

    No Known Activations