    Negative Logits
    Token             Logit
    "SBATCH"          -0.57
    "dafx"            -0.56
    " avoient"        -0.55
    "URLException"    -0.53
    " Gleaner"        -0.53
    "enoord"          -0.50
    " wuß"            -0.49
    " toalha"         -0.49
    "ThemeProvider"   -0.49
    "Graeme"          -0.48
    Positive Logits
    Token             Logit
    " inside"         2.13
    "inside"          2.08
    "Inside"          2.02
    " Inside"         1.96
    " INSIDE"         1.86
    "INSIDE"          1.66
    " внутри"         1.40
    " داخل"           1.30
    " dentro"         1.28
    "Dentro"          1.19
    Activation Density: 0.009%

    No Known Activations