INDEX
    Explanations
    Negative Logits
    " OBITUARY"  -0.81
    " Coder"     -0.77
    " dedos"     -0.77
    " do"        -0.76
    "ASU"        -0.75
    " Qatar"     -0.74
    " lambda"    -0.73
    "ינו"        -0.72
    "schirm"     -0.72
    " Tweede"    -0.72
    Positive Logits
    " Int"         1.10
    "Int"          0.99
    "describing"   0.98
    " terminator"  0.86
    "exactly"      0.86
    "itando"       0.80
    " exactly"     0.79
    "stance"       0.79
    " Double"      0.78
    "animaux"      0.78
    Activation Density: 0.042%

    No Known Activations