INDEX
    Explanations
    Negative Logits
    Token          Logit
    " BoxFit"      -0.55
    "Kok"          -0.51
    "ussis"        -0.48
    " Ferrer"      -0.47
    " Brum"        -0.46
    " Barbier"     -0.46
    " TNT"         -0.46
    " Kok"         -0.46
    "BCC"          -0.45
    "dolce"        -0.45
    Positive Logits
    Token          Logit
    " Kyle"         2.08
    "Kyle"          1.95
    "kyle"          1.30
    "YLE"           0.83
    "yle"           0.77
    " Lyle"         0.69
    " Myles"        0.60
    " Ryan"         0.58
    " Ryo"          0.57
    " violación"    0.56
    Activation Density: 0.002%

    No Known Activations