INDEX
    Explanations

    repeated names or references to individuals

    Negative Logits
    " offline"      -0.74
    " Korra"        -0.73
    " Manip"        -0.68
    " overseas"     -0.68
    " wow"          -0.67
    " Important"    -0.66
    " Yoga"         -0.64
    " Naruto"       -0.63
    " Rebirth"      -0.63
    " numbered"     -0.63
    Positive Logits
    "isner"    1.55
    "idel"     1.35
    "iber"     1.32
    "ffer"     1.27
    "iman"     1.26
    "hner"     1.26
    "aney"     1.24
    "isel"     1.23
    "cker"     1.20
    "hn"       1.19
    Activation Density 0.081%

    No Known Activations