INDEX
    Explanations

    violation / breaking rules

    Negative Logits
    Token            Logit
    "them"           -0.60
    " nó"            -0.59
    " them"          -0.58
    " it"            -0.58
    " آن"            -0.57
    " antaranya"     -0.50
    " or"            -0.49
    " dalamnya"      -0.48
    "آن"             -0.47
    " ellos"         -0.47
    Positive Logits
    Token            Logit
    "oa̍t"            0.91
    " незавершена"   0.77
    " purpoſe"       0.72
    " the"           0.70
    "rungsseite"     0.68
    " Италијани"     0.67
    "éraire"         0.65
    "ainville"       0.65
    "apimachinery"   0.65
    " deſt"          0.65
    Activation Density: 0.002%

    No Known Activations