INDEX
    Explanations
    Negative Logits
    Token          Logit
    "Mijn"         -0.60
    "是我的"       -0.60
    " meine"       -0.59
    "Our"          -0.59
    "我が家の"     -0.58
    "僕の"         -0.57
    " моя"         -0.56
    "pædia"        -0.56
    " Our"         -0.56
    " nossas"      -0.56
    Positive Logits
    Token          Logit
    " him"         3.84
    "him"          2.75
    " Him"         2.49
    " them"        2.39
    "Him"          2.37
    "them"         2.16
    " HIM"         2.07
    " THEM"        1.91
    " Them"        1.89
    "HIM"          1.84
    Activation Density: 0.093%

    No Known Activations