    Explanations

    personal pronouns in different languages

    Negative Logits
    Token            Logit
    "seamnă"         -0.65
    "そういった"     -0.51
    " Normdatei"     -0.50
    " Савезне"       -0.48
    "ureusement"     -0.47
    " quenching"     -0.46
    " Arund"         -0.46
    "betical"        -0.46
    "Portale"        -0.46
    " intrusions"    -0.45

    Positive Logits
    Token            Logit
    " he"            1.84
    " He"            1.58
    " she"           1.57
    "He"             1.52
    " он"            1.48
    " his"           1.45
    "Он"             1.38
    "但他"           1.37
    " himself"       1.37
    " they"          1.35

    Activation Density: 0.062%

    No Known Activations