INDEX
    Explanations

    phrases related to personal opinions and reflections on experiences

    Negative Logits
    Token          Logit
    "彼の"         -0.89
    "him"          -0.76
    "彼女の"       -0.75
    " dessen"      -0.69
    "them"         -0.67
    "Его"          -0.67
    "彼が"         -0.67
    "把他"         -0.65
    " ihn"         -0.65
    "将他"         -0.65
    Positive Logits
    Token          Logit
    " he"          0.96
    " she"         0.75
    " they"        0.63
    " we"          0.56
    "actéristi"    0.52
    "bildēt"       0.52
    " you"         0.51
    " mình"        0.50
                   0.46
    " tantôt"      0.45
    Activation Density 2.044%

    No Known Activations
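
The density figure above is usually read as the fraction of sampled tokens on which the feature has a nonzero activation. A minimal sketch under that assumption follows. The helper name `activation_density`, the threshold of 0.0, and the sample activations are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

# Toy per-token activations (made up): the feature fires on 2 of 8 tokens.
acts = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.2, 0.0]
density = activation_density(acts)
print(f"{density:.3%}")  # 25.000%
```

By the same arithmetic, a density of 2.044% would correspond to the feature firing on roughly 2 of every 100 sampled tokens.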