INDEX
    Explanations

    references to a specific character, predominantly focusing on their actions and qualities

    Negative Logits
        Token         Logit
        "hommes"      -0.61
        " partial"    -0.60
        "getError"    -0.59
        " Drago"      -0.57
        " ín"         -0.56
        " ados"       -0.56
        " Frick"      -0.56
        "uksi"        -0.55
        " options"    -0.55
        "partial"     -0.55
    Positive Logits
        Token         Logit
        " she"        1.94
        "She"         1.80
        "she"         1.72
        " She"        1.69
        " SHE"        1.48
        "SHE"         1.42
        " shes"       1.40
        " herself"    1.37
        "但她"        1.11
        " her"        1.10
    Activation Density: 0.052%
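
A density figure like the one above is commonly read as the fraction of token positions on which the feature fires. The sketch below assumes that reading. The function name `activation_density`, the zero threshold, and the sample data are hypothetical and chosen only to reproduce a value of the same size; they do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 52 of which activate the feature.
sample = [0.0] * 99_948 + [1.0] * 52
print(f"{activation_density(sample):.3%}")  # prints 0.052%
```

Under this reading, a density of 0.052% means the feature is active on roughly 5 of every 10,000 tokens, which marks it as a sparse feature.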

    No Known Activations