INDEX
    Explanations

    references to physical appearances and clothing

    Negative Logits
    ifo       -0.16
    .ce       -0.16
    udit      -0.15
     æŃ       -0.14
    veral     -0.14
    arga      -0.14
    ound      -0.14
    alem      -0.14
    losure    -0.13
    abras     -0.13

    Positive Logits
     Charge   0.16
    kili      0.16
    undy      0.15
     McGu     0.15
     poste    0.15
     whom     0.15
    露出      0.15
    Role      0.15
    tog       0.15
     charge   0.15
    Activation Density: 0.159%

    No Known Activations