INDEX
    Explanations

    words related to trust and betrayal in a confrontational or survival context

    Negative Logits
     m
    -0.44
    la
    -0.42
     ante
    -0.42
    -0.42
     Do
    -0.42
    chen
    -0.41
    Ignore
    -0.41
     i
    -0.41
    ${
    -0.41
     M
    -0.40
    Positive Logits
     للمعارف
    1.16
     للاسماء
    0.95
     تانيه
    0.95
     myſelf
    0.88
    Personensuche
    0.88
     ſche
    0.82
     themſelves
    0.81
     fhew
    0.81
     Numerade
    0.79
     poffe
    0.79
    Activation Density: 0.010%

    No Known Activations