INDEX
    Explanations

    words that suggest strong emotions or social dynamics

    Negative Logits
    ppard         -0.18
    ayed          -0.17
     Friedrich    -0.15
    rint          -0.15
    pper          -0.14
    RNA           -0.14
    aying         -0.14
    adır          -0.14
    ening         -0.14
    IFF           -0.14
    Positive Logits
    achs              0.17
    .AddListener      0.16
    νι                0.16
    mai               0.15
    alli              0.15
    tam               0.14
    ains              0.14
    ìĽħ               0.14
    ekim              0.14
    .ActionListener   0.13
    Activation Density: 0.002%

    No Known Activations