INDEX
    Explanations

    references to gossip or rumors

    Negative Logits
    adows     -0.16
    arth      -0.16
    bé        -0.16
    ARTH      -0.15
    aylor     -0.15
    erna      -0.15
     wire     -0.14
    lıģa      -0.14
    icts      -0.14
    ipc       -0.14
    Positive Logits
    oured     0.32
    blings    0.31
    our       0.29
    ination   0.28
    pled      0.26
    bling     0.25
    ours      0.25
    mage      0.24
    ble       0.24
    inate     0.23
    Act Density 0.003%

    No Known Activations