INDEX
    Explanations

    expressions of desire or the word "want"

    Negative Logits
    .localization   -0.16
    ussen           -0.15
    _cs             -0.14
    antro           -0.14
    PLE             -0.14
    ÏĥοÏħ           -0.14
    бом             -0.14
     Kaw            -0.14
    PIO             -0.13
    illi            -0.13
    Positive Logits
    ä¸įåΰ           0.16
    full            0.14
    oco             0.14
    ÏĩÏİ            0.14
    llvm            0.14
    aida            0.14
    anggal          0.13
     todo           0.13
    aget            0.13
    olved           0.13
    Activation Density: 0.070%

    No Known Activations