    Explanations

    expressions of certainty or affirmation

    Negative Logits
    Token            Logit
    "/Foundation"    -0.17
    "olland"         -0.17
    "insula"         -0.17
    "ãĤ·ãĥ§ãĥ³"      -0.15
    "iram"           -0.15
    "ogui"           -0.14
    "abaj"           -0.14
    "rending"        -0.14
    " Romantic"      -0.14
    "brain"          -0.14
    Positive Logits
    Token            Logit
    "antee"          0.17
    "acket"          0.15
    "antom"          0.15
    "TEE"            0.14
    "uw"             0.14
    "emon"           0.14
    " âĵĺ"           0.14
    "eee"            0.14
    "emin"           0.14
    "Dst"            0.14
    Activation Density: 0.039%

    No Known Activations