    Explanations

    colloquial expressions and variations of the word "lo."

    Negative Logits
    Token   Logit
    ãĥ³     -0.28
    gard    -0.21
    o       -0.21
    g       -0.19
    gien    -0.18
    nul     -0.18
    dum     -0.18
    tod     -0.17
    y       -0.17
    d       -0.17

    Positive Logits
    Token   Logit
    ped     0.27
    path    0.23
    idy     0.22
    ping    0.22
    rent    0.21
    ren     0.21
    pad     0.21
    so      0.20
    ret     0.20
    rem     0.19
    Activation Density: 0.019%

    No Known Activations