INDEX
    Explanations

    words indicating similarity or comparison

    Negative Logits
    Token         Logit
    "/cache"      -0.15
    "amient"      -0.14
    "phere"       -0.14
    " Kak"        -0.13
    " kle"        -0.13
    "ker"         -0.13
    " Ti"         -0.13
    " kiss"       -0.13
    " Cached"     -0.13
    "bash"        -0.13
    Positive Logits
    Token         Logit
    "eldo"        0.18
    "ÑĸÑĢ"        0.16
    " Yön"        0.15
    "-Sah"        0.15
    " YYS"        0.15
    "tones"       0.15
    "/******/"    0.15
    "/of"         0.14
    "_literals"   0.14
    " salopes"    0.14
    Activation Density: 0.183%

    No Known Activations