INDEX
    Explanations

    mentions of lost items or pets

    Negative Logits
    aza         -0.17
    hiro        -0.17
    _ALIGNMENT  -0.15
    ods         -0.15
     ì¡         -0.15
    uelles      -0.14
    urf         -0.14
    beth        -0.14
    forced      -0.14
    erais       -0.13
    Positive Logits
    inton       0.15
     voc        0.15
    ози         0.14
    omit        0.14
    _pemb       0.14
    ·»          0.14
     dist       0.14
    .rpm        0.14
     dil        0.14
     Brock      0.14
    Act Density 0.141%

    No Known Activations