INDEX
    Explanations

    expressions that emphasize comparison or highlight a sense of value or preference

    Negative Logits
    Token           Logit
    " Pap"          -0.16
    "êm"            -0.15
    " POT"          -0.15
    "itchen"        -0.15
    "ronic"         -0.15
    " Pert"         -0.14
    "ACS"           -0.14
    "sie"           -0.14
    "itel"          -0.14
    " Partition"    -0.14
    Positive Logits
    Token           Logit
    "Nothing"       0.21
    " Nothing"      0.20
    "nothing"       0.19
    " NOTHING"      0.19
    " nothing"      0.18
    "HING"          0.17
    "elda"          0.16
    "reed"          0.15
    " нич"          0.15
    "835"           0.14
    Act Density 0.058%

    No Known Activations