INDEX
    Explanations

    words or phrases related to negation or absence

    Negative Logits
    Token       Logit
    iant        -0.16
    dale        -0.15
    nen         -0.15
    onest       -0.15
    εί          -0.15
    eline       -0.14
    ernetes     -0.14
     Macros     -0.14
    reet        -0.14
    esto        -0.14
    Positive Logits
    Token       Logit
    olian       0.24
    ither       0.22
     lect       0.21
    aten        0.20
     vents      0.19
     xp         0.19
    asier       0.19
    ager        0.19
    uron        0.19
    vidence     0.18
    Activation Density: 0.051%

    No Known Activations