INDEX
    Explanations

    phrases expressing contrasts or conditions

    Negative Logits
    atern
    -0.15
    ppo
    -0.14
    ksam
    -0.14
     kvin
    -0.14
    ypad
    -0.14
     additional
    -0.14
    ieux
    -0.14
     azal
    -0.13
     escorte
    -0.13
    .additional
    -0.13
    Positive Logits
     nor
    0.20
     plenty
    0.19
     nonetheless
    0.18
    Nevertheless
    0.18
     nevertheless
    0.18
    nor
    0.16
    却
    0.15
    enek
    0.15
     Plenty
    0.15
     sí
    0.15
    Act Density 0.197%

    No Known Activations