INDEX
    Explanations

    instances of the word "news" in various contexts

    Negative Logits
    ci            -0.19
    ndon          -0.18
    ноз           -0.18
    neau          -0.17
    ex            -0.16
    c             -0.15
    ransition     -0.15
    vt            -0.15
    zelf          -0.15
    FLAGS         -0.14
    Positive Logits
    letters        0.20
    rp             0.19
    lever          0.16
    rising         0.16
    flix           0.16
    oleÄį          0.16
    nika           0.15
    reader         0.15
    lobber         0.15
    Č              0.15
    Act Density 0.034%

    No Known Activations