INDEX
    Explanations

    instances of the word "never" in various contexts

    NEGATIVE LOGITS
    Token (quoted)        Logit
    " Muffins"            -0.84
    ":]:"                 -0.81
    " للمعارف"            -0.77
    "StateToProps"        -0.75
    "stdc"                -0.75
    "KommentareTeilen"    -0.74
    "epiece"              -0.74
    "pB"                  -0.74
    "vidia"               -0.72
    "raszam"              -0.72
    POSITIVE LOGITS
    Token (quoted)        Logit
    " never"              1.75
    " Never"              1.71
    " NEVER"              1.67
    "NEVER"               1.66
    "Never"               1.65
    "never"               1.64
    " Nunca"              1.28
    " EVER"               1.26
    "Nunca"               1.19
    " ever"               1.18
    Activation Density: 0.065%

    No Known Activations