INDEX
    Explanations

    the word "No" in various contexts and formats

    NEGATIVE LOGITS
    -0.85
     imagui
    -0.76
    <pad>
    -0.73
    𑄮
    -0.73
    󠁁
    -0.72
    AISSEE
    -0.72
    <unused52>
    -0.72
    <unused14>
    -0.72
    <unused8>
    -0.72
    <unused3>
    -0.72
    POSITIVE LOGITS
    No
    0.51
     No
    0.38
    NO
    0.34
    Neither
    0.33
     neither
    0.32
    URL
    0.31
    RE
    0.29
     image
    0.29
     Neither
    0.29
     no
    0.28
    Act Density 0.047%

    No Known Activations