INDEX
    Explanations

    the word "nil" with high activation values

    references to "nil" or similar terms indicating a lack of value or absence

    Negative Logits
    Token         Logit
    "RAW"         -0.77
    "ebus"        -0.75
    "hen"         -0.74
    "Redd"        -0.73
    "eeks"        -0.72
    "heny"        -0.71
    "Feed"        -0.71
    "hetti"       -0.71
    "versions"    -0.70
    "Hop"         -0.70

    Positive Logits
    Token         Logit
    "nil"         1.46
    " nil"        1.28
    " Nil"        0.99
    "sson"        0.83
    "una"         0.81
    "ocent"       0.78
    " null"       0.76
    " NULL"       0.74
    "NULL"        0.74
    " indemn"     0.71

    Activation Density: 0.007%
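
To give a sense of scale for the density figure, here is a minimal sketch of how such a number could be computed. The page does not define the metric, so this assumes it is the fraction of token positions where the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical, chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. This is an illustrative definition, not one taken
    from the page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 7 of 100,000 token positions.
sample = [0.0] * 99_993 + [1.2] * 7
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.007%"
```

Under this assumed definition, a density of 0.007% corresponds to roughly 7 active positions per 100,000 tokens, which fits a feature that responds to a narrow set of tokens such as "nil" and "null".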

    No Known Activations