INDEX
    Explanations

    instances of the word "Nothing."

    Negative Logits
    Token               Logit
    " more"             -0.54
    " other"            -0.53
    " nice"             -0.50
    " large"            -0.49
    " even"             -0.49
    " BoxDecoration"    -0.49
                        -0.47
    " small"            -0.46
    "saraba"            -0.46
    " quality"          -0.46
    Positive Logits
    Token               Logit
    " Nothing"          1.15
    " Anything"         1.13
    " Something"        1.11
    " Such"             1.09
    "Nothing"           1.08
    " Things"           1.07
    " Any"              1.06
    "Something"         1.02
    " Anybody"          1.01
    "Anything"          0.96
    Act Density 0.171%

    No Known Activations