    NEGATIVE LOGITS
    Token              Logit
    " Surveillance"    -0.08
    " blinds"          -0.08
    " aboard"          -0.07
    "blind"            -0.07
    " caves"           -0.07
    "osur"             -0.07
    " and"             -0.07
    " board"           -0.07
    " Tournament"      -0.07
    "-sur"             -0.07
    POSITIVE LOGITS
    Token              Logit
    " nouns"           0.14
    " noun"            0.12
    "类别"             0.10
    [blank]            0.10
    " grammatical"     0.10
    " punctuation"     0.10
    " grammat"         0.10
    [blank]            0.10
    " adjective"       0.10
    [blank]            0.09
    Activation Density: 0.014%

    No Known Activations