INDEX
    Explanations

    words containing the substring "ac" with varying activation strengths

    prefixes that indicate action or occurrence

    NEGATIVE LOGITS
    SHIP          -0.80
     shorth       -0.77
     Piper        -0.76
     Spa          -0.74
    è¦ļéĨĴ        -0.73
     Shots        -0.72
    ©¶æ¥µ         -0.71
     Painter      -0.70
     Christensen  -0.70
     POW          -0.68

    POSITIVE LOGITS
    ception        1.20
    prise          1.05
    istant         1.04
    ertain         1.01
    pect           1.00
    mosp           0.97
    vance          0.95
    ploy           0.94
    ighty          0.93
    usterity       0.93

    Act Density 0.111%

    No Known Activations