INDEX
    Explanations

    expressions of exclamation or punctuation

    Negative Logits
    shaw           -0.15
     Surprise      -0.14
     rig           -0.14
     Moore         -0.14
    cribed         -0.14
    orch           -0.14
    smith          -0.13
    ÙĨع            -0.13
     Instrument    -0.13
    eters          -0.13
    Positive Logits
    chw            0.15
     Dai           0.14
    elden          0.14
    eva            0.14
    à¸Ĺย           0.14
    имÑĥ           0.14
    elters         0.13
    ean            0.13
    /control       0.13
    óst            0.13
    Activation Density: 0.141%

    No Known Activations