INDEX
    Negative Logits
    Token            Logit
    " Brill"         -0.08
    " vigorously"    -0.07
    "nyder"          -0.07
    "Modern"         -0.07
    "Id"             -0.06
    " ald"           -0.06
    " multid"        -0.06
    " Modern"        -0.06
    " freq"          -0.06
    "id"             -0.06
    Positive Logits
    Token            Logit
    " escape"        0.16
    " escaped"       0.13
    " escapes"       0.13
    " Escape"        0.12
    " escap"         0.11
    "Escape"         0.11
    "escape"         0.10
    " escaping"      0.10
    "scape"          0.09
    "tam"            0.08
    Act Density 0.009%

    No Known Activations