INDEX
    Explanations

    phrases related to suppression and inhibition

    Negative Logits
    Token        Logit
    "Dok"        -0.71
    " Peter"     -0.68
    " Dok"       -0.68
    " Beek"      -0.67
    "ek"         -0.66
    " تط"        -0.64
                 -0.63
    "k"          -0.62
    " Kinder"    -0.62
    "Peter"      -0.60
    Positive Logits
    Token             Logit
    " suppress"       1.63
    " Suppression"    1.54
    "SUP"             1.49
    " SUP"            1.48
    " suppression"    1.46
    " Sup"            1.44
    " suppresses"     1.43
    " suppressed"     1.42
    " suppressor"     1.40
    " Supp"           1.38
    Activation Density: 0.200%

    No Known Activations