INDEX
    Explanations

    political intrigue

    Negative Logits
     ngon
    -0.07
     difference
    -0.07
    ан
    -0.07
     silenced
    -0.06
    لام
    -0.06
    düğü
    -0.06
    >,↵
    -0.06
    ук
    -0.06
    Ch
    -0.06
     Ingredients
    -0.06
    Positive Logits
    :h
    0.07
    urger
    0.07
    	ERROR
    0.06
    odel
    0.06
    )._
    0.06
    /rem
    0.06
    avad
    0.06
    	now
    0.06
     carve
    0.06
    وط
    0.06
    Activation Density: 0.021%

    No Known Activations