    Explanations

    Repeated mentions of the word "left."

    Negative Logits
    Token                  Logit
    "styleable"            -0.65
    (not rendered)         -0.60
    "withOpacity"          -0.57
    "łaszcza"              -0.57
    "uremberg"             -0.56
    " Diener"              -0.56
    " noemen"              -0.55
    " miệng"               -0.54
    "verifyException"      -0.53
    " Reihen"              -0.52
    Positive Logits
    Token                  Logit
    " left"                4.70
    " Left"                3.78
    "left"                 3.55
    "Left"                 3.50
    " LEFT"                3.47
    "LEFT"                 2.97
    (not rendered)         2.46
    " lef"                 2.41
    " izquierda"           2.31
    " 左"                  2.20
    Act Density 0.116%

    No Known Activations