INDEX
    Explanations

    words related to political or ideological concepts, particularly those involving strong opinions or stances

    instances of the character "ĺ"

    Negative Logits
    Token          Logit
    " Seym"        -0.77
    "enegger"      -0.71
    " mathemat"    -0.71
    " trainers"    -0.68
    " therap"      -0.67
    " Niet"        -0.66
    "è¦ļéĨĴ"       -0.66
    " intrins"     -0.66
    " Reincarn"    -0.66
    " ivory"       -0.66
    Positive Logits
    Token          Logit
    "ï¸ı"          1.07
    "lean"         0.93
    "log"          0.87
    "ATH"          0.83
    "£"            0.82
    "ĺ"            0.81
    "âĹ¼"          0.80
    "fter"         0.79
    "resent"       0.79
    "leans"        0.78
    Activation Density: 0.037%

    No Known Activations
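
Several token strings listed above ("ĺ", "ï¸ı", "âĹ¼", "è¦ļéĨĴ") look like the printable stand-ins that byte-level BPE vocabularies use for raw bytes. The page does not say which tokenizer produced them, so the following is only a minimal sketch, assuming a GPT-2 style byte-to-character table. The helper names `token_to_bytes` and `decode_token` are hypothetical and exist only for this illustration.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2 style
    byte-level BPE (assumed here; the source does not name the tokenizer)."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Bytes without a printable form are shifted to code points >= 256.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: printable stand-in character -> original byte value.
CHAR_TO_BYTE = {c: b for b, c in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Map a vocabulary string back to the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[c] for c in token)


def decode_token(token: str) -> str:
    """Decode the raw bytes as UTF-8; incomplete sequences become U+FFFD."""
    return token_to_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ĺ", "ï¸ı", "âĹ¼", "è¦ļéĨĴ"]:
        print(repr(tok), token_to_bytes(tok).hex(" "), repr(decode_token(tok)))
```

Under this assumption, "ĺ" stands for the single byte 0x98, which is a UTF-8 continuation byte rather than a complete character, while "è¦ļéĨĴ" decodes to the two characters 覚醒. Tokens shown with a literal leading space are not handled by this sketch, since such vocabularies normally encode the space as a separate stand-in character.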