INDEX
    Explanations

    references to specific entities or names

    the special character 'ĺ' in various contexts

    Negative Logits
    ' Seym'             -0.84
    ' disadvant'        -0.75
    ' condem'           -0.74
    ' Enlightenment'    -0.71
    ' pestic'           -0.70
    'raints'            -0.70
    ' explan'           -0.70
    ' trainers'         -0.68
    ' mathemat'         -0.67
    ' welf'             -0.67
    Positive Logits
    'ï¸ı'               1.32
    'lean'              1.01
    'log'               0.92
    'ģ'                 0.91
    'ĺ'                 0.91
    'ï¸'                0.89
    'Ģ'                 0.88
    'leans'             0.85
    'ļ'                 0.85
    'ł'                 0.82
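
Several of the positive-logit tokens ('ĺ', 'ģ', 'ï¸ı') look like printable stand-ins for raw bytes rather than ordinary text. The sketch below assumes, without confirmation from this page, that the model uses a GPT-2-style byte-level BPE vocabulary, and rebuilds that byte-to-character table to recover the underlying bytes. The helper name `token_to_bytes` is hypothetical and used only for illustration.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte value to a printable character.

    Assumption: the tokenizer behind this feature uses this scheme.
    """
    # Bytes that are already printable keep their own code point.
    bs = list(range(0x21, 0x7F)) + list(range(0xA1, 0xAD)) + list(range(0xAE, 0x100))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n, in byte order.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token):
    """Hypothetical helper: convert a displayed token back to its raw bytes."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


print(token_to_bytes("ĺ"))    # b'\x98'
print(token_to_bytes("ï¸ı"))  # b'\xef\xb8\x8f'
```

Under that assumption, 'ĺ' is the single byte 0x98, which is a UTF-8 continuation byte and not a standalone character, and 'ï¸ı' is the three-byte UTF-8 encoding of U+FE0F (variation selector-16). If the tokenizer uses a different scheme, these mappings do not apply.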
    Act Density 0.034%

    No Known Activations