INDEX
    Explanations

    phrases related to self-reflection and introspection

    Negative Logits
    " maksi"         -0.83
    " recev"         -0.76
    " Keny"          -0.76
    "timately"       -0.75
    " vettoriale"    -0.70
    " sopr"          -0.68
    " évit"          -0.66
    " azzurro"       -0.66
    " seksi"         -0.66
    " keramik"       -0.66
    Positive Logits
    " sort"          0.50
    "atguigu"        0.49
    "spesies"        0.48
    "cassert"        0.46
    "كتوبر"          0.45
    " maybe"         0.43
    "пиона"          0.43
    "gelopen"        0.43
    " hashlib"       0.42
    "frase"          0.42
    Activation Density: 0.309%

    No Known Activations