INDEX
    Explanations

    specific phrases indicating locations or contexts

    NEGATIVE LOGITS
    cheon      -1.54
    neck       -1.49
    amble      -1.44
    SEC        -1.41
    oplasma    -1.38
    iplex      -1.37
    ]):        -1.36
    icorn      -1.34
    uber       -1.33
    ouin       -1.31
    POSITIVE LOGITS
    ı          3.63
    Ĩ          3.44
    ĥ½         3.43
    Ļ          3.41
    Ģ          3.40
    ¹          3.28
    ¾          3.24
    ¤          3.16
    ħ          3.15
    İ          3.10
    Activation Density: 0.056%

    No Known Activations