INDEX
    Explanations

    instances of demonstrative and definite articles, indicating a focus on specific concepts or entities

    Negative Logits
    Token            Logit
    "Äį"             -0.14
    "yn"             -0.14
    " mne"           -0.14
    "ule"            -0.13
    "_PRIV"          -0.13
    " note"          -0.13
    "ens"            -0.13
    "jin"            -0.13
    "uer"            -0.13
    " sacrifice"     -0.13
    Positive Logits
    Token            Logit
    "ulumi"          0.16
    "icious"         0.16
    "ìĽĥ"            0.16
    " mohla"         0.14
    "arsi"           0.14
    "ưỡng"           0.14
    "icari"          0.14
    "ì¶ķ"            0.14
    " konkrét"       0.14
    "fois"           0.14
    Activation Density: 0.277%

    No Known Activations