INDEX
    Explanations

    Names of or references to individuals, particularly names containing "Tina" or "Nina"

    Negative Logits
    ¤ij           -0.16
    áte           -0.16
    ازÙħ          -0.16
    ýv            -0.15
    anooga        -0.15
    ülü           -0.15
    wahl          -0.15
    acyj          -0.14
    .metamodel    -0.14
    ttp           -0.14
    Positive Logits
    o             0.20
    uer           0.19
    amate         0.17
    emez          0.15
    elli          0.15
    les           0.15
    دارÛĮ         0.15
    is            0.15
    res           0.14
     stip         0.14
    Act Density 0.017%

    No Known Activations