INDEX
    Explanations

    specific proper nouns, particularly names related to people or organizations

    Negative Logits
    Token      Logit
    utow       -0.16
    boa        -0.15
    ittel      -0.15
    ruh        -0.15
    a          -0.14
    bus        -0.14
    oins       -0.14
    UEL        -0.14
    eos        -0.14
    arrants    -0.14
    Positive Logits
    Token      Logit
    ÅĦst       0.20
    ella       0.20
    uper       0.20
    stry       0.19
    ÃŃses      0.19
    оло        0.19
    lettes     0.19
    olo        0.19
    ired       0.18
    ige        0.18
    Activation Density: 0.011%

    No Known Activations