INDEX
    Explanations

    references to groups of individuals

    Negative Logits
    resa     -0.15
    anta     -0.15
    onna     -0.14
    ament    -0.14
    aryl     -0.14
    ental    -0.13
    arna     -0.13
    anto     -0.13
    entin    -0.13
    ANTA     -0.13
    Positive Logits
    enger    0.15
    lut      0.15
    asaki    0.15
    нил      0.14
    orz      0.14
    νÏĮ      0.14
     Kramer  0.14
    dür      0.13
    prot     0.13
    asu      0.13
    Activation Density: 0.016%

    No Known Activations