INDEX
    Explanations

    phrases indicating something is recently established or newly classified

    Negative Logits
    Token        Logit
    ekler        -0.19
    ’ta          -0.16
    zon          -0.16
    udu          -0.15
    erken        -0.15
    .ua          -0.15
    AMESPACE     -0.15
    STS          -0.14
    ernes        -0.14
    ota          -0.14
    Positive Logits
    Token        Logit
    ly           0.18
    iber         0.18
    bian         0.17
    ighton       0.17
    bies         0.15
    mente        0.15
    iger         0.15
    ko           0.14
    N            0.14
    æļ           0.14
    Activation Density: 0.014%

    No Known Activations