    Explanations

    unique characteristics or attributes of individuals and entities

    Negative Logits
     deutsche
    -0.22
    legt
    -0.19
     neue
    -0.19
     erste
    -0.18
     junge
    -0.18
     weitere
    -0.18
    تان
    -0.18
     kleine
    -0.18
    isches
    -0.18
    ичество
    -0.18
    Positive Logits
    ischen
    0.41
    lichen
    0.40
    uellen
    0.39
    enden
    0.38
    genden
    0.35
    igen
    0.34
    utschen
    0.34
    ierten
    0.34
    enen
    0.34
    eren
    0.33
    Act Density 0.026%

    No Known Activations