INDEX
    Explanations

    elements related to historical or cultural artifacts

    Negative Logits
    (tokens are quoted to preserve leading spaces)
    "Datuak"         -0.50
    " POLITICS"      -0.50
    "上角"            -0.48
    "spreis"         -0.47
    "منابع"          -0.47
    " Politics"      -0.46
    "Politics"       -0.45
    " keduanya"      -0.45
    " миллионов"     -0.45
    "ม้"              -0.45
    Positive Logits
    (tokens are quoted to preserve leading spaces)
    " each"          1.13
    " poszczegól"    1.00
    "Each"           0.97
    " Each"          0.97
    "each"           0.95
    "分別"            0.91
    " different"     0.89
    " Chaque"        0.85
    "不同的"          0.85
    "EACH"           0.84
    Activation Density: 0.452%

    No Known Activations