INDEX
    Explanations
    Negative Logits
     citizens           -0.08
     στρα               -0.08
     olsun              -0.08
     CONCAT             -0.08
    市委                -0.08
     βοη                -0.08
     citizen            -0.08
     cheer              -0.07
     της                -0.07
     invented           -0.07
    Positive Logits
    ][                  0.09
    waith               0.08
    Characteristics     0.08
     confines           0.08
    Containing          0.08
     characteristics    0.08
     xog                0.07
     faixa              0.07
    Atr                 0.07
     الرمل              0.07
    Act Density 0.003%

    No Known Activations