    NEGATIVE LOGITS
    Token         Logit
    ما            -0.09
     fundament    -0.08
    ផល            -0.07
    isuus         -0.07
    etzung        -0.07
     frecu        -0.07
     objection    -0.07
                  -0.07
    ket           -0.07
     Harper       -0.07
    POSITIVE LOGITS
    Token         Logit
     treated      0.08
                  0.08
    ადა           0.08
     tarvitse     0.08
     gjorde       0.08
    -Pacific      0.08
    <m            0.07
    *p            0.07
     қойған       0.07
     мона         0.07
    Activation Density: 0.015%

    No Known Activations