    Negative Logits
    Token           Logit
     I              1.26
    .               1.12
     (              1.05
    .”              1.01
    .’              0.99
    .\              0.91
    ill             0.88
     U              0.87
    ların           0.87
    .<              0.86
    Positive Logits
    Token           Logit
    ção             0.99
    (blank)         0.84
    ן               0.84
    to              0.80
    ە               0.80
     governador     0.79
     कैंडिडेट्स        0.77
     допомогти      0.77
     নেতাকর্মীরা      0.77
    (blank)         0.76
    Activation Density: 0.002%

    No Known Activations