    Negative Logits
    0.68  "ولت"
    0.61  " escalator"
    0.60  "지가"
    0.60  " βασ"
    0.59  " আন্দোলন"
    0.59  (no token text)
    0.58  " som"
    0.58  "জমেন্ট"
    0.57  " Hora"
    0.57  "Har"
    Positive Logits
    4.71  " ahead"
    4.21  " Ahead"
    3.95  "ahead"
    3.89  "Ahead"
    1.68  "переди"
    1.68  (no token text)
    1.67  " вперед"
    1.64  " delante"
    1.64  " davanti"
    1.58  " پیش"
    Activation Density: 0.006%

    No Known Activations