INDEX
    Explanations

    scoring goals or runs/drives

    Negative Logits
     понимать           0.77
     BEEN               0.76
     nėra               0.75
     appartiennent      0.75
     ያላቸው               0.73
     практики           0.72
     devotes            0.71
     હંમે               0.71
    を持っている          0.71
     હોય                0.70
    Positive Logits
     angrily            0.95
     tentatively        0.90
     şi                 0.88
     then               0.88
    引发                 0.86
     и                  0.85
     exclaimed          0.85
                        0.84
     with               0.84
     ş                  0.81
    Act Density 0.009%

    No Known Activations