    Explanations

    alarming, threatened, or debated situations

    Negative Logits
    " innymi"          0.51
    " شرطونه"          0.44
    " szyb"            0.43
    " جوړونکي"         0.42
    " drugih"          0.42
    (token missing)    0.42
    " будете"          0.42
    " sytu"            0.40
    " پیسې"            0.40
    " буде"            0.40
    Positive Logits
    "s"                0.66
    "t"                0.50
    " to"              0.50
    "in"               0.49
    "se"               0.44
    (token missing)    0.43
    "f"                0.42
    " från"            0.41
    "with"             0.40
    "to"               0.40
    Act Density 0.000%

    No Known Activations