    Explanations

    all followed by descriptive verbs

    Negative Logits
    Token              Logit
    "או"               0.45
    " or"              0.45
    " hoặc"            0.44
    (no token text)    0.43
    "),"               0.43
    "ו"                0.43
    " arcs"            0.42
    " অথবা"            0.41
    " jede"            0.41
    " manuals"         0.40
    Positive Logits
    Token              Logit
    " in"              0.41
    "ay"               0.40
    "ong"              0.38
    "istä"             0.37
    " of"              0.36
    "ickey"            0.34
    (no token text)    0.34
    " hugely"          0.34
    "ati"              0.33
    " जगह"             0.33
    Activation Density: 0.026%

    No Known Activations