    Explanations

    quantified descriptions or lists

    Negative Logits
    (blank)  0.49
    " incompat"  0.49
    " আক্রমণে"  0.49
    "ències"  0.47
    "destroyed"  0.47
    "ジャ"  0.46
    " contradicts"  0.46
    "ила"  0.46
    (blank)  0.46
    " contradict"  0.46
    Positive Logits
    "'"  0.61
    "ing"  0.47
    " थेट"  0.47
    "Schema"  0.47
    " richtig"  0.46
    "."  0.46
    "Straight"  0.45
    " सीधे"  0.45
    " سما"  0.43
    "reck"  0.42
    Act Density 0.000%

    No Known Activations