INDEX
    Explanations

    quantifiers and descriptive phrases

    Negative Logits
    " ngunit"  0.28
    " whose"  0.26
    " nidd"  0.24
    " 그리고"  0.24
    " (*"  0.24
    "》,"  0.23
    " இந்நிலையில்"  0.22
    " נישט"  0.22
    " maß"  0.22
    0.22
    Positive Logits
    "人都"  0.29
    "多くの"  0.24
    "人は"  0.23
    " लोग"  0.22
    " experts"  0.22
    "會有"  0.22
    " companies"  0.22
    " proponents"  0.22
    " многое"  0.21
    0.21
    Activation Density: 0.334%

    No Known Activations