INDEX
    Explanations

    phrases indicating comparison or measurement thresholds

    Negative Logits
    Token           Logit
    "752"           -0.16
    "EDIUM"         -0.15
    "yw"            -0.15
    " closer"       -0.14
    "uild"          -0.14
    " CAPITAL"      -0.14
    "amedi"         -0.14
    "unte"          -0.14
    "ilon"          -0.14
    "rix"           -0.14

    Positive Logits
    Token           Logit
    " (<"           0.23
    "ONS"           0.18
    "istrovství"    0.18
    "lings"         0.15
    "_iff"          0.15
    "orama"         0.14
    "OTAL"          0.14
    "ling"          0.14
    "(=)"           0.14
    " ever"         0.14

    Activation Density: 0.038%

    No Known Activations