INDEX
    Explanations

    phrases indicating quantities or degrees of comparison

    Negative Logits
    Token        Logit
    "anio"       -0.20
    "甚至"       -0.16
    " hatta"     -0.15
    "itzer"      -0.15
    " either"    -0.15
    " thậm"      -0.14
    " simply"    -0.14
    " actually"  -0.14
    " EVEN"      -0.14
    " even"      -0.14
    Positive Logits
    Token         Logit
    " according"  0.21
    " until"      0.21
    "until"       0.19
    " ones"       0.19
    "Until"       0.19
    " Until"      0.18
    " partially"  0.17
    " partly"     0.17
    "according"   0.16
    "ened"        0.16
    Act Density 0.024%

    No Known Activations