INDEX
    Explanations

    phrases that indicate measurement, evaluation, or description of particular conditions or qualities

    Negative Logits
    "xab"       -0.07
    "ALER"      -0.07
    "ského"     -0.06
    "allenge"   -0.06
    "alla"      -0.06
    "mina"      -0.06
    "æĪ¸"       -0.06
    "алог"      -0.06
    "idar"      -0.06
    " advent"   -0.06
    Positive Logits
    " usually"  0.09
    " often"    0.09
    " mention"  0.09
    " either"   0.09
    "variably"  0.09
    "Usually"   0.09
    "often"     0.09
    "usually"   0.09
    "Often"     0.08
    "mention"   0.08
    Activation Density: 0.031%

    No Known Activations