INDEX
    Explanations

    phrases that indicate comparison or contrast in evaluations

    Negative Logits
     on
    -0.25
     على
    -0.17
     trên
    -0.16
    äge
    -0.16
     на
    -0.16
    onaut
    -0.15
     auf
    -0.15
    lok
    -0.15
    erset
    -0.15
    บน
    -0.15
    Positive Logits
     behalf
    0.49
     occasions
    0.34
     basis
    0.33
     occasion
    0.32
    basis
    0.29
    occasion
    0.24
    _basis
    0.23
     grounds
    0.23
     Basis
    0.22
     dime
    0.19
    Act Density 0.802%

    No Known Activations