    Explanations

    phrases implying comparison or alternatives

    Negative Logits
    " other"        -0.26
    " autre"        -0.20
    " others"       -0.20
    " otherwise"    -0.20
    " Other"        -0.19
    " OTHER"        -0.18
    "その他"        -0.18
    " altri"        -0.17
    " autres"       -0.17
    "other"         -0.17

    Positive Logits
    " besides"       0.22
    "-than"          0.20
    "vier"           0.19
    "bes"            0.19
    "world"          0.18
    "ewise"          0.17
    " niż"           0.17
    "WISE"           0.17
    "/new"           0.17
    "_than"          0.16

    Act Density 0.015%

    No Known Activations