    Explanations

    mathematical reasoning

    Negative Logits
    Token             Logit
    " blends"         -0.08
    " blended"        -0.07
    " masterpiece"    -0.07
    "前三"            -0.07
    "uff"             -0.07
                      -0.07
    "$total"          -0.07
    "融合"            -0.07
    "ionn"            -0.07
    "925"             -0.07
    Positive Logits
    Token             Logit
    " juist"          0.11
    " opposite"       0.11
    " reversed"       0.10
    " Conversely"     0.10
    " convers"        0.09
    " наоборот"       0.09
    "Convers"         0.09
    " contradict"     0.09
    " ironic"         0.09
    " contrario"      0.09
    Activation Density: 0.048%
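
The page does not say how the density figure is computed. A common convention, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if len(activations) == 0:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 12 of 25,000 token positions.
sample = [0.0] * 24_988 + [1.5] * 12
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.048%
```

Under this reading, a density of 0.048% corresponds to roughly 5 active positions per 10,000 tokens.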

    No Known Activations