INDEX
    Explanations

    phrases related to issues of disagreement or conflict

    Negative Logits
    Token         Logit
    756           -0.16
    allis         -0.14
    775           -0.14
    .SE           -0.14
    CanBe         -0.13
    785           -0.13
     Hilton       -0.13
     Hern         -0.13
     IDEOGRAPH    -0.13
     èŃ           -0.13
    Positive Logits
    Token         Logit
     moda         0.15
    ôm            0.15
    izza          0.15
    æ³¥           0.14
    ibble         0.14
    amet          0.14
    ìĪĻ           0.14
     destin       0.14
    лÑĥÑĩ          0.14
     mod          0.14
    Activation Density: 0.168%

    No Known Activations