    Explanations

    phrases that express personal opinions or stances on societal issues

    Negative Logits
    Token           Logit
    "ยง"            -0.16
    "вичай"         -0.13
    "éal"           -0.13
    "ruptions"      -0.12
    "생님"          -0.12
    "이션"          -0.12
    "bstract"       -0.11
    "ảy"            -0.11
    ".jetbrains"    -0.11
    "oot"           -0.11
    Positive Logits
    Token       Logit
    " that"     0.99
    " THAT"     0.90
    " That"     0.85
    "that"      0.84
    "That"      0.82
    "\tthat"    0.72
    "_that"     0.71
    "那"        0.71
    "那个"      0.65
    " thats"    0.65
    Activation Density: 2.665%

    No Known Activations