INDEX
    Explanations

    phrases that describe opinions and the degrees of certainty or uncertainty regarding arguments

    Negative Logits
    Token          Logit
    "IColor"       -0.17
    "-desc"        -0.15
    "ostel"        -0.14
    "ï¼ģãĢį↵↵"    -0.14
    "eki"          -0.14
    "оÑĢов"       -0.14
    "éļĨ"          -0.14
    "rov"          -0.13
    "reib"         -0.13
    "rocket"       -0.13

    Positive Logits
    Token          Logit
    "916"          0.16
    "ÙħÙĦ"         0.14
    "ivet"         0.14
    " pun"         0.14
    "ÑĢаÑģÑĤ"     0.14
    "963"          0.13
    "ặng"          0.13
    "äh"           0.13
    "pun"          0.13
    " ÑģÑĤÑĢа"    0.13

    Act Density 0.380%

    No Known Activations