INDEX
    Explanations

    phrases indicating belief or conviction

    Negative Logits
    ãĥ³ãĥĨãĤ£    -0.18
    resh         -0.17
    igo          -0.16
    hof          -0.14
    aq           -0.14
    ër           -0.14
     Stern       -0.14
    plements     -0.14
    £p           -0.14
    chin         -0.13
    Positive Logits
    atatype      0.18
    452          0.15
    auté         0.14
    957          0.14
    chema        0.14
    adro         0.14
    649          0.13
    difference   0.13
    å·®          0.13
    LOB          0.13
    Act Density 0.024%

    No Known Activations