    Explanations

    verbs and phrases indicating reasoning or justification

    Negative Logits
    ãģĸ
    -0.16
    uma
    -0.16
    oir
    -0.15
    çģ
    -0.14
    æĮ¯ãĤĬ
    -0.14
    ÑĤоÑĩ
    -0.14
    ziel
    -0.14
    ãĥ³ãĥĨãĤ£
    -0.14
    æ¾
    -0.14
    OrNil
    -0.14
    Positive Logits
     sense
    0.96
     Sense
    0.78
    sense
    0.76
    Sense
    0.69
     senses
    0.59
     sentido
    0.59
     sensed
    0.40
     sens
    0.38
    ense
    0.37
    SEN
    0.33
    Activation Density: 0.026%

    No Known Activations