INDEX
    Explanations

    instances of a specific character or symbol within the text

    Negative Logits
    " cat"          -0.18
    " reb"          -0.17
    "ĶåĽŀ"          -0.15
    "fo"            -0.15
    " Kat"          -0.15
    "ладÑĥ"         -0.15
    "ymph"          -0.15
    " Coalition"    -0.15
    " fool"         -0.14
    "-cat"          -0.14

    Positive Logits
    "chin"          0.16
    "DataExchange"  0.16
    "å¸Ŀ"           0.16
    "thalm"         0.15
    "esse"          0.15
    " PIPE"         0.15
    "raith"         0.15
    " Sesso"        0.14
    "Ī"             0.14
    "ëį°ìĿ´íĬ¸"     0.14
    Activation Density: 0.006%

    No Known Activations