INDEX
    Explanations

    emojis and symbols

    repeated characters or symbols in text

    Negative Logits
     princ           -0.54
     disparate       -0.49
     coerc           -0.48
     Truman          -0.47
     civilisation    -0.46
     commissions     -0.45
     marsh           -0.44
     paternal        -0.43
     inertia         -0.43
     masters         -0.42
    Positive Logits
    ª     0.78
    ľ     0.77
    ¿     0.75
    ı     0.68
    IJ     0.68
    ¬     0.67
    ¯     0.67
    Ĥ     0.66
    «     0.65
    ³     0.65
    Activation Density: 0.611%

    No Known Activations