Explanations
emojis and symbols
repeated characters or symbols in text
Negative Logits
princ           -0.54
disparate       -0.49
coerc           -0.48
Truman          -0.47
civilisation    -0.46
commissions     -0.45
marsh           -0.44
paternal        -0.43
inertia         -0.43
masters         -0.42
Positive Logits
ª     0.78
ľ     0.77
¿     0.75
ı     0.68
IJ     0.68
¬     0.67
¯     0.67
Ĥ     0.66
«     0.65
³     0.65
Activations Density: 0.611%
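
The page does not define the density figure. One common reading, assumed here, is the fraction of tokens on which the feature's activation is above zero, so 0.611% would mean roughly 6 tokens in every 1,000. The sketch below illustrates that reading only. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The dashboard itself does not state its definition.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 1,000 tokens, 6 of which activate the feature.
sample = [0.0] * 994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.600%"
```

A density this low suggests a sparse feature, which fits the narrow explanations listed above (symbols and repeated characters).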