INDEX
Explanations
phrases or fragments containing special characters such as symbols and punctuation
repeated instances of specific characters or symbols
Negative Logits
vom           -0.70
maxim         -0.68
scattering    -0.68
microphone    -0.67
blond         -0.67
decomp        -0.66
pity          -0.66
cane          -0.66
minim         -0.66
romy          -0.65
Positive Logits
£    1.17
ħ    1.07
¬    1.06
¹    1.04
Ĵ    1.01
ı    0.98
į    0.96
º    0.95
¦    0.93
®    0.92
Activations Density: 0.283%