INDEX
Explanations
Latin characters and specific combinations of characters, possibly related to a specific language or encoding
special character tokens or specific linguistic symbols
Negative Logits
hyde      -0.89
aged      -0.74
assies    -0.70
ngth      -0.68
humming   -0.67
aging     -0.66
lain      -0.64
mathemat  -0.64
ipolar    -0.64
avis      -0.63
Positive Logits
×Ļ×  1.03
IJ  1.01
ת  1.01
ãĤī  0.99
׾  0.98
×ķ  0.94
κ  0.92
ä¸ī  0.86
ãĥ¼  0.79
ãĥĥ  0.79
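Several of the positive-logit tokens above look like Latin characters but may be raw UTF-8 bytes shown through a byte-level BPE display alphabet. The sketch below assumes the tokenizer uses the GPT-2 style byte-to-character table; the page does not name the model, so that is an assumption. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of this page. Tokens with characters outside that table are returned unchanged, since they appear to be displayed already decoded.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Map a displayed token back to bytes and decode it as UTF-8.

    Tokens containing characters outside the byte alphabet are returned as-is.
    Incomplete byte sequences decode to U+FFFD via errors="replace".
    """
    if any(ch not in CHAR_TO_BYTE for ch in token):
        return token
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Tokens copied from the positive-logit list, written as escapes for safety.
    for shown in ["\u00e3\u0124\u012b", "\u00e4\u00b8\u012b", "\u00d7\u0137"]:
        print(repr(shown), "->", repr(decode_token(shown)))
```

Under that assumption, the decoded tokens are Japanese kana, a CJK character, and Hebrew letters, which would fit the first explanation's guess that the feature relates to a specific language or encoding. A token that is only part of a multi-byte character cannot be decoded on its own and comes back as a replacement character.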
Activations Density 0.017%