Explanations
proper names of individuals
Negative Logits
Logit   Token
-0.14
-0.14   ◄
-0.14   sao
-0.13   cheon
-0.13   eature
-0.13   avored
-0.13   ić
-0.13   landa
-0.13   ĵ
-0.12   ercial
Positive Logits
Logit   Token
0.15    …↵
0.14    …↵
0.14    …”
0.14    […]↵
0.14    …"
0.14    …and
0.13    Brake
0.13    治
0.13    […
0.13    otes
Activations Density 0.332%