INDEX
Explanations
specific nouns, especially those related to objects or entities
Negative Logits
itel        -0.17
ovel        -0.16
213         -0.16
152         -0.16
lian        -0.15
Ãĸl         -0.15
Canter      -0.14
rer         -0.14
propTypes   -0.14
ÄĻki        -0.14
Positive Logits
-none       0.16
/rfc        0.15
اسÙĩ        0.14
İZ          0.14
noop        0.14
iben        0.14
knife       0.14
eway        0.14
екÑĤи       0.14
falling     0.14
Activations Density 0.023%