Explanations
abstract concepts
The neuron activates on words denoting requirements or necessities (e.g. “needs,” “accessibility,” “needed”).
Negative Logits
Token        Logit
(conf        -0.07
fclose       -0.06
рекоменду    -0.06
WLAN         -0.06
訴           -0.06
(student     -0.06
_release     -0.06
_One         -0.06
fclose       -0.06
ิหาร         -0.06
Positive Logits
Token        Logit
Mailer       0.06
ẩ            0.06
peon         0.06
sit          0.06
ิทย          0.06
Derm         0.06
embark       0.06
الد          0.06
lol          0.06
uiten        0.06
Activations Density 0.441%
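
As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that density means the fraction of sampled token positions on which the neuron's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample counts are all assumptions made for illustration; the page itself does not state how the value is computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of sampled tokens on which the neuron
    fires at all. The page does not confirm this definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 441 active positions out of 100,000 tokens,
# chosen only so that the result matches the figure shown on the page.
sample = [1.0] * 441 + [0.0] * 99_559
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.441% would mean the neuron is silent on more than 99% of tokens, which is consistent with a sparse feature that fires on a narrow set of words.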