INDEX
Explanations
organization
references to the organization and training background of a language model.
This neuron detects the occurrence of the token “Organization” (i.e. mentions of that specific named institution).
Negative Logits
BEST              -0.06
Grand             -0.06
LCD               -0.06
.Android          -0.06
IE                -0.06
_projection       -0.06
Nation            -0.06
AntiForgeryToken  -0.06
áč                -0.06
cannons           -0.06
Positive Logits
perpetrated       0.07
RegExp            0.07
nesota            0.07
_declaration      0.07
Alibaba           0.07
setOpen           0.07
obsess            0.06
ödem              0.06
První             0.06
/*/               0.06
Activations Density 0.001%