INDEX
Explanations
The neuron detects mentions of organizational membership (e.g. “member of the … Party”).
Negative Logits
Token         Logit
anál          -0.08
เ�            -0.07
inclus        -0.07
.hadoop       -0.07
IVING         -0.07
Allocation    -0.07
heuristic     -0.07
gap           -0.06
.interface    -0.06
ctest         -0.06
Positive Logits
Token         Logit
don           0.07
elapsed       0.06
smlouvy       0.06
ослож         0.06
่ก            0.06
locator       0.05
violin        0.05
ETH           0.05
イス          0.05
امي           0.05
Activations Density: 0.014%