INDEX
Explanations
Support and hinderance
This neuron detects verbs that describe one factor influencing, modifying, or impeding the relationship between other elements (e.g., “moderate,” “hinder,” “interfere”).
New Auto-Interp
Negative Logits
indication
-0.06
茶
-0.06
heiß
-0.06
ética
-0.06
มาก
-0.06
зу
-0.06
freed
-0.06
while
-0.06
바
-0.06
luxurious
-0.06
POSITIVE LOGITS
estead
0.07
Vault
0.06
นาง
0.06
.lazy
0.06
Bands
0.06
continual
0.06
гір
0.06
_empresa
0.06
bung
0.06
Qu
0.06
Activations Density 0.123%