Explanations
figurative language
This neuron detects instructions about using “figures of speech.”
Negative Logits
紙    -0.06
nella    -0.06
ッシュ    -0.06
annoying    -0.06
fon    -0.06
tingham    -0.06
ประกาศ    -0.06
оны    -0.06
_MISS    -0.06
ganze    -0.06
Positive Logits
ΗΡ    0.07
getInstance    0.07
trí    0.07
Вот    0.07
popover    0.06
odal    0.06
Deg    0.06
resembled    0.06
strategies    0.06
홈    0.06
Activation Density: 0.011%