Explanations
mentions of fear or expressions of being afraid/anxious.
The neuron responds to occurrences of the word “fear,” effectively detecting expressions of fear.
Negative Logits
frustrations  0.64
impatience    0.63
frust         0.59
শোক           0.59
frustration   0.58
annoyance     0.58
тей           0.57
экране        0.56
ढूंढ          0.55
Honorary      0.55
Positive Logits
mong     0.93
lest     0.84
heights  0.80
repr     0.75
repris   0.72
repr     0.72
lessly   0.70
fully    0.69
Unknown  0.69
mong     0.68
Activations Density 0.038%