Explanations
we stand beneath
The neuron detects emphatic, declarative claims: strong assertions or superlative statements that stress ability, uniqueness, or certainty.
Negative Logits
metabolismo  0.55
roast  0.50
saudável  0.50
Similarity  0.49
Wearing  0.47
healthier  0.46
لاج  0.46
ष्मा  0.46
Healthy  0.46
Genre  0.46
Positive Logits
தங்கள்  0.43
incott  0.43
ಲಿಲ್ಲ  0.42
quickly  0.42
tų  0.42
ਸਿੰਘ  0.40
ství  0.39
පි  0.39
dan  0.39
գ  0.38
Activations Density 0.006%