Explanations
This neuron activates on words and phrases where the speaker states or defines their own role, expertise, or qualifications.
Negative Logits
_dataset    -0.07
tablets     -0.07
theast      -0.06
Anna        -0.06
774         -0.06
Pri         -0.06
east        -0.06
CoreData    -0.06
Nova        -0.06
قرآن        -0.06
Positive Logits
абсолют     0.07
others      0.06
,↵          0.06
fiercely    0.06
-trade      0.06
↵           0.06
мови        0.06
Juni        0.06
ilebilir    0.06
plea        0.06
Activations Density 0.021%
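
For context, an activation density figure like the one above is commonly read as the fraction of sampled tokens on which the neuron fires at all. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample data are assumptions made for illustration, not details taken from this page or from the tool that produced it.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    `activations` is a flat list with one activation value per token.
    The strict `>` comparison against 0.0 is an assumed convention.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, only 2 of which activate the neuron.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.020%"
```

Under this reading, a density of 0.021% would mean the neuron is active on roughly 2 of every 10,000 sampled tokens, which fits a narrowly selective neuron.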