Explanations
phrases related to self-esteem and confidence
Negative Logits
Token    Logit
alse     -0.07
icer     -0.07
reta     -0.07
ward     -0.07
oldt     -0.07
uden     -0.07
oler     -0.07
uns      -0.06
exus     -0.06
rio      -0.06
Positive Logits
Token          Logit
/conf          0.08
plib           0.08
/self          0.07
-confidence    0.07
confidence     0.07
confidence     0.07
levels         0.07
anggal         0.07
andır          0.06
ÑĢаÑĩ           0.06
Activations Density 0.006%