INDEX
Explanations
references to feeling proud or expressing pride
expressions of pride
Negative Logits
PF           -0.79
apter        -0.73
itamin       -0.68
bite         -0.62
CPU          -0.62
vati         -0.62
intestinal   -0.61
Webster      -0.61
mosquit      -0.60
halting      -0.60
Positive Logits
edIn         0.91
hon          0.84
edly         0.81
roud         0.77
ledged       0.75
kson         0.73
sburg        0.72
faced        0.71
proud        0.71
citiz        0.70
Activations Density: 0.013%