INDEX
Explanations
phrases related to success or victory
occurrences of the word "winning"
Negative Logits
repre       -0.72
rouch       -0.71
alam        -0.70
geist       -0.70
anium       -0.69
heter       -0.69
erity       -0.67
aryn        -0.66
trak        -0.66
thening     -0.64
Positive Logits
bidder       0.86
streak       0.85
streaks      0.78
ners         0.66
Citation     0.66
prizes       0.65
throp        0.64
trophies     0.64
percentages  0.64
Owl          0.64
Activations Density: 0.024%