INDEX
Explanations
phrases related to contrasting situations or consequences
concepts related to winning and losing
Negative Logits
xtap        -0.69
cffffcc     -0.68
enthusi     -0.66
mathemat    -0.65
confir      -0.64
referen     -0.64
ccording    -0.63
misunder    -0.61
ntil        -0.61
Roaming     -0.61
Positive Logits
theirs      1.07
ours        1.02
hers        0.99
yours       0.94
;}          0.84
mine        0.81
%;          0.72
elsewhere   0.67
others      0.66
è£ı         0.65
Activations Density 0.821%