Explanations
words related to receiving some type of recognition or honor
references to awards and recognition
Negative Logits
Token         Logit
loo           -0.69
ths           -0.68
Occupations   -0.68
\/\/          -0.67
Klu           -0.67
Pastebin      -0.66
Sins          -0.65
STD           -0.64
ãĥ¯ãĥ³        -0.63
ullivan       -0.62
Positive Logits
Token         Logit
awards        1.15
award         1.07
laure         1.05
awarded       0.99
prizes        0.93
awarding      0.91
Winner        0.90
winner        0.89
prize         0.89
winning       0.87
Activations Density: 0.013%
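
The page does not define the activation density figure. One common reading, assumed here, is the fraction of sampled token positions on which the feature has a non-zero activation. The sketch below illustrates that reading only; the helper name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature
    fires at all. The source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up data: 13 active positions out of 100,000 sampled tokens.
sample = [0.0] * 99_987 + [1.5] * 13
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.013%
```

Under this reading, a density of 0.013% corresponds to roughly 13 firing positions per 100,000 tokens, which would make this a sparse feature.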