INDEX
Explanations
references to fans and followers in varied contexts
references to sports fans and their emotional responses
Negative Logits
Kills        -0.73
hua          -0.66
KO           -0.60
expensive    -0.59
Theft        -0.58
Favorite     -0.57
Killed       -0.56
Defeat       -0.56
Mechanical   -0.55
Virgin       -0.55
Positive Logits
finally          1.16
anew             1.01
suddenly         1.00
wondered         1.00
poised           0.96
understandably   0.93
resumed          0.90
beck             0.87
urgently         0.87
wondering        0.86
Activations Density: 0.513%