INDEX
Explanations
both sports-related and news-related terms
specific media outlet names and their related terms
Negative Logits
gui            -0.66
oru            -0.64
stride         -0.63
kidding        -0.62
peat           -0.62
hower          -0.60
pinch          -0.60
incoming       -0.60
succeeded      -0.60
practicable    -0.59
Positive Logits
ageddon        0.73
Franch         0.73
Sport          0.72
Solutions      0.72
Thrones        0.69
Nerd           0.68
Innov          0.68
idon           0.67
LLC            0.66
Berman         0.65
Activations Density 0.344%