INDEX
Explanations
references to a specific television network
mentions of the FOX network
Negative Logits
ucket          -0.79
adding         -0.70
agn            -0.68
anks           -0.68
haps           -0.68
ivari          -0.67
antics         -0.67
idences        -0.67
alli           -0.66
accompanied    -0.66
Positive Logits
FOX             1.19
conn            0.98
çīĪ             0.86
FOX             0.82
fox             0.77
DEV             0.76
Fest            0.76
NEWS            0.74
TAG             0.73
CAST            0.70
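The entry çīĪ in the positive logits looks like a byte-level BPE token shown in its raw display form rather than corrupted text. The sketch below assumes the model uses a GPT-2 style byte-to-character table (the page does not say which tokenizer is used) and maps such a display string back to readable text. The function names bytes_to_unicode and decode_display_token are illustrative, not part of the dashboard.

```python
def bytes_to_unicode():
    """Rebuild a GPT-2 style byte -> display-character table (assumed tokenizer)."""
    # Bytes that are already visible characters are displayed as themselves:
    # "!".."~" (33-126), "¡".."¬" (161-172), "®".."ÿ" (174-255).
    printable = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to an unused code point starting at 256.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


def decode_display_token(token: str) -> str:
    """Map a display-form token back to the text its bytes encode."""
    char_to_byte = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # Tokens that split a multi-byte character decode to a replacement mark.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["FOX", "NEWS", "çīĪ"]:
        print(repr(tok), "->", repr(decode_display_token(tok)))
```

Plain ASCII entries such as FOX and NEWS come back unchanged. Under the stated assumption, çīĪ corresponds to the bytes E7 89 88, which is the UTF-8 encoding of the single character U+7248.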
Activations Density 0.004%