INDEX
Explanations
mentions of political figures and their affiliations
references to individuals identified by their titles or roles
Negative Logits
bugs           -0.79
Episode        -0.77
ADVERTISEMENT  -0.73
andom          -0.68
lift           -0.67
advertisement  -0.67
GROUP          -0.67
][/            -0.66
Characters     -0.65
pool           -0.65
Positive Logits
longtime    1.29
staunch     1.24
lifelong    1.19
former      1.16
devout      1.12
native      1.11
descendant  1.08
fixture     1.07
proponent   1.07
perennial   1.06
Activations Density: 0.126%