INDEX
Explanations
mentions of geographical locations and associated words
references to social issues and institutions related to politics and education
Negative Logits
escription    -0.63
anwhile       -0.56
tweeting      -0.56
Rhod          -0.54
Tai           -0.53
Notting       -0.51
uay           -0.51
CLOSE         -0.51
condem        -0.51
!:            -0.51
Positive Logits
differs       1.24
varies        1.17
exceeds       1.17
depends       1.17
stems         1.12
depended      1.09
hinges        1.08
outweigh      1.05
correlates    1.04
resembles     1.03
Activations Density: 0.313%