INDEX
Explanations
words related to political events or governmental issues
specific adjectives and adverbs that convey intensity
Negative Logits
thumbnails    -0.68
ADRA          -0.66
watershed     -0.65
vigilance     -0.62
nonex         -0.61
seriousness   -0.61
corrid        -0.61
linkage       -0.60
ultras        -0.59
bowel         -0.59
Positive Logits
ndra          0.81
yang          0.72
yo            0.72
eous          0.71
Unloaded      0.68
inki          0.67
hesive        0.67
ymm           0.65
yu            0.64
hot           0.61
Activations Density: 0.076%