Explanations
phrases related to patriotic or defensive sentiments towards one's country
references to national defense and patriotism
Negative Logits
SPONSORED     -0.67
Quant         -0.66
strat         -0.62
uning         -0.60
estimating    -0.58
Likely        -0.57
elim          -0.56
abrupt        -0.55
elig          -0.55
compatible    -0.55
Positive Logits
selves        0.89
ideals        0.85
dignity       0.82
riors         0.80
sake          0.79
self          0.75
ankind        0.74
integrity     0.73
principles    0.73
神            0.72
Activations Density 0.626%