Explanations
words related to national security
terms related to national security
Negative Logits
sample    -0.76
amaz      -0.73
ken       -0.71
SHIP      -0.71
oven      -0.69
chrom     -0.66
────────  -0.66
ャ        -0.65
66666666  -0.65
bare      -0.64
Positive Logits
Advisor           0.98
adviser           0.97
advisor           0.95
Adviser           0.92
olitics           0.88
policy            0.82
aceutical         0.81
counterterrorism  0.80
posture           0.79
planners          0.78
Activations Density 0.019%