Explanations
terms related to national security
references to national security
Negative Logits
Token       Logit
chrom       -0.80
amaz        -0.78
Blocks      -0.77
oven        -0.76
ken         -0.74
Refresh     -0.74
aways       -0.73
Torrent     -0.70
────────    -0.69
Niet        -0.68
Positive Logits
Token           Logit
adviser         1.05
advisor         1.03
Advisor         0.86
apparatus       0.85
Adviser         0.83
policy          0.83
implications    0.81
olitics         0.80
Agency          0.80
theorist        0.79
Activations Density: 0.021%