Explanations
phrases related to classified information or sensitive topics
terms related to secrecy and security
Negative Logits
Natural         -0.70
Weston          -0.68
SHIP            -0.63
Photographer    -0.59
Japanese        -0.56
Lomb            -0.55
hess            -0.55
ciplinary       -0.55
));             -0.55
Holden          -0.54
Positive Logits
»       1.13
[/      1.00
ãĢı     0.99
ãĢį     0.97
\)      0.94
''      0.92
,''     0.89
_.      0.86
ãĢij     0.85
`,      0.84
Activations Density: 0.855%