INDEX
Explanations
references to news outlets, publications, and media programs
Negative Logits
Token       Logit
exting      -0.72
illance     -0.70
"]=>        -0.69
_-          -0.67
speech      -0.66
matter      -0.66
orthy       -0.65
isol        -0.65
Female      -0.64
ummies      -0.64
Positive Logits
Token       Logit
Artemis     0.94
USS         0.86
Garry       0.83
Hermes      0.83
Mond        0.82
Greenberg   0.81
Spart       0.81
Martha      0.80
Phi         0.79
HMS         0.79
Activations Density 0.227%