Explanations
instances of high-stakes situations or crises
Negative Logits
OTHER           -0.61
col             -0.59
CO              -0.59
QB              -0.58
OND             -0.58
INAL            -0.57
UNCLASSIFIED    -0.57
stead           -0.56
Illegal         -0.56
Ult             -0.56
Positive Logits
they            0.79
let             0.78
according       0.77
we              0.76
relying         0.76
however         0.71
focusing        0.70
it              0.69
suffice         0.69
focus           0.68
Activations Density: 0.040%