INDEX
Explanations
action verbs describing processes
Negative Logits
longstanding    1.24
sensational     1.19
thei            1.15
bureaucratic    1.14
strikingly      1.13
tinkering       1.12
suburban        1.12
wildly          1.12
legitimate      1.11
flashy          1.10
Positive Logits
H     0.80
IC    0.79
Y     0.73
O     0.73
E     0.72
I     0.70
C     0.69
P     0.68
F     0.68
Ac    0.67
Activations Density 0.138%