INDEX
Explanations
phrases related to criticism or evaluation of people or situations
Negative Logits
Token          Logit
PsyNetMessage  -0.62
Stri           -0.62
ufact          -0.61
Introduced     -0.60
riad           -0.59
arius          -0.58
aret           -0.58
ItemTracker    -0.58
fielded        -0.58
Surv           -0.57
Positive Logits
Token      Logit
vind       0.82
louder     0.66
zynski     0.65
eworthy    0.63
ENC        0.63
smarter    0.62
="#        0.60
ignore     0.59
unwelcome  0.59
Agent      0.58
Activations Density: 12.324%