INDEX
Explanations
phrases related to comparison, evaluation, and critique
negative and positive descriptors related to events or situations
Negative Logits
ADRA      -0.56
lett      -0.56
conflic   -0.56
roit      -0.55
blast     -0.54
unequ     -0.51
warr      -0.51
ascript   -0.50
Cannot    -0.49
liv       -0.49
Positive Logits
is         1.33
are        1.09
was        1.01
involves   0.91
relates    0.89
revolves   0.84
is         0.84
lies       0.81
consists   0.77
include    0.76
Activations Density 0.674%