INDEX
Explanations
phrases related to beliefs, thoughts, and judgments
phrases indicating judgments, conclusions, or beliefs about situations
Negative Logits
Token            Logit
Pg               -0.79
vet              -0.71
xtap             -0.69
unal             -0.68
arse             -0.64
ivalent          -0.64
voice            -0.64
izons            -0.64
appropriately    -0.64
depth            -0.64
Positive Logits
Token            Logit
THEY             1.09
they             0.95
there            0.91
THERE            0.87
that             0.79
nobody           0.76
we               0.74
thats            0.74
THAT             0.71
SHE              0.68
Activations Density 0.505%