INDEX
Explanations
phrases indicating a strong position or opinion on a subject
phrases that indicate expression or communication of thoughts and ideas
Negative Logits
riger         -0.73
nce           -0.65
croft         -0.63
irection      -0.61
authorized    -0.60
invaders      -0.59
washer        -0.58
itary         -0.58
rift          -0.58
cephal        -0.58
Positive Logits
offer         0.94
contribute    0.93
prove         0.88
learn         0.88
contend       0.86
say           0.85
teach         0.81
apologize     0.81
impart        0.80
SAY           0.76
Activations Density: 0.050%