Explanations
words or phrases related to making an argument or presenting a point
Negative Logits
Token      Logit
ties       -0.74
sed        -0.69
vim        -0.68
walking    -0.68
pher       -0.66
bos        -0.64
dies       -0.64
sav        -0.63
nz         -0.61
retched    -0.59
Positive Logits
Token        Logit
responders   1.06
impressions  0.91
baseman      0.83
Steps        0.79
glance       0.76
Respond      0.73
ancest       0.71
Corinthians  0.70
Published    0.70
Name         0.69
Activations Density: 0.041%