Explanations
phrases that indicate achievements or capabilities
Negative Logits
Token        Logit
ierrez       -0.69
Dickinson    -0.68
orney        -0.65
Refresh      -0.61
hero         -0.60
voy          -0.60
Roose        -0.60
ingham       -0.59
Stephenson   -0.59
mayors       -0.59
Positive Logits
Token        Logit
termed       0.84
Downloadha   0.84
wrought      0.80
accomplished 0.79
Learned      0.75
happening    0.72
indu         0.71
perceived    0.71
ゴ           0.70
done         0.69
Activations Density 0.118%