INDEX
Explanations
actions and interactions between people
Negative Logits
Token         Logit
amaz          -0.76
tained        -0.70
Been          -0.66
pired         -0.66
ceivable      -0.65
umbn          -0.65
printed       -0.63
Þ             -0.62
Powered       -0.62
pared         -0.61
Positive Logits
Token         Logit
consequently  1.10
reap          1.07
vice          1.05
manipulate    1.04
sometimes     1.04
thus          1.03
therefore     1.03
thereby       0.98
communicate   0.98
behave        0.97
Activations Density 0.309%