INDEX
Explanations
phrases related to actions or intentions
expressions indicating actions and experiences
Negative Logits
Token        Logit
ierrez       -0.71
hero         -0.69
artisan      -0.62
inel         -0.61
aston        -0.60
abad         -0.60
zar          -0.59
agraph       -0.58
pend         -0.56
Dickinson    -0.56
Positive Logits
Token        Logit
entails      0.74
Ca           0.74
Learned      0.73
Doing        0.72
ãĤ´          0.71
happening    0.70
fuss         0.68
pires        0.67
doing        0.67
done         0.66
Activations Density 0.288%
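
The page gives no definition for the density figure above. A common reading, assumed here and not confirmed by the page, is the fraction of token positions on which the feature's activation is nonzero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    # Count positions where the feature is considered active.
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.288% would mean the feature fires on roughly 3 of every 1000 tokens in the sampled text.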