Explanations
verbs and actions related to choices and consequences
Negative Logits
FANT          -0.60
ixties        -0.60
swell         -0.59
understands   -0.58
perv          -0.57
roma          -0.56
(%            -0.54
Bruins        -0.54
beh           -0.53
rightly       -0.53
Positive Logits
altogether    0.99
oneself       0.85
irtual        0.79
gans          0.78
quished       0.78
yourself      0.78
versa         0.75
ivably        0.74
outright      0.73
éŃĶ           0.72
Activations Density 0.145%
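
The last positive-logit token, `éŃĶ`, looks like a raw vocabulary entry rather than corrupted text. A minimal sketch of how such an entry could be decoded, assuming the model uses a GPT-2-style byte-level BPE vocabulary (the page does not name the tokenizer, so this is an assumption, and `decode_token` is a hypothetical helper name):

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2-style byte-level BPE.

    Assumption: the dashboard's model uses this scheme; the page does not say so.
    """
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    mapping = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to a code point starting at 256.
    shift = 0
    for b in range(256):
        if b not in mapping:
            mapping[b] = chr(256 + shift)
            shift += 1
    return mapping


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a displayed vocabulary entry back into the text it stands for."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("éŃĶ"))
```

Under that assumption, `éŃĶ` maps to the bytes `E9 AD 94`, which is the UTF-8 encoding of the single character 魔. Plain ASCII entries such as `altogether` decode to themselves.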