INDEX
Explanations
phrases indicating possession or existence of something
Negative Logits
immers    -0.17
cio       -0.17
ible      -0.15
hei       -0.15
vala      -0.15
ities     -0.15
hee       -0.15
imd       -0.14
alties    -0.14
laces     -0.14
Positive Logits
options   0.21
choices   0.19
Choices   0.18
choice    0.16
Dud       0.15
options   0.15
Options   0.15
Wit       0.15
OPTIONS   0.15
option    0.15
Activations Density: 0.182%