INDEX
Explanations
terms related to purpose or intention
Negative Logits
ish      -0.18
å¯       -0.17
ån       -0.16
rec      -0.16
rael     -0.16
ishly    -0.16
sel      -0.15
orna     -0.15
eding    -0.15
agy      -0.15
Positive Logits
ful       0.41
fully     0.40
fulness   0.33
FUL       0.31
-built    0.28
FULL      0.23
full      0.23
st        0.19
lessly    0.18
uilt      0.18
Activations Density 0.022%