INDEX
Explanations
phrases indicating transformation or change in a state or condition
Negative Logits
Token       Logit
noop        -0.18
umba        -0.15
itics       -0.15
recent      -0.14
older       -0.14
çİ          -0.14
.hm         -0.14
previous    -0.14
¯           -0.14
ее          -0.14
Positive Logits
Token       Logit
full        0.36
fully       0.25
bona        0.25
(full       0.25
full        0.24
actual      0.24
/full       0.24
something   0.23
mini        0.23
_full       0.22
Activations Density 0.249%
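
The density figure is not defined on this page. A common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name activation_density, the threshold parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of token positions on which
    the feature fires. The dashboard does not state its definition.
    """
    acts = list(activations)
    if not acts:
        return 0.0
    active = sum(1 for a in acts if a > threshold)
    return active / len(acts)


# Hypothetical data: 4000 token positions, the feature fires on 10 of them.
sample = [0.0] * 3990 + [1.5] * 10
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.249% would mean the feature fires on roughly 1 in 400 token positions.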