Explanations
words related to intense actions or states of being
Negative Logits
Token     Logit
abor      -0.17
arium     -0.17
aman      -0.16
fu        -0.16
onen      -0.16
abil      -0.16
ensi      -0.15
fa        -0.15
aki       -0.15
él        -0.15
Positive Logits
Token     Logit
ught      0.26
UGHT      0.23
ughter    0.22
INT       0.22
int       0.21
unch      0.21
ints      0.21
입        0.20
unting    0.20
унд       0.20
Activations Density: 0.051%