Explanations
exclamatory phrases and strong emotional expressions
Negative Logits
[token missing]  -0.21
nt               -0.17
ful              -0.15
inya             -0.15
ite              -0.14
iny              -0.14
teenth           -0.14
friend           -0.14
aken             -0.14
ismet            -0.13
Positive Logits
fsp      0.16
etes     0.16
aired    0.15
ÙĬ       0.15
ÛĮ       0.14
ennent   0.14
inem     0.14
ootball  0.14
ilere    0.14
urnal    0.14
Activations Density 0.039%