INDEX
Explanations
variations of the word "ha", which indicates laughter or amusement
Negative Logits
mente    -0.17
ment     -0.16
IGN      -0.16
tab      -0.15
mür      -0.15
tent     -0.14
Writes   -0.14
canh     -0.14
uyu      -0.14
gli      -0.14
Positive Logits
ifa      0.19
ichen    0.17
lett     0.17
ould     0.15
resi     0.15
urette   0.15
ERIC     0.14
witness  0.14
idot     0.14
CKER     0.14
Activations Density: 0.018%