Explanations
words reflecting strong emotional reactions or experiences
Negative Logits
Token        Logit
upa          -0.15
383          -0.15
undo         -0.15
سان          -0.15
mastur       -0.15
bre          -0.14
라도         -0.14
628          -0.14
596          -0.13
InParameter  -0.13
Positive Logits
Token        Logit
eki          0.15
Nob          0.15
oni          0.14
topLeft      0.14
obel         0.14
yle          0.14
sworth       0.13
Surprise     0.13
instead      0.13
Calibri      0.13
Activations Density: 0.163%