INDEX
Explanations
punctuation marks, particularly periods and question marks
Negative Logits
Token      Logit
kid        -0.14
sock       -0.14
Advance    -0.14
Gatt       -0.14
قدر        -0.14
adel       -0.13
Ced        -0.13
řad        -0.13
Wid        -0.13
Advance    -0.13
Positive Logits
Token      Logit
voice      0.19
Narr       0.16
Narr       0.16
Voice      0.15
омер       0.14
oice       0.14
Hast       0.14
voice      0.14
Voice      0.13
912        0.13
Activations Density 0.093%