INDEX
Explanations
occurrences of quotation marks in the text
Negative Logits
кÑĢа      -0.08
eniz     -0.07
pokoj    -0.07
APPER    -0.07
ç±       -0.07
Ấ        -0.07
اضر      -0.07
ÌĨ       -0.07
Kinh     -0.07
idar     -0.07
Positive Logits
Gal      0.08
s        0.07
ke       0.06
src      0.06
ula      0.06
Gilbert  0.06
Alto     0.06
GAL      0.06
ps       0.06
Gor      0.06
Activations Density 0.001%