INDEX
Explanations
repeated phrases highlighting familiarity or reputation regarding people, places, or brands
Negative Logits
ilan      -0.17
ertz      -0.17
toa       -0.15
stk       -0.14
Yours     -0.14
qi        -0.14
icter     -0.14
afs       -0.14
offset    -0.14
arking    -0.14
Positive Logits
for       0.32
for       0.24
for       0.24
wegen     0.23
dafür     0.21
за        0.20
για       0.20
สำหร      0.20
für       0.18
because   0.18
Activations Density: 0.048%