Explanations
proper names, especially those related to film and media
Negative Logits
kova      -0.15
lob       -0.15
OCR       -0.15
.tt       -0.15
immers    -0.14
่าย       -0.14
apol      -0.14
.cr       -0.14
260       -0.14
aptcha    -0.14
Positive Logits
chai      0.17
Sesso     0.16
zik       0.15
stery     0.14
案        0.14
Canceled  0.14
bathing   0.14
しょ      0.14
dressing  0.14
体系      0.13
Activations Density 0.030%