INDEX
Explanations
references to characters and their developments in TV shows
Negative Logits
Token           Logit
ettes           -0.16
ESIS            -0.16
atego           -0.15
ανά             -0.15
itis            -0.15
addCriterion    -0.14
afari           -0.14
afect           -0.14
vetica          -0.13
Buddha          -0.13
Positive Logits
Token           Logit
ancia           0.16
ATHER           0.14
202             0.14
commenter       0.14
pen             0.14
ίος             0.14
omas            0.14
utow            0.14
ather           0.13
novel           0.13
Activations Density 0.016%