INDEX
Explanations
references to entertainment-related entities and content
uncommon proper nouns (people, places, or organizations), especially capitalized names within text.
Negative Logits
inium  -0.15
arden  -0.15
oise   -0.15
ewis   -0.14
.rad   -0.14
chet   -0.14
453    -0.14
79     -0.14
rts    -0.14
879    -0.14
Positive Logits
778    0.15
ToFit  0.15
///<   0.14
subst  0.14
ya     0.14
ocker  0.14
aph    0.14
่ำ     0.13
lse    0.13
ROUT   0.13
Activations Density 0.324%