INDEX
Explanations
references to the snack food "Doritos"
mentions of the name "Dor."
Negative Logits
anwhile     -0.89
WAYS        -0.75
Terrorism   -0.70
unct        -0.70
tenance     -0.65
xual        -0.64
pmwiki      -0.64
BRE         -0.62
HS          -0.62
PATH        -0.61
Positive Logits
ía          0.95
ado         0.95
cas         0.93
je          0.92
oshenko     0.92
iane        0.90
rell        0.88
iac         0.87
rance       0.85
acies       0.85
Activations Density 0.008%