INDEX
Explanations
references to a specific name 'Dorner'
references to an individual named Dor
Negative Logits
Token        Logit
anwhile      -0.95
unct         -0.72
WAYS         -0.71
Terrorism    -0.69
heet         -0.66
HuffPost     -0.65
BRE          -0.65
tenance      -0.63
Wanted       -0.63
unction      -0.63
Positive Logits
Token        Logit
ía           0.93
iane         0.93
je           0.91
ado          0.90
Dor          0.90
cas          0.88
oshenko      0.86
chester      0.85
rance        0.84
idad         0.82
Activations Density 0.004%