INDEX
Explanations
proper nouns, especially names of people and places
expressions of familial relationships and personal connections
Negative Logits
Briggs     -0.95
Simmons    -0.88
Slate      -0.85
Summers    -0.85
Whedon     -0.84
Rollins    -0.81
Scott      -0.81
Randall    -0.80
McKay      -0.80
Ohio       -0.78
Positive Logits
fulfil     0.93
unlaw      0.92
Daesh      0.91
Tanz       0.91
arij       0.91
Malays     0.88
)",        0.87
ijn        0.86
Juda       0.86
Tayyip     0.84
Activations Density: 1.759%