INDEX
Explanations
words related to the New York Times
the occurrence of the substring "ny"
Negative Logits
Token        Logit
Reviewed     -0.79
PKK          -0.75
EMP          -0.70
rador        -0.69
ACTED        -0.69
ENDED        -0.64
slave        -0.63
Crus         -0.63
COMPLE       -0.60
forfeiture   -0.60
Positive Logits
Token        Logit
ny           1.34
mph          1.00
Giuliani     0.87
theless      0.83
Mellon       0.81
mbol         0.79
NY           0.78
acht         0.77
heter        0.77
wallet       0.75
Activations Density: 0.006%