INDEX
Explanations
references to various news publications, particularly the New York Times
Negative Logits
IFO        -0.16
$MESS      -0.16
RL         -0.15
ogens      -0.15
uria       -0.15
हन         -0.14
Holl       -0.14
hann       -0.14
.Debugf    -0.14
keleton    -0.14
Positive Logits
.ny        0.28
ny         0.28
NY         0.27
NYT        0.27
Times      0.26
NY         0.26
Times      0.26
ny         0.23
Ny         0.20
New        0.19
Activations Density 0.062%