Explanations
unfamiliar symbols and formatting patterns
references to legal or controversial events involving individuals
Negative Logits
oun         -0.88
revolving   -0.76
Clicker     -0.74
tremend     -0.74
omorphic    -0.73
eleph       -0.72
referen     -0.72
veter       -0.70
obser       -0.69
rolet       -0.69
Positive Logits
Washington    0.93
Adds          0.82
Experts       0.79
Officials     0.78
Authorities   0.77
Government    0.75
They          0.75
Scientists    0.73
they          0.73
Critics       0.70
Activations Density 0.057%