Explanations
political news and events
Negative Logits
rhy            -0.86
provocation    -0.74
intrinsic      -0.73
tolerance      -0.72
manuals        -0.71
structure      -0.71
timet          -0.70
vocabulary     -0.69
ransom         -0.69
smugglers      -0.69
Positive Logits
Calif          1.23
California     1.19
Virginia       1.17
Michigan       1.16
Florida        1.16
Minnesota      1.15
Wisconsin      1.15
Seattle        1.12
Portland       1.11
Texas          1.11
Activations Density: 0.029%