Explanations
political figures and government institutions
references to political entities and individuals in various contexts
Negative Logits
Token          Logit
disappeared    -0.83
vanished       -0.74
died           -0.71
replaced       -0.70
Joined         -0.62
};             -0.62
PLUS           -0.62
crashed        -0.62
disappears     -0.61
inactive       -0.60
Positive Logits
Token            Logit
erning           0.96
policymakers     0.84
antics           0.79
thinking         0.76
wondering        0.76
contemplating    0.75
puzzling         0.74
understandably   0.74
hesitate         0.72
puzz             0.70
Activations Density: 0.505%