Explanations
References to historical political changes and events, particularly those related to the Soviet Union and its dissolution.
Negative Logits
Token           Logit
ctor            -0.14
.synthetic      -0.14
807             -0.14
teri            -0.14
CI              -0.14
atter           -0.14
.constructor    -0.14
azzi            -0.13
매              -0.13
erb             -0.13
Positive Logits
Token           Logit
transition      0.20
freedom         0.19
-transition     0.18
Transition      0.18
peace           0.18
transitional    0.17
free            0.17
Freedom         0.17
Peace           0.17
freedoms        0.17
Activations Density: 0.163%