INDEX
Explanations
words related to significant changes or shifts in thinking or perspective
concepts related to shifts in paradigms or fundamental changes in systems
Negative Logits
itarian       -0.74
é¾            -0.72
llah          -0.72
sports        -0.69
Ñĭ            -0.67
itals         -0.67
DISTRICT      -0.67
IENCE         -0.67
Cosponsors    -0.66
Murd          -0.65
Positive Logits
avior         0.88
velop         0.81
avi           0.80
OPLE          0.74
anger         0.71
rums          0.68
mann          0.68
lers          0.68
Í             0.67
rop           0.67
Activations Density: 0.115%