INDEX
Explanations
patterns of change and stability in relationships and societal norms
Negative Logits
Token      Logit
ucz        -0.15
apur       -0.15
onso       -0.15
erto       -0.14
uptools    -0.14
uebas      -0.14
++)        -0.14
rita       -0.13
ế          -0.13
173        -0.13
Positive Logits
Token      Logit
change     0.96
change     0.85
Change     0.82
changed    0.82
-change    0.81
Change     0.79
changing   0.78
changes    0.77
CHANGE     0.75
_change    0.74
Activations Density: 0.397%