INDEX
Explanations
phrases related to political pathways and strategies
Negative Logits
ividual     -0.75
teness      -0.71
coh         -0.70
ombies      -0.67
contrace    -0.66
glomer      -0.66
Sting       -0.66
mble        -0.66
obser       -0.64
arus        -0.62
Positive Logits
finding     0.98
paths       0.95
toward      0.93
breaking    0.92
towards     0.90
forward     0.89
path        0.89
ologies     0.87
finder      0.86
path        0.86
Activations Density 0.028%