Explanations
phrases related to societal issues and power dynamics
Negative Logits
Token        Logit
ancies       -0.70
surveys      -0.65
assumptions  -0.64
interviews   -0.62
petitions    -0.62
ceilings     -0.62
tails        -0.60
icts         -0.60
twists       -0.60
pregnancies  -0.60
Positive Logits
Token        Logit
unto         1.14
extraord     0.88
fodder       0.86
worthy       0.82
conduit      0.81
unworthy     0.76
discipl      0.72
deserving    0.72
keeper       0.70
incarn       0.70
Activations Density: 1.896%