Explanations
references to alternative options or choices
terms related to alternatives and alternative perspectives
Negative Logits
older       -0.90
oho         -0.71
gaard       -0.70
reditary    -0.68
owler       -0.67
iencies     -0.66
Bees        -0.65
midt        -0.64
cedented    -0.64
hips        -0.63
Positive Logits
lifestyles        1.08
viewpoints        1.02
explanations      0.92
viewpoint         0.89
routes            0.86
universes         0.83
modes             0.82
interpretations   0.82
route             0.81
perspectives      0.81
Activations Density: 0.048%