INDEX
Explanations
words related to survival
Negative Logits
quart      -0.75
puter      -0.74
kson       -0.70
entious    -0.69
Anthem     -0.67
eneg       -0.66
endo       -0.66
estic      -0.64
umin       -0.64
FG         -0.64
Positive Logits
instincts   1.02
ously       0.98
survival    0.89
arily       0.89
instinct    0.88
Survive     0.79
ist         0.76
ists        0.75
Survival    0.72
deterrent   0.71
Activations Density: 0.023%