INDEX
Explanations
phrases containing the word "survivor"
references to individuals who have survived traumatic experiences
Negative Logits
ueller    -0.74
othy      -0.70
sterdam   -0.69
appa      -0.67
eanor     -0.67
istry     -0.67
okin      -0.66
ono       -0.66
ebus      -0.66
eton      -0.66
Positive Logits
Survivors     1.06
survivors     0.98
Survive       0.86
Surv          0.85
testimonies   0.84
survivor      0.82
Survivor      0.82
Surviv        0.75
surv          0.74
ivable        0.72
Activations Density 0.017%
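
For readers unsure how to read the density figure, here is a minimal sketch. It assumes that "activations density" means the percentage of sampled tokens on which the feature fires (activation above zero); the source does not define the term. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from any dashboard API.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: density is the share of tokens with a non-zero activation.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    # Count tokens on which the feature is considered active.
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy data, made up for illustration: 17 active tokens out of 100,000.
toy = [0.0] * 99_983 + [1.5] * 17
print(f"{activation_density(toy):.3f}%")
```

Under that assumption, a density of 0.017% corresponds to roughly 17 active tokens per 100,000 sampled, which fits a sparse feature that fires mainly on "survivor"-related text.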