INDEX
Explanations
mentions of physical scars
occurrences of the word "scar" and its variations
Negative Logits
ostics        -0.79
ĨĴ            -0.75
ullivan       -0.69
hower         -0.65
expectancy    -0.65
ablishment    -0.64
sterdam       -0.64
ablish        -0.62
£ı            -0.61
ADRA          -0.61
Positive Logits
red           1.13
ring          1.09
lets          1.02
crow          0.96
fing          0.96
pered         0.91
abs           0.91
fed           0.91
nton          0.89
ves           0.88
Activations Density 0.043%