INDEX
Explanations
mentions of scars and scar-related terms
references to "scar" and its variations in context
Negative Logits
ablishment    -0.72
ĨĴ            -0.70
hower         -0.68
ostics        -0.64
odcast        -0.63
£ı            -0.62
ullivan       -0.62
imates        -0.61
eers          -0.61
Luk           -0.61
Positive Logits
red           1.16
ring          1.13
lets          1.06
lett          1.03
let           0.96
fed           0.96
crow          0.96
uler          0.95
ves           0.94
abs           0.92
Activations Density 0.042%