INDEX
Explanations
uses the word "shrink" or variations of it
instances of the word "shrink" or its variations
Negative Logits
coat     -0.81
iac      -0.81
abase    -0.73
oid      -0.71
bear     -0.70
gd       -0.67
trust    -0.67
Honest   -0.66
bah      -0.66
ibur     -0.66
Positive Logits
shr      2.70
shr      0.88
meg      0.80
tom      0.79
ħĭ       0.76
bark     0.74
multipl  0.73
swall    0.72
halves   0.70
ãĤ©      0.69
Activations Density: 0.001%