INDEX
Explanations
words related to strong connections or inevitable outcomes
instances of the word "bound."
Negative Logits
Token             Logit
issance           -0.89
sych              -0.77
soDeliveryDate    -0.71
ciation           -0.70
fielded           -0.66
HAEL              -0.65
STEM              -0.65
ETHOD             -0.64
amera             -0.64
oples             -0.63
Positive Logits
Token             Logit
less              0.98
fold              0.96
bound             0.96
lessly            0.95
unin              0.93
binding           0.90
bound             0.88
ging              0.86
sym               0.86
gling             0.82
Activations Density: 0.021%
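
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed: the percentage of sampled tokens on which the feature's activation is above zero. This reading of "density", the function name activation_density, and the zero threshold are all assumptions for illustration; the dashboard itself does not state how its value was derived.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Hypothetical helper: assumes density means "share of tokens where the
    feature fires", which the source does not confirm.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    # Count tokens on which the feature is considered active.
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Illustrative data only: 21 active tokens out of 100,000 sampled tokens.
sample = [1.5] * 21 + [0.0] * (100_000 - 21)
print(f"{activation_density(sample):.3f}%")
```

Under this assumption, a density of 0.021% would correspond to the feature firing on roughly 21 of every 100,000 tokens.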