INDEX
Explanations
phrases related to unexpected developments or surprises
Negative Logits
ufact     -0.73
ogl       -0.68
pta       -0.65
leased    -0.64
inez      -0.64
Ducks     -0.64
Domain    -0.63
isance    -0.63
league    -0.63
rahim     -0.62
Positive Logits
twist     1.25
twists    1.16
Twist     0.87
twisting  0.85
weave     0.80
Whedon    0.80
endings   0.77
knot      0.73
hered     0.72
rope      0.71
Activations Density: 0.022%