INDEX
Explanations
conditional phrases indicating potential actions or outcomes
Negative Logits
ised                 -0.17
sted                 -0.17
izedName             -0.16
.scalablytyped       -0.15
arius                -0.15
contres              -0.15
arel                 -0.15
yr                   -0.15
ized                 -0.14
.bunifuFlatButton    -0.14
Positive Logits
nt             0.35
NT             0.22
potentially    0.22
be             0.21
conce          0.20
possibly       0.20
've            0.19
’ve            0.19
ÂŃn            0.18
/w             0.17
Activations Density: 0.087%