Explanations
repetitive use of the word "and" in varying contexts
Negative Logits
Token    Logit
chaus    -0.36
rather   -0.35
┘        -0.32
boî      -0.32
一代     -0.32
.        -0.31
the      -0.31
GTCX     -0.30
via      -0.30
my       -0.30
Positive Logits
Token          Logit
Anſ            0.60
betweenstory   0.59
ſind           0.56
typelib        0.54
Carthage       0.54
purpoſe        0.54
Beſ            0.53
ſame           0.52
mpagne         0.52
ロウィン       0.52
Activations Density: 0.052%