Explanations
the word "instead."
Negative Logits
Token           Value
囗              -0.58
Портал          -0.56
columnHeader    -0.54
Wolverine       -0.53
Juri            -0.53
Fuzzy           -0.53
lèvres          -0.52
Dermal          -0.52
Juri            -0.52
Giới            -0.52
Positive Logits
Token           Value
instead         1.00
instead         0.94
Instead         0.91
Instead         0.88
vece            0.74
betweenstory    0.60
вместо          0.59
zamiast         0.54
stead           0.53
mtd             0.52
Activations Density 0.011%
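
The density figure is presumably the fraction of token positions on which this feature fires. The page does not say how it is computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the `threshold` parameter, and the sample activations are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active tokens) / (total tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 token out of 10,000.
acts = [0.0] * 9_999 + [3.2]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.010%
```

Under this reading, a density of 0.011% would correspond to roughly 11 active tokens per 100,000 sampled.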