Explanations
phrases and expressions that reflect common sayings or idiomatic language
Negative Logits
atab      -0.17
ween      -0.16
aney      -0.16
isman     -0.15
woman     -0.15
d         -0.15
Wonder    -0.14
aran      -0.14
plain     -0.14
acob      -0.14
Positive Logits
OGLE          0.17
bands         0.15
ValuePair     0.15
oire          0.15
ько           0.15
isco          0.15
укт           0.15
誉            0.14
ableObject    0.14
ibus          0.14
Activations Density 0.030%
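
As a rough illustration of what a density figure like this could mean, the sketch below assumes density is the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; the page itself does not say how the value is computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the feature fires on 3 of 10,000 tokens.
sample = [0.0] * 9997 + [1.2, 0.4, 2.7]
print(f"{activation_density(sample):.3%}")  # prints 0.030%
```

Under this reading, a density of 0.030% would correspond to the feature firing on roughly 3 out of every 10,000 tokens.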