Explanations
high-frequency auxiliary verbs and conjunctions
Negative Logits
owi          -0.17
ض            -0.16
moh          -0.15
582          -0.15
fol          -0.15
θν           -0.15
Moh          -0.14
halb         -0.14
.completed   -0.14
ertz         -0.14
Positive Logits
ighth        0.16
porr         0.15
resas        0.15
pekt         0.14
circles      0.14
ixon         0.14
RL           0.14
èįī          0.14
å¥           0.13
OWN          0.13
Activations Density: 0.000%