INDEX
Explanations
conjunctions that introduce contrast or exceptions
Negative Logits
zeń          -0.15
istrovství   -0.14
olie         -0.14
عب           -0.13
ocre         -0.13
mour         -0.13
crest        -0.13
.hw          -0.13
ekk          -0.13
правило      -0.13
Positive Logits
/or          0.15
€“           0.14
ра           0.14
verts        0.14
atr          0.14
lem          0.13
chn          0.13
icles        0.13
radient      0.13
/OR          0.13
Activations Density 0.287%