Explanations
instances of the apostrophe character used in possessive forms or contractions
Negative Logits
a        -0.18
i        -0.17
น        -0.17
ا        -0.16
ity      -0.16
m        -0.16
er       -0.15
y        -0.15
icity    -0.15
en       -0.15
Positive Logits
ibraltar    0.18
themselves  0.16
/*č↵        0.16
/'          0.16
uber        0.15
šk          0.15
ches        0.14
eting       0.14
этому       0.14
bsites      0.14
Activations Density 0.044%