INDEX
Explanations
specific pronouns and verbs indicating action or involvement
Negative Logits
agar      -0.17
afi       -0.15
olt       -0.15
quete     -0.14
inding    -0.14
بت        -0.14
lod       -0.14
asti      -0.13
áºł       -0.13
ØŃÙĦ      -0.13
Positive Logits
heiro            0.15
ekyll            0.15
UNUSED           0.15
ocomplete        0.14
uent             0.14
stup             0.14
.Abstractions    0.14
/gpl             0.14
Walsh            0.14
uentes           0.14
Activations Density 0.000%