INDEX
Explanations
references to achievements and merits in historical or cultural contexts
Negative Logits
olon          -0.18
ürk           -0.14
SWG           -0.14
elerik        -0.14
Wash          -0.14
igmoid        -0.14
.NewRequest   -0.13
fucked        -0.13
oten          -0.13
Kob           -0.13
Positive Logits
Chest         0.29
Thom          0.27
Aqu           0.26
Newman        0.26
Catholic      0.25
Aqu           0.25
Catholics     0.23
subsidi       0.22
Orth          0.21
Distrib       0.20
Activations Density: 0.027%