INDEX
Explanations
references to individuals and their achievements or roles in society
Negative Logits
oven     -0.17
ardin    -0.17
haps     -0.16
agrid    -0.14
fund     -0.14
fund     -0.14
ough     -0.14
/misc    -0.14
Kunst    -0.13
iena     -0.13
Positive Logits
iren     0.16
olute    0.15
ÅĻ       0.15
chia     0.14
ethyl    0.14
anter    0.13
carr     0.13
pragma   0.13
GM       0.13
ires     0.13
Activations Density: 0.142%