Explanations
names of individuals or entities involved in notable historical or political contexts
Negative Logits
ίος        -0.16
serie      -0.15
971        -0.15
961        -0.15
uristic    -0.15
ocache     -0.15
laid       -0.14
ead        -0.14
exclude    -0.14
enha       -0.14
Positive Logits
ningen     0.38
heten      0.30
lingen     0.28
ingen      0.26
standen    0.24
isten      0.24
isen       0.24
ivet       0.23
ansen      0.23
ature      0.23
Activations Density 0.071%