INDEX
Explanations
references to physical posters and related printed materials
Negative Logits
sons         -0.17
vil          -0.16
achten       -0.15
ationally    -0.15
xdf          -0.15
van          -0.15
ussen        -0.14
ëĿ½          -0.14
von          -0.14
nal          -0.14
Positive Logits
iors         0.20
Poster       0.20
iore         0.19
ural         0.19
poster       0.17
posters      0.17
hyth         0.16
Poster       0.15
iat          0.15
URAL         0.15
Activations Density 0.009%