Explanations
names of buildings or historical places
references to specific palaces or significant historical locations
Negative Logits
Token       Value
CAST        -0.73
ELL         -0.70
VOL         -0.70
::::::::    -0.69
err         -0.67
ivity       -0.67
arette      -0.66
REE         -0.65
CHAT        -0.64
Braun       -0.64
Positive Logits
Token       Value
Palace      1.00
ibur        0.86
osaurs      0.81
palace      0.80
gur         0.76
maiden      0.76
monary      0.75
itiz        0.75
Hotel       0.74
OTUS        0.72
Activations Density: 0.011%