INDEX
Explanations
geographical locations and proper nouns associated with Italy
Negative Logits
_INTR      -0.17
ullivan    -0.16
uci        -0.15
переда     -0.15
flo        -0.14
orne       -0.14
Alic       -0.14
ugas       -0.14
Giles      -0.14
cak        -0.14
Positive Logits
Brian      0.28
Brian      0.25
Pied       0.23
Tic        0.22
Pi         0.22
Berg       0.21
Lomb       0.20
Milan      0.20
Po         0.19
Lang       0.19
Activations Density: 0.020%