Explanations
references to the city of Tel Aviv
Negative Logits
řez      -0.17
ーレ     -0.16
Mort     -0.16
PLIER    -0.15
leo      -0.15
yonel    -0.14
geois    -0.14
ledger   -0.14
iams     -0.14
driver   -0.14
Positive Logits
Aviv     0.31
éfono    0.23
ugu      0.22
stra     0.20
kup      0.19
cord     0.18
cos      0.17
gte      0.16
lico     0.16
cel      0.16
Activations Density 0.007%
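
The density figure above can be read as the share of sampled tokens on which the feature fires at all. The sketch below shows one way such a number could be computed. It assumes density means "fraction of tokens with activation above a threshold"; the function name, the zero threshold, and the sample data are illustrative and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: density is defined as (active tokens) / (all tokens).
    The dashboard's exact definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 7 active positions out of 100,000 tokens.
acts = [0.0] * 99_993 + [1.2] * 7
density = activation_density(acts)
print(f"{density:.3%}")  # 0.007%
```

A density this low is consistent with a feature that responds to a narrow topic, such as the Tel Aviv references named in the explanation.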