INDEX
Explanations
mentions of the city Tokyo
references to Tokyo
Negative Logits
hemy        -0.78
inelli      -0.74
estern      -0.73
ebook       -0.70
edly        -0.69
theless     -0.68
mble        -0.68
onies       -0.68
ibilities   -0.68
rals        -0.67
Positive Logits
Babel       0.81
Dome        0.80
Lumpur      0.77
Gh          0.74
Xan         0.71
ichi        0.70
Tok         0.68
Bay         0.67
iji         0.66
jin         0.65
Activations Density 0.011%