INDEX
Explanations
mentions of the country Japan
references to Japan
Negative Logits
Token               Logit
sle                 -0.75
ĪĴ                  -0.66
wine                -0.66
uliffe              -0.65
Bran                -0.64
rals                -0.64
Weir                -0.63
leans               -0.63
̶                   -0.62
0000000000000000    -0.60
Positive Logits
Token       Logit
Japan       3.65
Japan       3.33
Japanese    2.51
Tokyo       2.49
Japanese    2.40
Okinawa     2.14
Osaka       2.13
Korea       2.05
Taiwan      2.01
Tok         1.95
Activations Density 0.014%