Explanations
terms related to Japanese names
references to specific Japanese names
Negative Logits
Token      Logit
hillary    -0.71
igree      -0.68
isters     -0.63
dress      -0.63
ggies      -0.62
comb       -0.61
ours       -0.61
ser        -0.60
skirts     -0.60
hope       -0.59
Positive Logits
Token      Logit
ichi       1.49
omi        0.90
imaru      0.89
aku        0.87
atsu       0.87
terness    0.85
uchi       0.84
iba        0.84
hei        0.84
ンジ       0.83
Activations Density 0.005%