Explanations
Chinese history and Confucianism
Negative Logits
輥  0.67
蜞  0.66
狨  0.66
Umsatz  0.66
猞  0.66
銠  0.64
瀘  0.64
莳  0.64
鸰  0.64
蜢  0.63
Positive Logits
Chinese  0.77
Confucian  0.74
chinese  0.68
중국  0.65
China  0.64
Dao  0.63
китай  0.59
wei  0.58
Shang  0.57
bamboo  0.57
Activations Density 0.031%