INDEX
Explanations
references to specific places or individuals in China
Negative Logits
Token        Logit
LookAnd      -0.42
Kenyatta     -0.40
Slovenian    -0.38
hummus       -0.37
Slovak       -0.36
raeli        -0.35
mayonnaise   -0.35
Dilution     -0.35
Reykjavik    -0.35
Luton        -0.34
Positive Logits
Token        Logit
Emperor      0.67
imperial     0.65
Conf         0.64
Confucian    0.64
Emperor      0.63
scholar      0.62
Imperial     0.62
jade         0.59
Tao          0.59
scholar      0.58
Activations Density: 0.257%