INDEX
Explanations
phrases related to political events or places
Negative Logits
BALL      -0.70
wagon     -0.67
DEV       -0.67
WARN      -0.64
Plex      -0.62
Cass      -0.61
log       -0.61
spin      -0.60
Plugin    -0.59
yne       -0.58
Positive Logits
Hong       3.78
Hong       3.47
Guang      2.22
HK         2.10
Shanghai   1.91
Singapore  1.91
Taiwan     1.82
Beijing    1.78
Tian       1.77
Taiwanese  1.69
Activations Density 0.019%