INDEX
Explanations
phrases related to political and economic discussions
concepts related to social issues and political discussions
Negative Logits
çͰ             -0.70
Joined         -0.63
largeDownload  -0.60
wegian         -0.58
yssey          -0.57
ONDON          -0.55
âĵĺ            -0.55
inguished      -0.55
UGC            -0.53
Logged         -0.53
Positive Logits
or             0.70
anymore        0.63
nor            0.57
udic           0.54
bribe          0.54
oneself        0.54
someday        0.54
scapego        0.53
grain          0.50
quo            0.49
Activations Density: 2.055%