Explanations
proper nouns related to news and politics
names of individuals and entities involved in news contexts
Negative Logits
Token         Logit
multic        -0.69
constants     -0.65
collisions    -0.64
conventions   -0.62
agons         -0.61
acebook       -0.60
wallets       -0.60
polarized     -0.59
snag          -0.59
squirrel      -0.59
Positive Logits
Token         Logit
ahu           1.04
angan         0.94
iev           0.89
gui           0.88
ani           0.87
ova           0.86
pta           0.86
oglu          0.86
ouf           0.86
ğ             0.85
Activations Density 0.477%
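
A note on reading the token lists: byte-level BPE vocabularies store non-ASCII tokens in a remapped, printable form, so a token such as ğ (common in Turkish surnames, alongside the suffix "oglu") appears in raw vocabulary dumps as `ÄŁ`. The sketch below shows how such a string could be decoded back to readable text. It assumes the underlying model uses the GPT-2 byte-to-character table, which the source does not state; the helper name `decode_token` is hypothetical and introduced only for illustration.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2-style byte-level BPE.

    Assumption: the dashboard's model uses this GPT-2 mapping.
    """
    # Bytes that are already printable map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a vocabulary string back into readable text."""
    raw = bytes(BYTE_DECODER[ch] for ch in token)
    # Tokens can end mid-character, so invalid UTF-8 is replaced, not raised.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ÄŁ", "oglu", "ova"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Plain ASCII tokens such as "oglu" pass through unchanged; only tokens containing bytes outside the printable ASCII range look different after decoding.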