INDEX
Explanations
mentions of organizations, companies, or specific entities
proper nouns and news agency references
Negative Logits
:\            -0.76
plet          -0.71
accur         -0.71
cuff          -0.70
capacities    -0.67
lear          -0.66
chairs        -0.66
aples         -0.64
ulously       -0.63
:]            -0.63
Positive Logits
Morg          0.69
Ltd           0.68
meanwhile     0.66
Ħ¢            0.65
Mons          0.65
Laun          0.65
which         0.63
Avalon        0.63
Instance      0.62
Zionism       0.61
Activations Density: 0.379%