Explanations
acronyms and proper nouns related to organizations or political entities
references to organizations or entities involved in political or governmental contexts
Negative Logits
Token     Logit
chest     -0.98
lets      -0.90
stocks    -0.78
gery      -0.76
gers      -0.75
Strait    -0.74
geant     -0.73
strap     -0.72
ger       -0.71
bull      -0.71
Positive Logits
Token     Logit
NEC       1.06
ESS       1.01
ODE       0.98
ISION     0.95
IZ        0.94
ORN       0.92
OUN       0.89
ORD       0.88
IVES      0.88
reluct    0.86
Activations Density: 0.022%