INDEX
Explanations
mentions of institutional settings or practices
references to institutional concepts or systems
Negative Logits
vous     -0.91
bane     -0.77
nen      -0.77
ky       -0.76
hner     -0.75
ragon    -0.75
spell    -0.75
kers     -0.75
vich     -0.74
word     -0.73
Positive Logits
ized            1.29
ization         1.21
ised            1.10
itutional       1.08
izes            1.01
izational       0.98
isation         0.98
izing           0.97
izations        0.96
institutions    0.96
Activations Density 0.015%