Explanations
references to accountability and regulation within organizations
Negative Logits
القي            -0.16
sovere          -0.15
stabilization   -0.15
otion           -0.14
Transmission    -0.13
ousse           -0.13
conting         -0.13
ota             -0.13
ghi             -0.13
otional         -0.13
Positive Logits
complaint       0.32
independent     0.30
Complaint       0.29
watchdog        0.28
complaints      0.28
independ        0.27
独立            0.27
Independent     0.26
independence    0.26
independently   0.25
Activations Density 0.117%