INDEX
Explanations
expressions relating to familial relationships and emotional distress
Negative Logits
onaut      -0.16
Operation  -0.15
otos       -0.15
reib       -0.14
Ñģол       -0.14
Team       -0.14
icc        -0.14
pedo       -0.14
'gc        -0.13
Special    -0.13
Positive Logits
society        0.18
Wealth         0.17
societal       0.17
Rochester      0.16
wealth         0.16
socioeconomic  0.15
Scarborough    0.15
societies      0.14
Indies         0.14
Infer          0.14
Activations Density: 0.001%