INDEX
Explanations
like-minded individuals or groups
references to individuals or groups with similar beliefs or values
Negative Logits
token       logit
adra        -0.67
atro        -0.65
IDS         -0.65
ILA         -0.64
avez        -0.62
adium       -0.61
ç«          -0.61
atin        -0.61
AGE         -0.59
morphine    -0.59
Positive Logits
token       logit
edly        1.18
minded      1.12
ness        1.06
minded      1.05
llor        0.91
quartered   0.90
edIn        0.87
eering      0.83
emouth      0.83
iciary      0.81
Activations Density: 0.012%