INDEX
Explanations
references to individuals or groups involved in various activities or events
Negative Logits
otrop        -0.69
waukee       -0.60
forcement    -0.60
situational  -0.58
slic         -0.58
herent       -0.57
otropic      -0.57
hazard       -0.57
umblr        -0.57
usage        -0.56
Positive Logits
hips         1.08
include      1.07
consisted    0.98
are          0.95
were         0.93
hip          0.91
comprise     0.85
numbered     0.83
differed     0.82
consist      0.81
Activations Density 0.148%