Explanations
time-related information such as dates and times
Negative Logits
Token       Logit
arms        -0.60
expansion   -0.58
"{          -0.57
Abrams      -0.57
Viz         -0.56
pudding     -0.55
ospace      -0.55
landsl      -0.55
pregnancy   -0.53
abuser      -0.53
Positive Logits
Token       Logit
00          1.44
30          1.39
05          1.38
09          1.38
08          1.37
06          1.37
04          1.36
07          1.35
59          1.35
02          1.31
Activations Density: 0.780%