Explanations
references to nighttime or activities associated with the night
references to nighttime or events occurring at night
Negative Logits
pta        -0.72
achev      -0.72
emort      -0.71
cules      -0.70
ignty      -0.67
berman     -0.67
elsen      -0.67
00000000   -0.66
rompt      -0.66
xon        -0.66
Positive Logits
fall       1.08
mar        1.04
cap        1.02
life       0.97
light      0.95
mares      0.87
night      0.85
urnal      0.81
stand      0.80
sky        0.78
Activations Density: 0.031%