INDEX
Explanations
specific time references such as days, months, years, and weeks
temporal references related to time intervals and dates
Negative Logits
toughness      -0.61
mastery        -0.57
love           -0.55
pires          -0.53
masculinity    -0.53
instincts      -0.53
elong          -0.53
paradise       -0.52
loves          -0.52
Influence      -0.51
Positive Logits
.              1.06
.[             1.06
.(             0.96
.]             0.96
.*             0.93
\.             0.92
.}             0.91
*.             0.89
ãĢĤ            0.88
.�             0.87
Activations Density 0.263%