INDEX
Explanations
references to time and decision-making
Negative Logits
AndEndTag          -0.69
(no token text)    -0.66
myſelf             -0.60
PerformLayout      -0.60
DockStyle          -0.60
ſelves             -0.59
itſelf             -0.57
kloped             -0.55
pleaſure           -0.53
themſelves         -0.53
Positive Logits
time      2.67
time      2.06
Time      2.00
Time      1.92
TIME      1.88
TIME      1.62
tijd      1.53
时间      1.41
czasu     1.36
tiempo    1.34
Activations Density 0.130%
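
To give a sense of scale for the density figure, here is a minimal sketch. It assumes that "activations density" means the fraction of token positions where the feature's activation is above zero, which this page does not state. The function name, the threshold parameter, and the sample data are all hypothetical and exist only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: density = (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 13 active positions out of 10,000 tokens.
sample = [0.0] * 9987 + [1.5] * 13
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.130% corresponds to the feature firing on roughly 13 of every 10,000 tokens.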