INDEX
Explanations
time schedules
time-related information
Negative Logits
revolt          -0.71
infl            -0.67
corrections     -0.66
bloom           -0.65
outweigh        -0.63
inately         -0.63
prolifer        -0.63
insurrection    -0.63
unfl            -0.62
numer           -0.62
Positive Logits
00    1.92
30    1.79
59    1.61
45    1.61
15    1.40
05    1.35
55    1.35
50    1.28
40    1.27
01    1.26
Activations Density: 0.030%