Explanations
the beginnings of lists or enumerated points
phrases that introduce numbered lists or points
Negative Logits
orian     -0.77
urated    -0.69
orate     -0.62
estern    -0.62
awaits    -0.58
idia      -0.58
uded      -0.57
atsu      -0.56
ascus     -0.56
azel      -0.56
Positive Logits
Firstly   1.83
Firstly   1.71
First     1.52
First     1.44
first     1.26
first     1.17
1         1.12
FIRST     1.10
1         1.05
Number    0.94
Activations Density 0.311%