INDEX
Explanations
modal verbs indicating possibility or certainty
NEGATIVE LOGITS
themselves   -0.18
Winds        -0.16
istas        -0.16
hn           -0.16
agn          -0.15
omat         -0.14
ibil         -0.14
Fate         -0.14
578          -0.14
694          -0.14
POSITIVE LOGITS
raining      0.29
chy          0.23
iner         0.22
rain         0.19
eless        0.18
edn          0.18
alo          0.18
alic         0.17
ty           0.17
happen       0.17
Activations Density: 0.185%