Explanations
instances of the word "sometimes" indicating variability or frequency of events
Negative Logits
ắp      -0.21
idor    -0.17
illery  -0.17
rompt   -0.16
ạm      -0.16
ibbon   -0.15
elman   -0.15
ixo     -0.14
omat    -0.14
.ul     -0.14
Positive Logits
even     0.18
erville  0.16
gon      0.16
ARA      0.16
even     0.16
ebb      0.15
/all     0.15
imm      0.15
Even     0.15
Goth     0.14
Activations Density: 0.034%