INDEX
Explanations
the word "Therefore" followed by a comma
conjunctions and introductory phrases indicating cause or reason
Negative Logits
Rumble        -0.69
Bucks         -0.66
nurs          -0.59
Aboriginal    -0.58
Defenders     -0.56
Feld          -0.56
DL            -0.55
stro          -0.55
BM            -0.55
metro         -0.54
Positive Logits
forth         1.27
xual          0.87
ommel         0.84
entimes       0.79
forward       0.78
tainment      0.76
fter          0.76
arose         0.75
manuel        0.74
atto          0.74
Activations Density 0.033%