Explanations
conjunctions and transitional phrases that indicate causality or reasoning
Negative Logits
الحره        -0.74
myſelf       -0.73
itſelf       -0.73
expandindo   -0.72
ſeveral      -0.72
الدولى       -0.71
himſelf      -0.71
ſelf         -0.71
leſs         -0.69
ſelves       -0.69
Positive Logits
they         0.87
it           0.79
there        0.78
since        0.77
since        0.76
because      0.75
because      0.70
although     0.69
畢竟         0.69
we           0.66
Activations Density: 0.247%