Explanations
sentences or clauses starting with "there is/are"
Negative Logits
UnusedPrivate    -0.88
Jefus            -0.86
Efq              -0.77
Houſe            -0.77
Majefty          -0.76
Anſ              -0.75
uſe              -0.74
Билгалдахарш     -0.74
houſe            -0.73
وتسجيلات         -0.71
Positive Logits
lies             0.67
lie              0.55
<bos>            0.54
most             0.51
comes            0.49
rests            0.48
égard            0.45
really           0.45
is               0.44
0.44
Activations Density 0.313%