Explanations
concepts related to conditional phrases and causal relationships
Negative Logits
myſelf        -0.74
ſeveral       -0.72
ſelves        -0.69
éroport       -0.67
himſelf       -0.67
Theſe         -0.66
NewLabel      -0.66
itſelf        -0.65
الدولى        -0.65
DoubleQuotes  -0.64
Positive Logits
because  1.04
because  0.96
Sebab    0.92
Because  0.89
Because  0.87
perché   0.85
Ведь     0.84
畢竟     0.83
BECAUSE  0.83
perchè   0.82
Activations Density 0.303%