Explanations
though, but, or categorized
Negative Logits
takže         0.68
impresionante 0.65
folosit       0.62
utilisé       0.61
utilisée      0.61
menyebabkan   0.61
導致          0.60
ktoré         0.60
違う          0.60
ทำให้          0.59
Positive Logits
especially    1.05
especially    0.90
particularly  0.90
even          0.87
albeit        0.87
insofar       0.87
although      0.86
even          0.84
albeit        0.81
Especially    0.78
Activations Density 0.339%