Explanations
references to causation and impact
Negative Logits
Token       Logit
Efq         -0.87
ніципалі    -0.78
Portale     -0.72
himſelf     -0.72
ſaid        -0.72
ſelf        -0.72
myſelf      -0.71
hatched     -0.69
paſſ        -0.69
তথ্যসূত্র    -0.69
Positive Logits
Token       Logit
因          0.77
akibat      0.77
karena      0.77
karena      0.77
Karena      0.75
because     0.72
بسبب        0.71
devido      0.71
是因为      0.69
Because     0.69
Activations Density: 0.136%