INDEX
Explanations
conjunctions and transitional phrases that indicate the reasoning or conclusion in a text
Negative Logits
ly  -0.87
NameInMap  -0.76
Infórmanos  -0.75
🔥🔥  -0.75
性  -0.72
—  -0.70
ので  -0.69
erweise  -0.69
halb  -0.68
———  -0.65
Positive Logits
––––  1.32
1.04
1.02
ര്  0.99
–  0.97
്  0.95
ţi  0.95
ায়  0.94
––  0.93
ţilor  0.93
Activations Density 0.499%