Explanations
statements expressing personal thoughts, feelings, or opinions
Negative Logits
Token            Logit
مرئيه            -0.71
Benth            -0.68
zeera            -0.63
UnusedPrivate    -0.63
udios            -0.61
betweenstory     -0.60
fromnode         -0.59
atika            -0.57
dedans           -0.56
Litu             -0.56
Positive Logits
Token            Logit
But              0.88
but              0.85
but              0.84
但她             0.81
pero             0.78
BUT              0.77
But              0.77
AssemblyVersion  0.76
BUT              0.74
definitely       0.72
Activations Density: 0.190%