Explanations
clauses that express assertions or claims about a subject
Negative Logits
Token         Logit
NameInMap     -0.73
िखित          -0.67
VES           -0.67
Josephus      -0.66
fufficient    -0.65
Viter         -0.65
Jacobi        -0.65
talkin        -0.64
Dior          -0.64
Mao           -0.64
Positive Logits
Token         Logit
that          1.17
the           0.96
that          0.95
there         0.94
assertThat    0.93
bahwa         0.93
bahawa        0.92
it            0.92
they          0.89
THAT          0.85
Activation Density: 0.368%