INDEX
Explanations
phrases indicating contradiction or contrast
Negative Logits
entweder          -0.52
newArrayList      -0.49
.                 -0.48
sekaligus         -0.47
eltjes            -0.46
extAlignment      -0.46
porque            -0.46
porque            -0.46
WriteAttribute    -0.45
".                -0.44
Positive Logits
this              0.88
all               0.87
recent            0.85
everything        0.84
propOrder         0.81
these             0.80
today             0.77
hindsight         0.77
مشين              0.75
such              0.75
Activations Density: 0.380%