INDEX
Explanations
phrases indicating likelihood or estimation about choices and preferences
Negative Logits
!                          -0.44
있어                       -0.44
anto                       -0.44
ваются                     -0.43
(token missing in source)  -0.43
DA                         -0.42
da                         -0.40
by                         -0.40
mal                        -0.40
da                         -0.39
Positive Logits
oprot                      1.00
likely                     0.99
RectangleBorder            0.98
Likely                     0.97
outState                   0.96
likely                     0.93
Likely                     0.93
EndInit                    0.92
GEBURTSDATUM               0.90
ContentAsync               0.89
Activations Density 0.288%