Explanations
phrases indicating leadership or guidance
Negative Logits
Token            Logit
alem             -0.18
_NR              -0.15
iglia            -0.15
егор             -0.15
erule            -0.15
STYPE            -0.15
Pid              -0.15
ač               -0.15
.syn             -0.15
ErrorException   -0.14
Positive Logits
Token            Logit
ge               0.16
lead             0.15
ite              0.15
980              0.14
patron           0.14
Moss             0.14
Lead             0.14
Batt             0.14
def              0.14
uru              0.14
Activations Density: 0.038%