INDEX
Explanations
actions related to communication and interaction
Negative Logits
geworden      -0.50
گز            -0.49
to            -0.47
sepeda        -0.46
diketahui     -0.43
gantung       -0.43
maksud        -0.42
Es            -0.42
.             -0.42
käs           -0.42
Positive Logits
nahilalakip       0.88
HtmlAttribute     0.86
tartalomajánló    0.83
contentLoaded     0.77
Theſe             0.76
myſelf            0.76
rawDesc           0.74
―――――             0.72
InjectAttribute   0.71
ProtoMessage      0.71
Activations Density: 0.101%