INDEX
Explanations
verbs suggesting action or engagement
Negative Logits
Wikimedijinoj     -0.77
^(@)              -0.68
"..\..\..\        -0.67
^(@               -0.66
MATIC             -0.63
CURIAM            -0.63
$_"               -0.62
felves            -0.62
@}                -0.62
ORIES             -0.61
Positive Logits
the               1.57
the               0.61
barnen            0.61
BufferException   0.59
själva            0.58
igjen             0.58
<bos>             0.54
presentazione     0.53
kvinna            0.53
sitten            0.53
Activations Density: 1.307%