INDEX
Explanations
phrases indicating denial or negation
Negative Logits
Token            Logit
Италијани        -0.52
cupine           -0.46
BorderFactory    -0.45
WithIOException  -0.44
Попис            -0.42
SubMenu          -0.41
Kombat           -0.40
getResources     -0.39
ridurre          -0.38
избежать         -0.38
Positive Logits
Token            Logit
ब्रेकडाउन          0.43
<?               0.41
(!__             0.39
Missing          0.39
missing          0.39
SequentialGroup  0.39
Missing          0.38
#                0.38
Ự                0.37
(not shown)      0.36
Activations Density 0.960%