Explanations
phrases indicating negation or resistance
Negative Logits
Token            Logit
InputDecoration  -0.55
whatever         -0.50
DockStyle        -0.47
HtmlAttribute    -0.47
AttributeError   -0.47
maybe            -0.46
whatever         -0.45
WireFormat       -0.45
hyrchwyd         -0.45
eventually       -0.44
Positive Logits
Token           Logit
pexpr           0.45
autorytatywna   0.42
lisäksi         0.40
withIdentifier  0.40
transfieras     0.40
waitKey         0.39
hanem           0.38
بلکه            0.38
cherchés        0.38
farlo           0.37
Activations Density: 0.021%