INDEX
Explanations
adverbs that modify actions or states
Negative Logits
_=                -0.36
bezpečnost        -0.35
OMITBAD           -0.34
zajíma            -0.33
zimní             -0.33
scandalous        -0.32
ValueStyle        -0.32
pří               -0.31
stateMutability   -0.31
(()=>             -0.31
Positive Logits
had               0.81
have              0.73
didn              0.65
got               0.65
gave              0.62
drew              0.61
did               0.60
Оно               0.58
looked            0.58
LabelTagHelper    0.58
Activations Density: 0.309%