INDEX
Explanations
sentences that express a strong opinion or critique
Negative Logits
-0.54
<bos>    -0.53
$        -0.49
orn      -0.47
inska    -0.47
gal      -0.47
pal      -0.47
,        -0.46
c        -0.45
(        -0.45
Positive Logits
виправивши       1.14
disambiguazione  0.96
дописавши        0.86
ScopeManager     0.84
OFDb             0.81
surla            0.81
Personensuche    0.78
localctx         0.75
مرئيه            0.75
bezeichneter     0.74
Activations Density 0.013%