INDEX
Explanations
expressions of opinion or assertion
Negative Logits
icari     -0.15
ham       -0.15
encent    -0.15
Monte     -0.14
vari      -0.14
encer     -0.14
rech      -0.14
oen       -0.14
tant      -0.14
mont      -0.14
Positive Logits
sole      0.17
pus       0.16
tober     0.15
ombo      0.14
/general  0.14
Dut       0.14
MESS      0.13
ÏĦί       0.13
omic      0.13
stitute   0.13
Activations Density: 0.119%