INDEX
Explanations
affirmations or expressions of positivity
Negative Logits
multer     -0.68
Cura       -0.67
multer     -0.67
<tr>       -0.66
pilar      -0.65
#+#        -0.65
Chatham    -0.65
Limburg    -0.64
Brim       -0.64
ZE         -0.64
Positive Logits
actually      1.84
actually      1.67
Actually      1.65
Actually      1.62
ACTUALLY      1.40
faktiskt      1.08
faktisk       1.07
sebenarnya    1.02
fact          0.96
tatsächlich   0.95
Activations Density 0.077%