INDEX
Explanations
specific terms and phrases that imply negativity or criticism
gadgets, thingies, and wacky terms
Negative Logits
bkz               -0.43
kasarigan         -0.43
Referințe         -0.39
droje             -0.37
Stellung          -0.37
ikbaar            -0.35
referrerpolicy    -0.34
"..\..\           -0.33
فريبيس            -0.33
fordern           -0.33
Positive Logits
Réponses          0.68
giz               0.68
thingy            0.64
wacky             0.63
gadget            0.60
Baz               0.60
gadgets           0.59
shenanigans       0.58
widgets           0.57
Giz               0.57
Activations Density: 0.047%