INDEX
Explanations
phrases or words that suggest a physical force or impact
Negative Logits
awaru        -0.68
Palestin     -0.67
EMENT        -0.67
BILITY       -0.67
Democr       -0.67
TAIN         -0.63
GMT          -0.61
BDS          -0.60
VIDEOS       -0.60
Bethlehem    -0.59
Positive Logits
ashing       1.19
ogged        1.17
ojure        1.17
ipper        1.14
avier        1.12
amped        1.10
amps         1.07
umps         1.06
unky         1.06
ique         1.04
Activations Density: 0.009%