INDEX
Explanations
adjectives indicating strong beliefs or support for a particular cause
words that indicate strong and unwavering support or opposition
Negative Logits
Token      Logit
ammy       -0.99
hops       -0.79
hazard     -0.73
nesota     -0.71
ysc        -0.69
ovember    -0.68
adish      -0.68
ammers     -0.66
Takeru     -0.65
pta        -0.65
Positive Logits
Token      Logit
ly         0.95
supporter  0.81
itive      0.81
ELY        0.77
adherent   0.75
è¦         0.71
loyal      0.71
pacif      0.70
ãĥ¼ãĤ¯     0.70
proponent  0.70
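Two of the positive-logit tokens, `è¦` and `ãĥ¼ãĤ¯`, look like raw byte-level BPE vocabulary entries rather than readable text. The sketch below assumes the vocabulary uses the GPT-2 style byte-to-character mapping (the source does not state which tokenizer is used) and shows how such entries could be turned back into bytes and, where the bytes form complete UTF-8, into text. The function names `bytes_to_unicode`, `token_bytes`, and `decode_token` are illustrative, not part of the dashboard.

```python
def bytes_to_unicode():
    """Build the GPT-2 style byte -> printable character table (assumed here)."""
    # Printable bytes map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAD))
        + list(range(0xAE, 0x100))
    )
    cs = bs[:]
    n = 0
    # Remaining bytes are shifted to code points starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token):
    """Return the raw bytes behind a displayed vocabulary entry."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


def decode_token(token):
    """Decode a displayed entry to text; incomplete UTF-8 becomes U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["loyal", "ãĥ¼ãĤ¯", "è¦"]:
        print(repr(tok), token_bytes(tok).hex(" "), repr(decode_token(tok)))
```

Under that assumption, `ãĥ¼ãĤ¯` corresponds to the bytes `e3 83 bc e3 82 af`, which decode to the katakana pair `ーク`. The entry `è¦` corresponds to only two bytes (`e8 a6`), the start of a three-byte UTF-8 sequence, so on its own it does not identify a single character and is best left in its raw form.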
Activations Density 0.021%