INDEX
Explanations
political or controversial terms and concepts
words related to deception or falsehoods
Negative Logits
Token       Logit
Wein        -0.47
Naples      -0.46
BILITIES    -0.45
udeb        -0.44
Wak         -0.44
CLSID       -0.43
Santos      -0.42
SAN         -0.42
Gunn        -0.42
¿½          -0.42
Positive Logits
Token       Logit
tainment    0.68
aneous      0.65
"],         0.54
etooth      0.53
")          0.53
>)          0.53
osate       0.52
itual       0.52
nown        0.52
iotic       0.50
Activations Density 0.622%
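
The page does not say how the density figure is computed. A common reading is the fraction of token positions on which the feature fires at all, and the sketch below illustrates that reading only. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active (activation > threshold). The source page does not
    confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 6 of which have a nonzero activation.
sample = [0.0] * 994 + [0.3, 1.2, 0.7, 2.1, 0.05, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.622% would mean the feature is active on roughly 6 out of every 1000 token positions.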