Explanations
mentions of fake or simulated content, especially in relation to data, signals, or news
Negative Logits
Token               Logit
xual                -1.32
pins                -1.11
hens                -1.11
ands                -1.10
hani                -1.09
ishops              -1.08
arching             -1.05
guiActiveUnfocused  -1.02
vez                 -1.00
bard                -1.00
Positive Logits
Token               Logit
pas                 1.37
ument               1.10
tan                 1.08
ulously             1.03
IDs                 1.01
ulence              1.00
ãĤ§                 1.00
²¾                  0.99
aneous              0.98
izable              0.97
Activations Density: 1.442%