INDEX
Explanations
phrases related to deception and lies
phrases related to deception and criminal activities
Negative Logits
ggles        -0.68
cour         -0.58
cellaneous   -0.58
Adds         -0.55
partName     -0.55
algia        -0.55
allas        -0.54
often        -0.54
Adds         -0.52
ofi          -0.52
Positive Logits
beforehand   0.76
wrong        0.73
Voldemort    0.73
Bulgar       0.70
Malfoy       0.67
sooner       0.65
unlawfully   0.64
oldemort     0.63
anyway       0.62
illegally    0.62
Activations Density: 1.847%