INDEX
Explanations
names of individuals in text
mentions of specific individuals, likely authority figures or experts, in the context of providing quotes or insights
Negative Logits
revenge     -0.67
tumblr      -0.64
vigilante   -0.63
diaper      -0.62
KKK         -0.62
racist      -0.61
ĸļ          -0.59
reincarn    -0.59
youtube     -0.59
*/(         -0.58
Positive Logits
endor       0.74
ansky       0.71
hani        0.71
lett        0.71
mann        0.71
rup         0.70
enda        0.69
owsky       0.69
elli        0.68
patrick     0.68
Activations Density 0.518%