Explanations
mentions of specific individuals and political events
Negative Logits
inherit         -0.65
orno            -0.63
currently       -0.60
pmwiki          -0.59
ombies          -0.59
Frankenstein    -0.58
/"              -0.58
afety           -0.58
WATCHED         -0.57
龍契士          -0.57
Positive Logits
cheon           0.83
than            0.72
itudes          0.71
itud            0.70
itiz            0.70
isode           0.68
imester         0.67
versely         0.66
naissance       0.65
elight          0.65
Activations Density: 0.047%