INDEX
Explanations
instances of negation or denial in the text
Negative Logits
Token           Logit
osgi            -0.47
TestBed         -0.46
umlu            -0.46
unhofer         -0.46
ExecuteAsync    -0.45
protetor        -0.45
Espèce          -0.44
+#+             -0.44
mengen          -0.43
HCM             -0.43
Positive Logits
Token           Logit
(blank)         1.04
(blank)         0.94
subreddit       0.93
(blank)         0.92
r               0.90
subreddits      0.87
(blank)         0.87
reddits         0.83
redd            0.82
redd            0.79
Activations Density: 0.374%