INDEX
Explanations
statements related to accountability in law enforcement
content related to existential or philosophical questions
Negative Logits
".[          -0.50
seldom       -0.48
alike        -0.46
catentry     -0.45
Avg          -0.45
rarely       -0.45
cffffcc      -0.44
throb        -0.44
bryce        -0.44
clamation    -0.43
Positive Logits
anymore      0.70
anytime      0.65
someday      0.59
somew        0.57
Spoiler      0.56
itors        0.56
somehow      0.54
?????        0.53
matically    0.52
Enough       0.51
Activations Density: 1.194%