INDEX
Explanations
phrases related to political organizations and controversial topics
Negative Logits
tales          -0.87
frustrations   -0.78
scenes         -0.76
anecdotes      -0.73
ebus           -0.71
juxtap         -0.70
ramids         -0.69
whispers       -0.69
leaps          -0.66
reminis        -0.66
Positive Logits
eligible       1.09
ineligible     1.08
eligible       1.03
unfit          1.02
punishable     0.95
angered        0.95
exempt         0.93
ocide          0.86
unlawful       0.86
Category       0.84
Activations Density: 0.267%