Explanations
phrases related to contrasting ideas
phrases related to societal and systemic issues
Negative Logits
token         logit
00007         -0.72
awoken        -0.70
cellaneous    -0.67
aspx          -0.66
hess          -0.65
)))           -0.65
Eva           -0.64
Nanto         -0.63
luck          -0.61
aido          -0.60
Positive Logits
token         logit
nor           1.10
anymore       1.02
necessarily   0.98
merely        0.69
purely        0.68
erker         0.67
partisans     0.67
altru         0.64
partisan      0.64
merits        0.63
Activations Density: 0.938%
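
A density figure like this is commonly read as the fraction of token positions on which the feature fires. The page does not say how it was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the sample values are all hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a nonzero
    (above-threshold) activation; the source does not confirm this definition.
    """
    if len(activations) == 0:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up activations for 8 token positions, for illustration only.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.938% would mean the feature is active on roughly 9 or 10 of every 1,000 tokens sampled.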