Explanations
mentions of societal concepts and structures
references to the concept of society and its impact on various issues
Negative Logits
Token        Logit
urations     -0.84
Pad          -0.76
ruction      -0.73
word         -0.70
orney        -0.67
ocular       -0.66
iverse       -0.66
orescence    -0.65
rav          -0.64
etsk         -0.63
Positive Logits
Token        Logit
wide         1.04
folk         0.81
liness       0.80
ically       0.74
eers         0.71
fare         0.70
indo         0.70
geist        0.68
deems        0.66
evolves      0.66
Activations Density 0.024%