Explanations
words related to impactful events or occurrences that shake a system or community
events or situations that cause significant disruption or upheaval
Negative Logits
Token       Logit
abet        -0.79
ittees      -0.77
orescent    -0.70
uses        -0.70
idates      -0.67
ript        -0.65
uded        -0.65
ately       -0.64
odore       -0.63
endi        -0.63
Positive Logits
Token       Logit
rocked      1.04
rocking     0.98
stead       0.87
rock        0.79
Funk        0.75
castle      0.71
rock        0.70
neck        0.69
tones       0.69
crow        0.69
Activations Density 0.005%