INDEX
Explanations
mentions of specific events or situations
Negative Logits
orthy     -0.98
izons     -0.86
isons     -0.85
erate     -0.85
iar       -0.81
ield      -0.81
visor     -0.80
awi       -0.79
eri       -0.78
igible    -0.77
Positive Logits
pesky          0.85
infamous       0.82
famous         0.78
legendary      0.78
acron          0.77
commercials    0.76
dreaded        0.74
roller         0.71
last           0.71
LAST           0.70
Activations Density: 0.175%