INDEX
Explanations
phrases or terms indicating no change or lack of change
phrases indicating a lack of change or consistency
Negative Logits
Token       Logit
ardi        -0.69
Gate        -0.66
lan         -0.63
ast         -0.58
weather     -0.58
alla        -0.58
Muse        -0.57
gee         -0.57
Syndrome    -0.57
io          -0.56
Positive Logits
Token            Logit
unchanged        3.70
unaffected       2.12
untouched        2.05
intact           1.70
unch             1.26
identical        1.18
unim             1.17
Unch             1.13
uninterrupted    1.13
unrem            1.12
Activations Density: 0.016%