INDEX
Explanations
mentions of "RS" or acronyms containing "RS" with a high activation value
references to rankings or scores associated with events or entities
Negative Logits
Token      Logit
villain    -0.63
Falling    -0.61
nan        -0.59
dove       -0.58
plum       -0.58
melting    -0.57
Uzbek      -0.57
Babylon    -0.57
Elaine     -0.57
pill       -0.57
Positive Logits
Token      Logit
RS         4.42
RS         1.93
RL         1.61
RC         1.60
rs         1.50
FS         1.39
RR         1.38
DS         1.37
RD         1.34
WS         1.34
Activations Density 0.013%
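The density figure can be read as the share of sampled tokens on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the source does not say how the dashboard computes density, so the zero threshold, the function name `activation_density`, and the sample of 100,000 tokens are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default). The dashboard's actual
    definition is not given in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 13 active tokens out of 100,000.
sample = [0.0] * 99_987 + [1.5] * 13
density = activation_density(sample)
print(f"{density:.3%}")  # 0.013%
```

Under this reading, a density of 0.013% corresponds to roughly 13 active tokens per 100,000, which would make this a sparse feature.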