INDEX
Explanations
information related to historical events, political controversies, and cultural references
Negative Logits
Dragonbound    -0.72
ogy            -0.70
Opportun       -0.69
antha          -0.65
hower          -0.63
gorilla        -0.61
Syl            -0.60
ãĥ¼ãĥĨ         -0.59
achus          -0.58
ipation        -0.58
Positive Logits
11    1.06
09    1.03
16    1.02
22    1.02
02    1.01
12    1.01
13    1.01
10    1.01
31    0.99
08    0.99
Activations Density: 0.336%