Explanations
references to supernatural or mythical creatures in tense situations
Negative Logits
Sponge    -0.16
cust      -0.15
Birds     -0.15
Hamm      -0.15
vant      -0.15
üz        -0.15
viso      -0.14
wart      -0.14
egg       -0.14
acts      -0.14
Positive Logits
wer       0.37
Wer       0.36
Shift     0.36
Shift     0.35
shift     0.35
Wer       0.34
shift     0.34
wol       0.33
-shift    0.32
shifted   0.31
Activations Density: 0.106%