INDEX
Explanations
words related to unexpected or surprising things
instances of the word "surprisingly" in various contexts
Negative Logits
gang          -0.75
flight        -0.72
tein          -0.70
icipated      -0.69
yi            -0.69
glas          -0.66
escription    -0.66
icip          -0.65
amins         -0.65
ère           -0.63
Positive Logits
beit          0.75
absent        0.75
Effective     0.69
enough        0.66
overpowered   0.66
LIMITED       0.66
STEM          0.66
surprising    0.64
atility       0.64
readable      0.64
Activations Density: 0.022%