INDEX
Explanations
phrases or sentences describing situations or events
instances of the phrase "what happens" in various contexts
Negative Logits
severe     -0.74
ondo       -0.72
shaw       -0.72
stra       -0.71
chet       -0.69
ilts       -0.68
rum        -0.68
serv       -0.67
rollers    -0.67
ikk        -0.67
Positive Logits
uate                                 0.78
ometimes                             0.77
Ambro                                0.75
uating                               0.74
Ñĭ                                   0.70
uates                                0.70
aea                                  0.69
=================================    0.68
rences                               0.66
unavoid                              0.65
Activations Density 0.030%