Explanations
phrases indicating unprecedented circumstances or experiences
Negative Logits
γκ      -0.15
pus     -0.15
ocker   -0.14
hlen    -0.14
Leap    -0.14
ripp    -0.14
Leban   -0.14
adí     -0.14
mine    -0.13
dept    -0.13
Positive Logits
ever         0.26
previous     0.24
previously   0.23
-ever        0.23
previous     0.21
.previous    0.20
Previous     0.19
EVER         0.18
ever         0.18
Previously   0.18
Activations Density: 0.036%