Explanations
instances of emotional impact and experiences described in narratives
Negative Logits
unda         -0.16
Out          -0.16
Rim          -0.15
Out          -0.15
Up           -0.14
ÑģÑĤÑĢи     -0.14
aign         -0.14
BaseModel    -0.13
_guide       -0.13
haft         -0.13
Positive Logits
over         0.85
over         0.58
-over        0.57
över         0.52
ov           0.52
OVER         0.52
_over        0.51
sobre        0.50
.over        0.50
Over         0.43
Activations Density 0.123%