INDEX
Explanations
emotional expressions or feelings in the text
Negative Logits
centrif        -0.73
interchange    -0.66
shocks         -0.66
Dragonbound    -0.63
loophole       -0.61
penetration    -0.60
diffusion      -0.60
displacement   -0.60
exposure       -0.59
Colossus       -0.58
Positive Logits
Ĩ    1.40
¹    1.38
Į    1.36
İ    1.36
ĺ    1.31
¿    1.31
ī    1.29
ĥ    1.29
»    1.28
«    1.28
Activations Density 0.009%