INDEX
Explanations
emotional terms related to communication and expression
instances of punctuation indicating the end of a textual segment
Negative Logits
Token      Logit
urga       -0.76
atis       -0.71
ucci       -0.68
itone      -0.66
script     -0.64
isc        -0.64
paio       -0.63
rete       -0.63
manager    -0.62
patch      -0.62

Positive Logits
Token      Logit
Known       0.93
Alive       0.87
entimes     0.77
selves      0.72
INGS        0.71
learnt      0.71
Seen        0.70
IER         0.69
Been        0.68
Wrong       0.66
Activations Density: 0.113%