INDEX
Explanations
phrases indicating communication or responses from other parties
ongoing communication and feedback related to actions or responses
Negative Logits
mint      -0.69
Melt      -0.63
weather   -0.61
cession   -0.60
TAMADRA   -0.59
Kills     -0.57
inent     -0.57
masse     -0.57
rament    -0.57
resses    -0.55
Positive Logits
back      2.20
back      1.95
backs     1.83
BACK      1.79
Back      1.79
Back      1.72
backs     1.70
BACK      1.64
backward  1.00
ago       0.99
Activations Density 0.474%