INDEX
Explanations
references to reactions or replies
mentions of varying responses or reactions
Negative Logits
rome      -0.81
cutting   -0.72
ramer     -0.69
ffe       -0.67
hemat     -0.67
rafted    -0.64
rust      -0.63
rip       -0.62
raft      -0.62
roots     -0.62
Positive Logits
thereto         0.97
response        0.94
responses       0.90
reaction        0.82
ivation         0.79
ively           0.79
naires          0.78
DragonMagazine  0.77
response        0.77
reactions       0.77
Activations Density: 0.031%