Explanations
phrases indicating opposition or contradiction
conjunctions and phrases indicating agreement or collective action
Negative Logits
bluff             -0.64
consolidation     -0.61
flats             -0.58
playback          -0.58
Survivors         -0.58
compression       -0.58
Virus             -0.58
Transformation    -0.57
elsen             -0.57
sterdam           -0.57
Positive Logits
Ïī                0.78
sidx              0.75
itten             0.71
speaking          0.71
apon              0.67
enged             0.67
enge              0.65
agher             0.65
interested        0.64
achable           0.64
Activations Density: 0.365%