INDEX
Explanations
phrases indicating uncertainty or conflict in opinions
Negative Logits
gage        -0.50
Pool        -0.48
Hut         -0.47
Gap         -0.47
Mecca       -0.47
skinned     -0.46
uration     -0.46
Zac         -0.45
gat         -0.45
queues      -0.45
Positive Logits
thereafter  1.02
thereto     0.75
thereof     0.74
afterward   0.72
etheless    0.71
sequent     0.71
subsequent  0.70
alike       0.70
catentry    0.69
myself      0.67
Activations Density 0.492%