INDEX
Explanations
phrases indicating instructions or guidelines
phrases indicating relationships or contexts involving organizations or systems
Negative Logits
ItemTracker      -0.81
oret             -0.71
po               -0.67
bot              -0.67
WARN             -0.66
PsyNetMessage    -0.66
proof            -0.65
lopp             -0.65
blind            -0.65
wav              -0.65
POSITIVE LOGITS
respectively     1.66
alike            1.42
whichever        0.98
respective       0.85
depending        0.82
various          0.76
varied           0.72
diverse          0.71
both             0.68
each             0.64
Activations Density 0.681%