INDEX
Explanations
phrases indicating past events or actions
the word "previously" and its variations, indicating references to prior events or actions
Negative Logits
ocracy  -0.68
letico  -0.68
alion   -0.67
eer     -0.66
roller  -0.66
uri     -0.65
ueller  -0.65
ritch   -0.64
pling   -0.64
Baby    -0.62
Positive Logits
unsus         1.01
unpublished   0.84
held          0.83
incarcerated  0.82
disclosed     0.81
existed       0.81
encountered   0.81
undisclosed   0.80
discussed     0.79
belonged      0.78
Activations Density 0.039%