INDEX
Explanations
phrases related to significant or impactful events
phrases indicating surprising or unexpected events
Negative Logits
Token      Logit
izons      -0.95
azines     -0.72
omever     -0.71
azine      -0.70
onents     -0.69
alions     -0.66
omorph     -0.65
ophers     -0.64
buttons    -0.62
û          -0.62
Positive Logits
Token        Logit
reversal     1.00
effort       0.88
nutshell     0.88
bombshell    0.87
bid          0.86
concession   0.84
attempt      0.80
stakes       0.80
testament    0.79
Synopsis     0.78
Activations Density: 0.224%