INDEX
Explanations
phrases indicating a certain type of emphasis or focus, typically highlighting a specific quality or characteristic
phrases expressing uncertainty or a sense of vagueness
Negative Logits
VEL          -0.66
OPE          -0.66
Expansion    -0.64
VM           -0.62
Examination  -0.61
ULT          -0.60
PT           -0.59
Ultra        -0.58
idate        -0.57
ODE          -0.57
Positive Logits
entimes      0.89
nered        0.88
ling         0.79
heartedly    0.77
erd          0.75
hearted      0.75
cast         0.74
led          0.74
lier         0.74
entially     0.71
Activations Density: 0.037%