Explanations
numerical thresholds or minimum requirements
phrases indicating minimum requirements or thresholds
Negative Logits
DRAG       -0.76
Dynamics   -0.73
Reviewer   -0.73
bath       -0.72
ーク       -0.64
rend       -0.64
Brawl      -0.64
selves     -0.63
kai        -0.61
tions      -0.60
Positive Logits
uner           0.77
credible       0.76
partially      0.73
toler          0.71
reputable      0.68
plausible      0.67
satisfactory   0.66
egal           0.66
partly         0.66
viable         0.66
Activations Density 0.021%
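
The page does not define the density figure. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is non-zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data, made up for illustration: 10,000 token positions, 2 of them active.
toy = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.021% would mean the feature fires on roughly 2 of every 10,000 sampled tokens.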