Explanations
phrases indicating support or corroboration
phrases related to supporting claims with evidence
Negative Logits
OY        -0.82
eteenth   -0.76
avery     -0.75
jee       -0.74
sonian    -0.71
ague      -0.69
acht      -0.67
omal      -0.66
ities     -0.65
izo       -0.64
Positive Logits
against     0.78
shaky       0.76
senal       0.71
Tes         0.68
Against     0.68
Rept        0.67
powering    0.67
packs       0.67
stretched   0.66
Arkham      0.65
Activations Density: 0.089%