INDEX
Explanations
phrases related to exclusion or limitation
conjunctions and phrases that indicate conditions or qualifications
Negative Logits
oward         -0.88
quished       -0.81
ukong         -0.70
ivating       -0.65
successfully  -0.63
=#            -0.63
earthqu       -0.63
ゼウス        -0.62
hur           -0.62
ilaterally    -0.61
Positive Logits
nothing        0.99
meaningless    0.97
ignores        0.93
insignificant  0.92
nothing        0.90
negligible     0.88
none           0.87
nowhere        0.87
nobody         0.85
worthless      0.85
Activations Density 0.524%