Explanations
verbs related to giving instructions or making suggestions
actions that involve formal recognition, implementation, or change related to societal or legal issues
Negative Logits
Token         Logit
ajo           -0.70
aleb          -0.58
outh          -0.56
absorption    -0.56
nav           -0.55
gotta         -0.55
inability     -0.55
Ars           -0.55
Phi           -0.54
Ser           -0.54
Positive Logits
Token         Logit
omorphic      0.82
quished       0.81
ASAP          0.75
sooner        0.74
âĶľâĶĢâĶĢ    0.74
by            0.73
alive         0.73
anew          0.73
someday       0.72
ynamic        0.70
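The token shown as âĶľâĶĢâĶĢ looks like a byte-level BPE display string, not readable text. The sketch below assumes the dashboard renders tokens with the GPT-2 style byte-to-character table; the source does not state which tokenizer is used, so treat that as an assumption. The helper names bytes_to_unicode and decode_token are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style table that maps each byte to a printable character.

    Assumption: the dashboard uses this byte-level convention for display.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: display character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Turn a byte-level display string back into readable text.

    Raises KeyError if a character is outside the byte-level alphabet.
    """
    raw = bytes(BYTE_DECODER[ch] for ch in display)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("âĶľâĶĢâĶĢ"))  # prints ├──
    print(decode_token("sooner"))      # plain ASCII tokens are unchanged
```

Under that assumption the token decodes to the box-drawing sequence ├──, the prefix used in directory-tree listings. The other tokens in the list are plain ASCII and decode to themselves.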
Activation Density: 0.234%
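A density figure like the one above is commonly read as the fraction of sampled tokens on which the feature fires. The sketch below illustrates that reading only. The function name activation_density, the threshold parameter, and the sample values are hypothetical, and the dashboard's exact definition is not given in the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds the threshold.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active; the dashboard may define it differently.
    """
    total = len(activations)
    if total == 0:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / total


if __name__ == "__main__":
    # Hypothetical sample: 1000 tokens, the feature fires on 2 of them.
    sample = [0.0] * 998 + [1.5, 0.3]
    print(f"{activation_density(sample):.3%}")  # prints 0.200%
```

On this reading, a density of 0.234% would mean the feature is active on roughly 2 to 3 of every 1000 sampled tokens.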