INDEX
Explanations
verbs related to various forms of assessment or judgment
verbs and actions related to questioning, arguing, and exploring concepts
Negative Logits
ajo      -0.60
ugu      -0.55
ggles    -0.53
Sierra   -0.52
rouse    -0.51
rise     -0.50
flu      -0.49
Us       -0.49
dos      -0.49
dom      -0.49
Positive Logits
omorphic         0.68
aback            0.64
urated           0.63
psychiat         0.62
DragonMagazine   0.59
entious          0.57
IRED             0.57
uzz              0.56
hostage          0.56
ivating          0.56
Activations Density: 0.727%