INDEX
Explanations
verbs related to offering guidance or advice
concepts related to guidance and influence
Negative Logits
uters       -0.92
olit        -0.74
unes        -0.72
ãĥ¼ãĥ³      -0.71
icz         -0.70
alm         -0.69
jab         -0.69
alian       -0.68
ramid       -0.67
zbollah     -0.67
Positive Logits
perceptions      0.94
decisions        0.85
speculation      0.83
future           0.83
subsequent       0.81
everything       0.79
comprehension    0.78
our              0.78
us               0.77
discussions      0.76
Activations Density: 0.202%