INDEX
Explanations
technical descriptions or classifications
instances where the word "technically" is used
Negative Logits
Token       Logit
ieties      -0.69
velt        -0.69
Cities      -0.67
Neighbor    -0.65
cot         -0.65
vic         -0.63
Hungry      -0.62
orf         -0.61
vation      -0.61
Vaj         -0.61
Positive Logits
Token       Logit
inclined    0.88
speaking    0.88
minded      0.81
correct     0.76
incorrect   0.75
proficient  0.74
ename       0.73
inaccurate  0.69
illiter     0.69
inept       0.67
Activations Density 0.021%