INDEX
Explanations
phrases related to understanding or summarizing text
expressions of personal understanding and involvement in discussions
Negative Logits
Crazy        -0.80
poop         -0.78
KILL         -0.77
ðŁĺ          -0.73
Kids         -0.73
Chains       -0.70
diaper       -0.70
Kardash      -0.69
fake         -0.69
underwear    -0.68
Positive Logits
scholarly        1.28
methodological   1.19
fruitful         1.09
eluc             1.06
summar           1.01
histor           1.00
scholars         0.99
anthology        0.98
scholar          0.97
indispensable    0.95
Activations Density 0.772%