INDEX
Explanations
references to teaching roles and activities
Negative Logits
Token    Logit
omo      -0.16
zung     -0.15
bara     -0.15
Cork     -0.14
atta     -0.14
uet      -0.14
iser     -0.14
uck      -0.14
gu       -0.14
otta     -0.14
Positive Logits
Token    Logit
ovah     0.18
IGNAL    0.18
ardown   0.17
erman    0.17
ÑĤаж     0.17
inch     0.17
éŀ       0.15
ÏĥÏĥ     0.15
ember    0.14
ridged   0.14
Activations Density: 0.053%
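
The density figure can be read as the share of token positions on which the feature fires. The sketch below is illustrative only and assumes that reading: the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard or its tooling.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions with a nonzero
    (above-threshold) activation. The dashboard does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 5 of which activate the feature.
acts = [0.0] * 10_000
for i in (12, 407, 2048, 5555, 9001):
    acts[i] = 1.3

density = activation_density(acts)
print(f"{density:.3%}")
```

A value as low as the one reported here would mean the feature is sparse, firing on roughly one token position in two thousand under this reading.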