INDEX
Explanations
instances of academic study and education-related terms
Negative Logits
それは      -0.15
&W          -0.14
iry         -0.14
cdf         -0.14
isten       -0.14
stalk       -0.14
_closure    -0.14
Coverage    -0.13
igham       -0.13
oir         -0.13
Positive Logits
abroad      0.20
medicine    0.19
vala        0.17
law         0.16
intens      0.15
ymology     0.15
ring        0.14
法          0.14
мага        0.14
INATION     0.14
Activations Density 0.037%