Explanations
occurrences of specific nouns or themes related to education and institutions
Negative Logits
Token    Logit
hs       -0.16
cid      -0.16
odzi     -0.16
ov       -0.15
cs       -0.15
им       -0.15
ope      -0.15
ually    -0.14
embed    -0.14
um       -0.14
Positive Logits
Token    Logit
lage     0.18
cales    0.18
illon    0.16
acco     0.15
nel      0.15
ิ่       0.15
Česká    0.14
lotte    0.14
rad      0.14
ourd     0.14
Activations Density 0.520%